Demonstration Platform Documentation
Interactive, scalable data analysis and visualization with Jupyter-Lab
The Demonstration Platform enables users to bring the analytics to the data rather than requiring the data to be duplicated or moved. Jupyter-Lab is a web application that allows live code, equations, visualizations and narrative text to be mixed in a notebook and executed on computational systems that are co-located with the data.
The Jupyter-Lab environment lets users run interactive data analytics without worrying about computational resources or installing software.
To provide scalable big data analytics, the proof-of-concept system’s Jupyter-Lab environment replicates what is known as a “Pangeo environment”. Pangeo is a U.S.-funded project that develops and supports a suite of interconnected software packages for scalable geoscience data analytics. The core software packages that make up a Pangeo environment and are showcased in the proof-of-concept system include:
- Xarray – provides an N-Dimensional Array interface and toolset
- Iris – provides methods for analysing and visualising meteorological and oceanographic datasets
- Dask – provides flexible parallel computing for analytics
- Zarr – provides a cloud-native storage format for chunked, gridded datasets
- Jupyter-Lab – provides the web application framework for interactive analytics
With these core packages, especially Xarray and Dask, users can run scalable big data analytics on satellite Earth observation (EO) data.
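As a rough illustration of how Xarray and Dask work together, the following minimal sketch builds a small synthetic data cube, splits it into Dask chunks, and computes a per-pixel mean over time. The random data, the array dimensions, the chunk sizes and the variable name `reflectance` are all assumptions made for this example; they do not describe any dataset hosted on the platform.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a satellite EO data cube with dims (time, y, x).
# The values are random; no real platform dataset is used here.
rng = np.random.default_rng(seed=0)
cube = xr.DataArray(
    rng.random((12, 200, 200)),
    dims=("time", "y", "x"),
    coords={"time": np.arange(12)},
    name="reflectance",  # hypothetical variable name
)

# Split the array into Dask chunks (requires the dask package).
# Operations on the chunked array are lazy and can run in parallel.
chunked = cube.chunk({"time": 1, "y": 100, "x": 100})

# Build the task graph for a per-pixel mean over time; nothing runs yet.
lazy_mean = chunked.mean(dim="time")

# Trigger the computation and load the result into memory.
time_mean = lazy_mean.compute()
print(time_mean.shape)
```

Because the mean is only evaluated when `compute()` is called, the same pattern can in principle be applied to arrays that are larger than memory, with Dask processing one chunk at a time.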