An overview of the MAAP platform

The MAAP is a cloud-based system to write science-analysis code and then run it at scale. This lets you keep all of the input and output data “in the cloud”. It is composed of a few parts:

MAAP Overview Diagram

  • The Algorithm Development Environment (ADE) is a tool that helps with the development of algorithms in a consistent, standardized environment that helps with the development and testing of algorithms and facilitates large scale data processing. MAAP’s primary user interface is Jupyterlab, where code is written and tested before pushed to the large scale data processing system. Code is stored and checked out from Git-based repositories, including Github and MAAP’s own code repository subsystem.

  • The Data Processing System (DPS) is where registered algorithms (see Algorithm Catalog) can be run at scale in the cloud. The MAAP system provides a Jupyter GUI to run Jobs, or the maap.py library can be used to run a batch of Jobs in a loop using Python. The DPS also has monitoring capabilities, and again the MAAP system provides a Jupyter GUI to help monitor Jobs. This can also be done using maap.py in Python.

  • The Algorithm Catalog, where your algorithms from the ADE can be registered and compiled for use by the DPS. The MAAP system provides API and GUI tools to help you register and view your algorithms.

  • The Code Repository is a git-based repository to store user code. It is also used to store the configuration files necessary for building algorithms to store in the algorithm catalog and for execution in the DPS.

  • Input data comes from a few Data Catalogs. Currently there is a MAAP STAC Catalog and the NASA CMR Catalog. More information can be found in the search tutorials section.