Writing and Managing Code with MAAP

Writing and editing code in the MAAP is done in a Jupyter workspace.

Code is version-controlled with Git, hosted on either GitHub or the MAAP GitLab. Jupyter also provides a GUI widget, available as a sidebar tool, to help with pushing and pulling code. Git is intended to help with collaborative code development and version control.

Note

If you have not used GitHub or Git before, it is highly recommended that you get acquainted with them. For a quick reference to Git commands, there is a Git Cheat Sheet available in a variety of languages.

To assist connections to the MAAP system from a Jupyter notebook, a helper library called maap.py provides Python-native calls to the underlying RESTful MAAP API. Often a separate Jupyter notebook is used to run and monitor jobs with API calls.

When working with Jupyter notebooks, a checkpoint is created only when you save manually. Checkpoints are an emergency backup of the notebook and are different from the notebook's automatic saves. Jupyter stores them in .ipynb_checkpoints folders, which bloat a Git repository unnecessarily, so you can add the following to your .gitignore file to keep them out of version control:

.ipynb_checkpoints
*/.ipynb_checkpoints/*

For more information on .gitignore files, see the gitignore page of the Git documentation.
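If you would rather add these entries from a notebook cell than edit the file by hand, the following is a minimal sketch. It assumes the repository root is a folder you can write to; `ensure_gitignore_entries` is a hypothetical helper name used for illustration, not part of MAAP or Git.

```python
from pathlib import Path

# The two patterns listed above.
CHECKPOINT_PATTERNS = [".ipynb_checkpoints", "*/.ipynb_checkpoints/*"]


def ensure_gitignore_entries(repo_dir, patterns=CHECKPOINT_PATTERNS):
    """Append any missing patterns to <repo_dir>/.gitignore; return those added."""
    gitignore = Path(repo_dir) / ".gitignore"
    text = gitignore.read_text() if gitignore.exists() else ""
    existing = {line.strip() for line in text.splitlines()}
    missing = [p for p in patterns if p not in existing]
    if missing:
        # Make sure the current last line is terminated before appending.
        if text and not text.endswith("\n"):
            text += "\n"
        gitignore.write_text(text + "\n".join(missing) + "\n")
    return missing
```

Calling `ensure_gitignore_entries(".")` from the repository root is safe to repeat: patterns that are already present are not added a second time.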

[Figure: writing code overview, in-context diagram]

Helpful Templates while developing Algorithms in MAAP

  • This algorithm repository example is a good starting point for a new algorithm, as it contains the various accessory files that facilitate running the algorithm at scale.

  • Which templates would help you? Let the development or documentation team know. Examples include a conda.yml with some default packages, or a run_script.sh.

Working with code repositories like GitHub and GitLab

Connecting to GitHub using a Personal Access Token

Set up a personal access token by following the GitHub instructions: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token

Clone a Repository with GitHub

Here is an example repository you can use for this getting started guide: https://github.com/MAAP-Project/dps-unit-test

  1. Copy the GitHub clone link (the .git link) from https://github.com/MAAP-Project/dps-unit-test.

  2. Open the built-in Jupyter GitHub UI to the left of the file browser. Choose “Clone a Repository” and paste in the .git link you copied from the GitHub repository. You can also access this menu through the Git tab at the top of the Jupyter window.

  3. You should see a new folder created with the repo you cloned. If you browse to that folder and open the Jupyter GitHub UI again, it will show you some information about that repo.

  4. If you want to make changes to the code and register your own copy of it, clone the code into a public repository on GitHub or on the MAAP GitLab.
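The same clone can also be run from a notebook cell or script instead of the sidebar UI. The sketch below is one way to do it, assuming the `git` command is available in the workspace; `repo_folder_name` and `clone_repository` are hypothetical helper names, and the clone link is the example repository URL with `.git` appended.

```python
import subprocess
from urllib.parse import urlparse


def repo_folder_name(clone_url):
    """Return the folder name git creates by default for a clone URL."""
    name = urlparse(clone_url).path.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_repository(clone_url, parent_dir="."):
    """Run `git clone` inside parent_dir and return the new folder name."""
    # check=True raises CalledProcessError if git reports a failure.
    subprocess.run(["git", "clone", clone_url], cwd=parent_dir, check=True)
    return repo_folder_name(clone_url)
```

For example, `clone_repository("https://github.com/MAAP-Project/dps-unit-test.git")` should leave a `dps-unit-test` folder in the current directory, matching the folder described in step 3.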

To open an IPython notebook, go to a section directory and double-click the appropriate “.ipynb” file. For more information about using Git in JupyterLab, see https://github.com/jupyterlab/jupyterlab-git .

The MAAP GitLab Code repository

After creating your MAAP account, you can create a code repository by navigating to the MAAP GitLab at https://repo.maap-project.org. Your GitLab account is connected to your ADE workspaces automatically when you sign in to the ADE.

You can then follow the same steps above to clone a repository from the MAAP GitLab.

Customizing your workspace environment

Your Jupyter workspace has a set of pre-installed libraries, depending on which Stack you selected. If you need libraries that are not pre-installed, we suggest using an environment manager; conda is pre-installed to help with this.

Full documentation on configuring conda may be found in the System Reference Guide.
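Before setting up a new conda environment, it can help to check which libraries your workspace is actually missing. The following is a minimal sketch using only the Python standard library; `missing_modules` is a hypothetical helper name, and the module names you pass in are your own.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported here."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in module_names if find_spec(name) is None]
```

Note that this checks import names, which can differ from conda package names (for example, the scikit-learn package is imported as `sklearn`).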

Using maap.py to access MAAP functionality from Python notebooks

The MAAP platform offers a variety of functionality to run and monitor large-scale processing jobs. This functionality is accessed through the underlying RESTful MAAP API. In a Python notebook, you will typically use this API via a helper library called maap.py, which exposes MAAP platform features through Python syntax. Examples include registering algorithms, running batches of jobs, monitoring jobs, and accessing data.

Much of the maap.py functionality is documented in the Technical Tutorials section and in-context in the Science Tutorials. The maap-py GitHub page has additional usage documentation.
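As an illustration of the job-monitoring pattern, here is a minimal polling sketch. It deliberately does not call maap.py directly: `get_status` is a hypothetical zero-argument callable that you would write as a small wrapper around whichever maap.py call you use to check a job. The status strings in `TERMINAL_STATES` are also assumptions, so check them against what your jobs actually report.

```python
import time

# Assumed status strings; verify against your own job output.
TERMINAL_STATES = {"Succeeded", "Failed", "Deleted"}


def wait_for_job(get_status, poll_seconds=30, timeout_seconds=3600,
                 sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or time runs out.

    get_status is a hypothetical callable returning the job's current
    status as a string; it is not part of maap.py.
    """
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status!r} after {waited} s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `sleep` as a parameter keeps the helper easy to try out in a notebook without real waiting. For long batches, a longer `poll_seconds` reduces the number of API calls.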