Writing and Managing Code with MAAP

Writing and editing code in the MAAP is done in a Jupyter workspace.

Code is version-controlled with git and hosted on either GitHub or the MAAP GitLab. Jupyter also provides a Git sidebar widget for pushing and pulling code. Git supports collaborative code development and version control.

Note

If you have not used GitHub or git before, we highly recommend getting acquainted with them. For a quick reference to git commands, a Git Cheat Sheet is available in a variety of languages.

To assist connections to the MAAP system from a Jupyter notebook, a helper library called maap.py provides Python-native calls to the underlying RESTful MAAP API. Often a separate Jupyter notebook is used to run and monitor jobs with API calls.

Figure: Overview of writing code in the MAAP, shown in context.

Helpful Templates while developing Algorithms in MAAP

  • This algorithm repository example is a good starting point for a new algorithm, as it contains the various accessory files that facilitate running the algorithm at scale

  • Which templates will help you? Let the development or documentation team know!

  • For example: conda.yml with some default packages, run_script.sh

Working with code repositories like GitHub and GitLab

Connecting to GitHub using a Personal Access Token

Create a personal access token by following the GitHub instructions: https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token

Clone a Repository with GitHub

Here is an example repository you can use for this getting started guide: https://github.com/MAAP-Project/dps-unit-test

  1. Copy the GitHub clone link (the .git link) from https://github.com/MAAP-Project/dps-unit-test

  2. Open the built-in Jupyter GitHub UI to the left of the file browser. Choose “Clone a Repository” and paste in the .git link you copied from the GitHub repository.

  3. You should see a new folder created with the repo you cloned. If you browse to that folder and open the Jupyter GitHub UI again, it will show you some information about that repo.

  4. If you want to make changes to the code and have your own copy of it to register, clone the code into a public repository in GitHub or in the MAAP GitLab.

The MAAP GitLab Code repository

After creating your MAAP account, you can create a code repository by navigating to the MAAP GitLab at https://repo.maap-project.org. Your GitLab account is connected to your ADE workspaces automatically when you sign in to the ADE.

This example walks through cloning a repository into the ADE. Cloning a repository allows you to open, edit, and run files contained within the cloned repository. In this example, we look at cloning the “MAAP-Project/maap-documentation” Git repository, so that you are able to experiment with the code examples contained within this user documentation.

When inside a workspace, navigate to the Git tab at the top of the Jupyter window. Click it to see the option to clone.

We can also open the “Clone a repository” dialogue box from the File Browser tab on the JupyterLab sidebar: browse to the location where we want our Git repository, click the Git button located near the File Browser icon (also to the left of the file list), and choose “Clone a repository”. The dialogue box prompts you to enter the URI of the repository you wish to clone. For this example we enter “https://github.com/MAAP-Project/maap-documentation.git”.

For future reference, this URI can be found by visiting the GitHub site for the “MAAP-Project/maap-documentation” Git repository and clicking the Code button.

With the File Browser tab on the JupyterLab sidebar selected, a folder named “maap-documentation” should now appear at the location where you did the Git Clone operation. Folders for the various sections of the guide can be found in the “docs/source/” directory.

To open the IPython Notebook for an example, go to a section directory and double-click the appropriate “.ipynb” file. For more information about using Git in JupyterLab, see https://github.com/jupyterlab/jupyterlab-git .
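
If you prefer to locate the example notebooks from code rather than by browsing, the following is a minimal sketch. The function name find_notebooks is hypothetical, and the path in the usage line assumes the repository was cloned into the current directory.

```python
from pathlib import Path


def find_notebooks(root):
    """Return sorted paths of all .ipynb files under root.

    Jupyter autosave copies inside .ipynb_checkpoints folders are skipped.
    An empty list is returned when root is not an existing directory.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )


if __name__ == "__main__":
    # Assumed clone location; adjust to wherever you ran the clone.
    for notebook in find_notebooks("maap-documentation/docs/source"):
        print(notebook)
```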

Working with the MAAP GitLab

Note

Git can be slow and behave unpredictably on S3 bucket-based storage (i.e., my-private-bucket and my-public-bucket). We recommend keeping your git-tracked repos outside the buckets, on the root storage (somewhere inside of ~ or /projects).
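
As a quick check before cloning, the sketch below tests whether a directory sits inside one of the bucket folders. It assumes the buckets appear as folders named my-private-bucket and my-public-bucket somewhere in the path; the function name is hypothetical.

```python
from pathlib import Path

# Folder names taken from the note above. Treating them as plain path
# components is an assumption about how the buckets appear in a workspace.
BUCKET_DIR_NAMES = {"my-private-bucket", "my-public-bucket"}


def is_on_bucket_storage(path):
    """Return True if any component of the absolute path is a bucket folder."""
    absolute = Path(path).expanduser().absolute()
    return any(part in BUCKET_DIR_NAMES for part in absolute.parts)


if __name__ == "__main__":
    if is_on_bucket_storage(Path.cwd()):
        print("Current directory is on bucket storage; clone elsewhere.")
    else:
        print("Current directory is not on bucket storage.")
```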

The MAAP GitLab instance is located at https://repo.maap-project.org/ . Make sure you can access it from a browser using your MAAP (Earthdata Login) credentials.

For NASA security reasons, MAAP cannot communicate with its GitLab instance over SSH, and username-password authentication is not available either. Therefore, the recommended way to access MAAP repositories is to use GitLab Personal Access Tokens.

  1. In GitLab, in the top-right corner, click your user icon → “Preferences”.

  2. In the “User settings” menu, navigate to “Access Tokens”.

  3. Create a new token with at least “read_repository” and “write_repository” permissions.

  4. After clicking “create personal access token”, a pop-up message displays the new token. Make sure you copy this token into a text file; you will not be able to access it again.

  5. In the MAAP ADE, include this access token as part of the remote URL, e.g.:

git clone https://username:AccessToken@repo.maap-project.org/username/repo_name

For example:

git clone https://ashiklom:JJVimxhV8nmRNDqcCNr7@repo.maap-project.org/ashiklom/fireatlas
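
To avoid typing a token directly into a notebook or shell history, the remote URL can be assembled from an environment variable. The following is a minimal sketch: the function name build_remote_url and the variable name MAAP_GITLAB_TOKEN are hypothetical, and the URL layout simply mirrors the pattern shown above.

```python
import os
from urllib.parse import quote


def build_remote_url(username, token, repo_path, host="repo.maap-project.org"):
    """Return an HTTPS remote URL with the username and token embedded.

    The username and token are percent-encoded so that characters such as
    "/" or "@" cannot break the URL structure.
    """
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@{host}/{repo_path.strip('/')}"


if __name__ == "__main__":
    # MAAP_GITLAB_TOKEN is a hypothetical variable name; set it yourself.
    token = os.environ.get("MAAP_GITLAB_TOKEN")
    if token is None:
        print("Set MAAP_GITLAB_TOKEN before building the remote URL.")
    else:
        url = build_remote_url("username", token, "username/repo_name")
        # Avoid printing the URL itself, since it contains the token.
        print("Remote URL built for host repo.maap-project.org")
```

The resulting string can be passed to git clone or git remote set-url in place of a hand-typed URL.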

If you want to use multiple code repositories, you can configure a single local repository with multiple remotes, e.g., origin for GitHub and maap for the MAAP GitLab.

To add the maap remote and set the URL, use this:

git remote add maap https://username:AccessToken@repo.maap-project.org/username/repo_name

If you already have the remote called maap set up, you can set the remote URL using this instead:

git remote set-url maap https://username:AccessToken@repo.maap-project.org/username/repo_name

Then, you can use these commands to push your code and keep GitHub and the MAAP GitLab synchronized (for algorithm registration):

Push to GitHub:

git push origin <branch name>

Push to MAAP GitLab:

git push maap <branch name>
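
The two pushes can also be run together from a notebook. The sketch below assumes the remotes are named origin and maap as above; the function names are hypothetical, and dry_run defaults to True so that nothing is pushed until you opt in.

```python
import subprocess


def push_commands(branch, remotes=("origin", "maap")):
    """Build one `git push <remote> <branch>` command per remote."""
    return [["git", "push", remote, branch] for remote in remotes]


def push_to_remotes(branch, remotes=("origin", "maap"), repo_dir=".", dry_run=True):
    """Push a branch to each remote in turn.

    With dry_run=True the commands are only printed. With dry_run=False
    they are executed in repo_dir and the first failure raises
    subprocess.CalledProcessError.
    """
    commands = push_commands(branch, remotes)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, cwd=repo_dir, check=True)
    return commands


if __name__ == "__main__":
    # "main" is an illustrative branch name.
    push_to_remotes("main", dry_run=True)
```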

Customizing your workspace environment

Your Jupyter workspace has a set of pre-installed libraries, depending on which Stack you selected. If you need libraries that are not pre-installed, we suggest using an environment manager; conda is pre-installed to help with this.

Full documentation on configuring conda may be found in the System Reference Guide.
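
Before creating a new conda environment, it can help to check which libraries the current workspace is actually missing. This is a minimal sketch using only the Python standard library; the function name is hypothetical, the package names in the usage example are illustrative, and the import name of a library may differ from its conda package name.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found.

    Only top-level names (no dots) are expected here.
    """
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    # Illustrative names only; replace with the libraries your code imports.
    wanted = ["numpy", "rasterio", "geopandas"]
    print("Not installed:", missing_packages(wanted))
```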

Using maap.py to access MAAP functionality from Python notebooks

The MAAP platform offers a variety of functionality for running and monitoring large-scale processing jobs. This functionality is accessed through the underlying RESTful MAAP API. In a Python notebook, you will typically use the API through a helper library called maap.py, which exposes MAAP platform features with Python syntax, for example registering algorithms, running batches of jobs, monitoring jobs, or accessing data.

Much of the maap.py functionality is documented in the Technical Tutorials section and in context in the Science Tutorials. The maap-py GitHub page has additional usage documentation.
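
As a starting point, a notebook typically creates a single client object and reuses it for later calls. The sketch below assumes the import path and no-argument constructor shown in the maap-py usage documentation; verify both against the version installed in your workspace. The helper name load_maap_client is hypothetical.

```python
def load_maap_client():
    """Return a MAAP client object, or None when maap.py is not installed."""
    try:
        # Import path assumed from the maap-py usage documentation;
        # confirm it matches the installed version.
        from maap.maap import MAAP
    except ImportError:
        return None
    return MAAP()


if __name__ == "__main__":
    maap = load_maap_client()
    if maap is None:
        print("maap.py is not available in this environment.")
    else:
        print("MAAP client created.")
```

Returning None instead of raising keeps the notebook usable outside a MAAP workspace, where maap.py may not be installed.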