Searching for Data in NASA’s CMR in R

Authors: Sheyenne Kirkland (UAH), Alex Mandel (DevSeed), Henry Rodman (DevSeed), Zac Deziel (DevSeed)

Date: 11/21/24

Description: In this notebook, we’ll demonstrate how to access data from NASA’s CMR within R using maap-py. Users will learn how to search for collections, granules and links, then compile a list of granule IDs and links.

Run This Notebook

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the “Getting started with the MAAP” section of our documentation.

Disclaimer: It is highly recommended to run this tutorial within MAAP’s ADE, which already includes MAAP-specific packages such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors. Users should work within the “R/Python” workspace.

Import/Install Packages

Let’s load the packages needed for this notebook.

[1]:
library(reticulate)

Search Collections

Before beginning our search, let’s invoke the MAAP constructor. This will allow us to use the Python-based maap-py library from R.

[2]:
maap_py <- import("maap.maap")
maap <- maap_py$MAAP()

Now let’s search for a collection. The specific collection we have in mind is ATL08, so we will search for collections with that short name. We also want our data to be hosted in the cloud, so we will add the parameter cloud_hosted='true'. We know the current version is 006; if you are not sure of the version, the version='006' line can be commented out.

[3]:
atl08_collections = maap$searchCollection(
    short_name='ATL08',
    version='006',
    cmr_host='cmr.earthdata.nasa.gov',
    cloud_hosted='true'
)
length(atl08_collections)
1

One collection was returned to us. To grab the concept ID of the collection, we’ll use the code in the following cell.

[4]:
collection_id = atl08_collections[[1]]['concept-id']
print(collection_id)
[1] "C2613553260-NSIDC_CPRD"
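
The concept ID printed above packs several pieces of information into one string. As a minimal sketch, the base R snippet below splits it into a type prefix, a numeric part, and a provider. The regular expression is an assumption based only on the IDs shown in this notebook (a letter prefix, digits, a hyphen, then a provider name); it is not taken from maap-py.

```r
# Concept ID copied from the output above.
concept_id <- "C2613553260-NSIDC_CPRD"

# Assumed pattern: <letter prefix><digits>-<provider>.
pattern <- "^([A-Z]+)([0-9]+)-(.+)$"
parts <- regmatches(concept_id, regexec(pattern, concept_id))[[1]]

# parts[1] is the full match; the capture groups follow it.
concept_type <- parts[2]   # "C" for a collection
concept_num  <- parts[3]   # numeric part of the ID
provider     <- parts[4]   # provider that hosts the collection

print(c(type = concept_type, number = concept_num, provider = provider))
```

Checking the provider this way can be a quick sanity test before passing the ID on to a granule search.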

Search Granules

Temporal Extent

Now that we have our collection ID, let’s search for granules within the collection. We’ll also add a temporal filter to our search. If you would like to search for granules without the temporal filter, simply comment out or remove the temporal=date_range line.

[5]:
date_range <- '2018-12-01T00:00:00Z,2018-12-31T23:59:59Z'

results = maap$searchGranule(
    temporal=date_range,
    concept_id=collection_id,
    limit=as.integer(100),
    cmr_host='cmr.earthdata.nasa.gov'
)
length(results)
100

The search returned 100 results. There are thousands of granules within this date range, but because we set our limit to 100, we only get 100 back.
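
The temporal filter above is a comma-separated pair of UTC timestamps (start, then end). Rather than typing the string by hand, it can be built from R date-times. The helper below, make_temporal_range, is a hypothetical convenience function written for this sketch; it is not part of maap-py.

```r
# Hypothetical helper (not part of maap-py): format a start and end time
# as "YYYY-MM-DDTHH:MM:SSZ,YYYY-MM-DDTHH:MM:SSZ", the form used above.
make_temporal_range <- function(start, end) {
  fmt <- "%Y-%m-%dT%H:%M:%SZ"
  start <- as.POSIXct(start, tz = "UTC")
  end <- as.POSIXct(end, tz = "UTC")
  stopifnot(start <= end)  # guard against a reversed range
  paste(format(start, fmt, tz = "UTC"), format(end, fmt, tz = "UTC"), sep = ",")
}

date_range <- make_temporal_range("2018-12-01 00:00:00", "2018-12-31 23:59:59")
print(date_range)
```

The resulting date_range string can then be passed as the temporal argument in the same way as the hand-written one.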

Spatial Extent

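Another filter we can apply is a spatial filter, passed as a bounding box. CMR expects the bounding box as a comma-separated string in the order west, south, east, north (lower-left longitude and latitude, then upper-right longitude and latitude). Note that this example switches to a different collection, C2763266360-LPCLOUD.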

[6]:
collection_id = 'C2763266360-LPCLOUD'
granule_bbox = '8.79799563969,-3.97882659263,14.4254557634,2.32675751384' # specify bounding box to search by

results = maap$searchGranule(
    concept_id=collection_id,
    bounding_box=granule_bbox,
    limit=as.integer(100),
    cmr_host="cmr.earthdata.nasa.gov"
)
length(results)
43

43 granules in the collection fell within our specified bounding coordinates. Let’s grab the granule file name and the geometry.

[7]:
granule_filename = results[[1]]['Granule']['DataGranule']['ProducerGranuleId']
print(granule_filename)

geometry = results[[1]]['Granule']['Spatial']['HorizontalSpatialDomain']['Geometry']
print(geometry)
[1] "N00E013.SRTMGL1.hgt"
{'BoundingRectangle': {'WestBoundingCoordinate': '12.99972222', 'NorthBoundingCoordinate': '1.00027778', 'EastBoundingCoordinate': '14.00027778', 'SouthBoundingCoordinate': '-0.00027778'}}
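
To connect the search box with the geometry printed above, here is a minimal base R sketch. It builds the bounding-box string in the west, south, east, north order that CMR's bounding_box parameter uses, then checks that the first granule's rectangle overlaps the search box. Both helpers, make_bbox and bbox_overlaps, are hypothetical functions written for this example and are not part of maap-py. The granule coordinates are copied from the output above.

```r
# Hypothetical helper: build a "west,south,east,north" string.
make_bbox <- function(west, south, east, north) {
  stopifnot(west >= -180, east <= 180, south >= -90, north <= 90, south <= north)
  paste(c(west, south, east, north), collapse = ",")
}

# Hypothetical helper: TRUE when two rectangles overlap.
# Assumes neither rectangle crosses the antimeridian.
bbox_overlaps <- function(a, b) {
  unname(a["west"] <= b["east"] && b["west"] <= a["east"] &&
           a["south"] <= b["north"] && b["south"] <= a["north"])
}

# Search box used in the cell above.
search_box <- c(west = 8.79799563969, south = -3.97882659263,
                east = 14.4254557634, north = 2.32675751384)
granule_bbox <- do.call(make_bbox, as.list(search_box))

# Bounding rectangle of the first granule, copied from the printed geometry.
granule_box <- c(west = as.numeric("12.99972222"),
                 south = as.numeric("-0.00027778"),
                 east = as.numeric("14.00027778"),
                 north = as.numeric("1.00027778"))

overlaps <- bbox_overlaps(search_box, granule_box)
print(overlaps)
```

Keeping the coordinates in a named vector makes the ordering explicit, which helps avoid swapping latitude and longitude when editing the box.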

Granule ID List

If you need multiple granules, you can also compile their IDs from our search results into a single vector.

[14]:
granule_list <- c()

for (result in results) {
    granule_list <- c(granule_list, result['concept-id'])
}

print(granule_list[1:5])
[1] "G2821018750-LPCLOUD" "G2821036920-LPCLOUD" "G2821037023-LPCLOUD"
[4] "G2821037092-LPCLOUD" "G2821037143-LPCLOUD"
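
Once the IDs are compiled, it can be useful to check them and save them for later use. The sketch below uses only base R and the five IDs printed above. The ID pattern is an assumption based on those printed values, and the output file is a temporary file chosen for illustration.

```r
# The first five granule IDs, copied from the output above.
granule_list <- c("G2821018750-LPCLOUD", "G2821036920-LPCLOUD",
                  "G2821037023-LPCLOUD", "G2821037092-LPCLOUD",
                  "G2821037143-LPCLOUD")

# Drop any duplicates while keeping the original order.
granule_ids <- unique(granule_list)

# Assumed pattern: "G", digits, a hyphen, then a provider name.
is_granule_id <- grepl("^G[0-9]+-[A-Za-z0-9_]+$", granule_ids)
stopifnot(all(is_granule_id))

# Write one ID per line to a temporary file, then read it back.
out_file <- tempfile(fileext = ".txt")
writeLines(granule_ids, out_file)
restored <- readLines(out_file)

print(length(restored))
```

Replacing tempfile() with a path in your workspace keeps the list available across sessions.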