Searching for Granules in MAAP

These examples walk through using the MAAP API to search for granules within a collection based on specific parameters. Granules are individual files from a sensor, and a group of granules makes up a collection within CMR. The granules are the raw data that will be used for processing.

We begin by importing the MAAP and pprint packages. We then invoke the MAAP constructor, setting the maap_host argument to 'api.ops.maap-project.org'.

[1]:
# import the MAAP package
from maap.maap import MAAP

# import printing package to help display outputs
from pprint import pprint

# invoke the MAAP constructor using the maap_host argument
maap = MAAP(maap_host='api.ops.maap-project.org')

Here we view the specific arguments and keywords for the maap.searchGranule function.

[2]:
help(maap.searchGranule)
Help on method searchGranule in module maap.maap:

searchGranule(limit=20, **kwargs) method of maap.maap.MAAP instance
    Search the CMR granules

    :param limit: limit of the number of results
    :param kwargs: search parameters
    :return: list of results (<Instance of Result>)

As we can see from the result, maap.searchGranule accepts a limit keyword, which caps the number of results returned from CMR. It also accepts any additional search parameters that CMR supports. For a list of accepted parameters, please refer to the CMR Search Granules API reference.

It is important to note that the default limit on results from the MAAP API is 20. To increase the number of results we will specify a variable and use it in later queries.

[3]:
# get at max 500 results from CMR
MAXRESULTS = 500

In this example we will explore search options including:

  1. Searching by Collection Concept ID
  2. Searching by temporal filter
  3. Searching by spatial filter
  4. Using the results from one search as inputs into another
  5. Searching by additional attributes

For the next couple of examples, we will focus on the ICESat-2/ATLAS Land and Vegetation Height dataset.

Searching by Collection Concept ID

Here we will search by a unique ID that is given to CMR collections. You can find the collection IDs for all of the collections in MAAP in a table within the documentation: https://maap-project.readthedocs.io/en/latest/search/cmr_collection_table.html

It is recommended to begin the search with the Collection Concept ID as this is a specific unique identifier for a collection and will avoid ambiguity when searching by a long name or short name.

[4]:
COLLECTIONID = 'C1201746153-NASA_MAAP' # specify the collection id for the ATLAS dataset

results = maap.searchGranule(concept_id=COLLECTIONID,limit=MAXRESULTS)
pprint(f'Got {len(results)} results')
'Got 500 results'

We were able to get 500 results! The collection most likely contains more than 500 granules, but remember that we limited the results to 500. The result from the MAAP API is a list of granules where each element in the list is the metadata for that particular granule.

Now let’s look at the metadata for the first result.

[5]:
# print the first granule's metadata
# we use the depth parameter to set the layer of metadata detail in the results, with (1) having the least detail
# (1) displays the collection concept ID, concept ID, format, and revision ID
# adjust the depth to a larger value (6) if you would like to view all of the metadata
pprint(results[0],depth=2)
{'Granule': {'AdditionalAttributes': {...},
             'Collection': {...},
             'DataFormat': 'HDF5',
             'DataGranule': {...},
             'GranuleUR': 'SC:ATL08.005:228968197',
             'InsertTime': '2021-11-05T13:31:43.040Z',
             'LastUpdate': '2021-11-22T20:24:55.537Z',
             'OnlineAccessURLs': {...},
             'OrbitCalculatedSpatialDomains': {...},
             'Orderable': 'true',
             'Spatial': {...},
             'Temporal': {...},
             'Visible': 'true'},
 'collection-concept-id': 'C1201746153-NASA_MAAP',
 'concept-id': 'G1201746154-NASA_MAAP',
 'format': 'application/echo10+xml',
 'revision-id': '9'}

There is a lot of information in the metadata so let’s break it down…

The Granule key holds all of the granule information, including attributes, browse imagery URLs, and spatial and temporal information. The collection-concept-id should match the ID you searched by and be the same for each granule. Lastly, the granule-specific concept-id is a unique identifier for this granule. This information, the granule information in particular, can be used to further refine search results from CMR.
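As a minimal sketch of reading these fields, the snippet below uses a hand-built dictionary that mirrors the top-level layout printed above. The nested values are elided in that output, so they are left out here as well. The names sample_granule and summarize_granule are illustrative only and are not part of the MAAP API.

```python
# Hypothetical stand-in for results[0]; values are copied from the printed output above.
sample_granule = {
    'Granule': {
        'DataFormat': 'HDF5',
        'GranuleUR': 'SC:ATL08.005:228968197',
    },
    'collection-concept-id': 'C1201746153-NASA_MAAP',
    'concept-id': 'G1201746154-NASA_MAAP',
    'format': 'application/echo10+xml',
    'revision-id': '9',
}


def summarize_granule(granule):
    """Return a small dict of identifying fields from one granule's metadata."""
    return {
        'granule_id': granule['concept-id'],
        'collection_id': granule['collection-concept-id'],
        'granule_ur': granule['Granule']['GranuleUR'],
        # DataFormat may be absent for some collections, so fall back to None
        'data_format': granule['Granule'].get('DataFormat'),
    }


summary = summarize_granule(sample_granule)
print(summary)
```

In a live session, the same helper could be applied to each element of results, assuming the granules share this layout.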

Searching by Temporal Filter

Here we will combine the earlier Collection Concept ID search with a temporal filter, using the temporal keyword to fine-tune our results.

The temporal keyword takes datetime information in the format YYYY-MM-DDThh:mm:ssZ. The temporal search criteria may be either a single date or a date range. A single date acts as a start date or an end date depending on where the comma is placed. To define a start date and return all granules after that date, put a comma after the date (YYYY-MM-DDThh:mm:ssZ,). To define an end date and return all granules prior to that date, put a comma before the date (,YYYY-MM-DDThh:mm:ssZ). Lastly, to search a date range, provide the start date and end date separated by a comma (YYYY-MM-DDThh:mm:ssZ,YYYY-MM-DDThh:mm:ssZ).

In this example we will search for one month of data.

[6]:
dateRange = '2018-12-01T00:00:00Z,2018-12-31T23:59:59Z' # specify a date range to search for data for Dec. 2018

results = maap.searchGranule(temporal=dateRange,concept_id=COLLECTIONID,limit=MAXRESULTS)
pprint(f'Got {len(results)} results')
'Got 500 results'
[7]:
granuleFilename = results[0]['Granule']['DataGranule']['ProducerGranuleId'] # get the granule file name
granuleDate = results[0]['Granule']['Temporal']['RangeDateTime']['BeginningDateTime'] # get the granule start time

pprint(f'Granule {granuleFilename} was acquired starting at {granuleDate}',width=100)
'Granule ATL08_20181201001339_09680103_005_01.h5 was acquired starting at 2018-12-01T00:13:48.477Z'

It looks like the first result correctly matches with the beginning temporal search parameter. Keep in mind that the results are limited to 500 so the final granule returned may not match the end date that was searched for.
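The three temporal forms described in this section (open-ended start, open-ended end, and closed range) can be generated from Python datetime objects rather than typed by hand. The following is a sketch only: temporal_range is a hypothetical helper, not part of the MAAP package, and it assumes the datetimes passed in are already in UTC.

```python
from datetime import datetime

# Date format described above: YYYY-MM-DDThh:mm:ssZ
CMR_FORMAT = '%Y-%m-%dT%H:%M:%SZ'


def temporal_range(start=None, end=None):
    """Build a temporal search string from optional start/end datetimes (assumed UTC)."""
    if start is None and end is None:
        raise ValueError('provide a start date, an end date, or both')
    start_text = start.strftime(CMR_FORMAT) if start else ''
    end_text = end.strftime(CMR_FORMAT) if end else ''
    # A missing side leaves the comma in place, giving 'start,' or ',end'
    return f'{start_text},{end_text}'


december_2018 = temporal_range(
    datetime(2018, 12, 1, 0, 0, 0),
    datetime(2018, 12, 31, 23, 59, 59),
)
print(december_2018)
print(temporal_range(start=datetime(2018, 12, 1)))
print(temporal_range(end=datetime(2018, 12, 31, 23, 59, 59)))
```

The first string printed is the same value assigned to dateRange in the query above, so it could be passed directly as the temporal keyword.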

Searching by Spatial Filter

Here we will illustrate how to search for granules by a spatial filter. CMR offers several spatial filters, including point, line, polygon, and bounding box. The simplest to use is the bounding box, which is a sequence of four longitude and latitude values in the order [W,S,E,N]. In this example, we search the NASA Shuttle Radar Topography Mission Global 1 arc second V003 dataset over Gabon using the bounding_box keyword.
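Because the [W,S,E,N] ordering is easy to get wrong, it can help to build the string with a small helper that checks the coordinate ranges first. This is a sketch under stated assumptions: bounding_box below is a hypothetical function, not a MAAP or CMR call, and it does not attempt to validate boxes that cross the antimeridian.

```python
def bounding_box(west, south, east, north):
    """Format W,S,E,N corners (decimal degrees) as a comma-separated string."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError('longitudes must be within [-180, 180]')
    if not (-90 <= south <= north <= 90):
        raise ValueError('latitudes must satisfy -90 <= south <= north <= 90')
    return f'{west},{south},{east},{north}'


# Corner coordinates for the Gabon search area used in this example
gabon_box = bounding_box(8.79799563969, -3.97882659263, 14.4254557634, 2.32675751384)
print(gabon_box)
```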

[8]:
granuleDomain = '8.79799563969,-3.97882659263,14.4254557634,2.32675751384' # specify bounding box to search by

results = maap.searchGranule(bounding_box=granuleDomain,concept_id='C1200000522-NASA_MAAP',limit=MAXRESULTS)
pprint(f'Got {len(results)} results')
'Got 43 results'
[9]:
granuleFilename = results[0]['Granule']['DataGranule']['ProducerGranuleId'] # get the granule file name
granuleGeometry = results[0]['Granule']['Spatial']['HorizontalSpatialDomain']['Geometry'] # get the granule's spatial information

pprint(f'Granule {granuleFilename} was acquired within the following geometry: ',width=100)
pprint(granuleGeometry)
'Granule N00E014.SRTMGL1.hgt.zip was acquired within the following geometry: '
{'BoundingRectangle': {'EastBoundingCoordinate': '15.00027778',
                       'NorthBoundingCoordinate': '1.00027778',
                       'SouthBoundingCoordinate': '-0.00027778',
                       'WestBoundingCoordinate': '13.99972222'}}

We can see from the first granule that the spatial coordinates of the granule intersect our search box. Also note that spatial filtering yields more refined search results with only 43 granules returned.
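The intersection noted above can also be checked programmatically instead of by eye. The sketch below compares the search box with the BoundingRectangle values printed for the first granule. The function rectangles_intersect is a hypothetical helper, and it assumes neither box crosses the antimeridian.

```python
def rectangles_intersect(search_box, granule_rect):
    """Return True when a 'W,S,E,N' search string overlaps a BoundingRectangle dict."""
    west, south, east, north = (float(value) for value in search_box.split(','))
    g_west = float(granule_rect['WestBoundingCoordinate'])
    g_south = float(granule_rect['SouthBoundingCoordinate'])
    g_east = float(granule_rect['EastBoundingCoordinate'])
    g_north = float(granule_rect['NorthBoundingCoordinate'])
    # Two axis-aligned boxes overlap only if they overlap in both longitude and latitude
    overlaps_lon = west <= g_east and g_west <= east
    overlaps_lat = south <= g_north and g_south <= north
    return overlaps_lon and overlaps_lat


search_box = '8.79799563969,-3.97882659263,14.4254557634,2.32675751384'

# Values copied from the BoundingRectangle printed for the first granule
first_granule_rect = {
    'EastBoundingCoordinate': '15.00027778',
    'NorthBoundingCoordinate': '1.00027778',
    'SouthBoundingCoordinate': '-0.00027778',
    'WestBoundingCoordinate': '13.99972222',
}

print(rectangles_intersect(search_box, first_granule_rect))  # True
```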

Searching by Additional Attributes

MAAP has added extra metadata fields, called additional attributes, to the granule metadata to support the unique search needs of the aboveground terrestrial carbon research community. There are many additional attributes available. To get started, users can search by the following keywords:

  • site_name
  • data_format
  • track_number
  • polarization

For example, if a user is only interested in data from the Mondah Forest Gabon site with a data format of ASCII, we can use the following query:

[10]:
results = maap.searchGranule(site_name="Mondah Forest Gabon",data_format='ASCII',concept_id='C1200110729-NASA_MAAP')
pprint(f'Got {len(results)} results')
'Got 17 results'
[11]:
pprint(results[0],depth=2)
{'Granule': {'AdditionalAttributes': {...},
             'Collection': {...},
             'DataFormat': 'ASCII',
             'DataGranule': {...},
             'GranuleUR': 'SC:AFLVIS2.001:138348895',
             'InsertTime': '2018-09-24T11:00:23.343Z',
             'LastUpdate': '2018-09-24T11:00:54.336Z',
             'OnlineAccessURLs': {...},
             'OnlineResources': {...},
             'Orderable': 'true',
             'Platforms': {...},
             'Spatial': {...},
             'Temporal': {...},
             'Visible': 'true'},
 'collection-concept-id': 'C1200110729-NASA_MAAP',
 'concept-id': 'G1200115690-NASA_MAAP',
 'format': 'application/echo10+xml',
 'revision-id': '6'}

The returned results include only ASCII granules that have been tagged as part of the Mondah Forest Gabon research site.

DISCLAIMER: The MAAP data team is working to update the additional attributes within the MAAP platform so these are subject to change. Furthermore, the accepted parameters for the additional attributes are changing and further documentation will be provided as updates come.

The MAAP API provides rich functionality to interact with the CMR instance within the MAAP platform. Users can search datasets programmatically by many parameters and even combine parameters such as spatial and temporal filters to refine results.
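To illustrate combining filters, the sketch below gathers the keywords used throughout this page into one dictionary that can be unpacked into a single search. The helper build_granule_query is hypothetical and not part of the MAAP package. The search call itself is left as a comment because it needs a configured MAAP client, and the number of granules returned by this particular combination has not been verified.

```python
def build_granule_query(concept_id, temporal=None, bounding_box=None, limit=20, **extra):
    """Collect granule search keywords, dropping any that were not set."""
    query = {
        'concept_id': concept_id,
        'limit': limit,
        'temporal': temporal,
        'bounding_box': bounding_box,
        **extra,
    }
    return {key: value for key, value in query.items() if value is not None}


query = build_granule_query(
    'C1201746153-NASA_MAAP',
    temporal='2018-12-01T00:00:00Z,2018-12-31T23:59:59Z',
    bounding_box='8.79799563969,-3.97882659263,14.4254557634,2.32675751384',
    limit=500,
)
print(query)

# With a MAAP client available, the keywords can be unpacked into the search:
# results = maap.searchGranule(**query)
```

Keeping the filters in a dictionary makes it straightforward to add or remove a single parameter between searches without rewriting the call.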