Searching for Collections in MAAP

These examples show how to use the MAAP API to search for collections based on specific parameters. Collections are groupings of files that share the same product specification. Searching for collections is a useful first step toward finding the individual files, known as granules, that are used for processing.

We begin by importing the MAAP package and creating a new instance of the MAAP class.

[1]:
# import the MAAP package to handle queries
from maap.maap import MAAP

# import printing package to help display outputs
from pprint import pprint

# invoke the MAAP search client
maap = MAAP()

We can use the maap.searchCollection function to return a list of desired collections. Before using this function, let’s use the help function to view the specific arguments and keywords for maap.searchCollection.

[2]:
# view help for the searchCollection function
help(maap.searchCollection)
Help on method searchCollection in module maap.maap:

searchCollection(limit=100, **kwargs) method of maap.maap.MAAP instance
    Search the CMR collections
    :param limit: limit of the number of results
    :param kwargs: search parameters
    :return: list of results (<Instance of Result>)

The help text shows that maap.searchCollection accepts a limit and search parameters. The limit parameter caps the number of collections returned by maap.searchCollection; limit=100 in the signature means that the default limit is 100 results. maap.searchCollection also accepts any additional search parameters that the CMR supports. For a list of accepted parameters, please refer to the CMR Search Collections API reference.

In this example we will explore search options including:

  1. Finding all Collections
  2. Searching by temporal filter
  3. Searching by spatial filter
  4. Using the results from one search as inputs into another
  5. Searching by additional attributes

Finding all Collections

Here we demonstrate how to list the collections in the CMR. To do this, we call the maap.searchCollection function without any additional search parameters.

[3]:
# search all collections
results = maap.searchCollection()

# print the number of collections
pprint(f'Got {len(results)} results')
'Got 61 results'

The search returned 61 results. The MAAP API returns a list in which each element is the metadata for one collection. Because the default limit is 100, a search returns at most 100 collections unless a larger limit is set, which matters as more collections are added to the CMR. To change the limit, pass limit= followed by a value inside the parentheses, for example maap.searchCollection(limit=200).
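
When the number of results equals the limit, the list may have been cut short. The following is a minimal sketch of a check for that case. The helper name search_collections_checked is hypothetical and is not part of the MAAP package; it assumes only the searchCollection(limit=..., **kwargs) signature shown in the help text above.

```python
def search_collections_checked(client, limit=100, **params):
    """Run a collection search and flag results that may be truncated.

    `client` is expected to be a MAAP() instance; only its
    searchCollection(limit=..., **kwargs) method is used.
    This helper is hypothetical and not part of the MAAP package.
    """
    results = client.searchCollection(limit=limit, **params)
    # If the number of results reaches the limit, more collections may
    # exist than were returned.
    possibly_truncated = len(results) >= limit
    return results, possibly_truncated


# Example usage (requires access to the MAAP API):
# results, possibly_truncated = search_collections_checked(maap, limit=500)
# if possibly_truncated:
#     print('Result count reached the limit; consider raising it')
```

With the 61 collections found above and the default limit of 100, the flag would be False.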

Let’s look at the metadata for the first collection in our list of results (results[0]) using pprint. For formatting purposes, we can use the depth parameter to control how many levels of nested metadata are displayed. By default, there is no constraint on the depth. When a depth is set (in this case depth=2), anything nested more deeply is replaced by an ellipsis.

[4]:
# print the metadata for the first collection
# the depth parameter sets how many nested levels of metadata are displayed, with depth=1 showing the least detail
# depth=1 displays the concept ID, format, and revision ID
# use a larger value (such as depth=6) if you would like to view all of the metadata
pprint(results[0], depth=2)
{'Collection': {'AdditionalAttributes': {...},
                'ArchiveCenter': 'NASA/NSIDC_DAAC',
                'Campaigns': {...},
                'CollectionState': 'COMPLETE',
                'Contacts': {...},
                'DOI': {...},
                'DataSetId': 'ABoVE LVIS L1B Geolocated Return Energy '
                             'Waveforms V001',
                'Description': 'This data set contains laser altimetry return '
                               'energy waveform measurements over Alaska and '
                               'Western Canada taken from the NASA Land, '
                               'Vegetation, and Ice Sensor (LVIS). The data '
                               "were collected as part of NASA's Terrestrial "
                               'Ecology Program campaign, the Arctic-Boreal '
                               'Vulnerability Experiment (ABoVE).',
                'InsertTime': '2020-10-17T20:32:38.639Z',
                'LastUpdate': '2020-10-17T20:32:38.639Z',
                'LongName': 'Not provided',
                'OnlineAccessURLs': {...},
                'OnlineResources': {...},
                'Orderable': 'false',
                'Platforms': {...},
                'ProcessingLevelId': '1B',
                'RevisionDate': '2019-09-06T19:27:00.000Z',
                'ScienceKeywords': {...},
                'ShortName': 'ABLVIS1B',
                'Spatial': {...},
                'SpatialKeywords': {...},
                'Temporal': {...},
                'VersionId': '001',
                'Visible': 'true'},
 'concept-id': 'C1200110748-NASA_MAAP',
 'format': 'application/echo10+xml',
 'revision-id': '11'}

The Collection key holds the collection metadata, including the additional attributes, the archive center, and the spatial and temporal information. The concept-id is a unique identifier for this collection. It can be used to further refine search results from the CMR, such as when searching for granule information.
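
To make the structure concrete, here is a small sketch that pulls the concept-id, ShortName, and VersionId out of each result. The helper summarize_collections is hypothetical, and the sketch assumes every result carries the keys shown in the output above. A trimmed copy of that output stands in for a live search so the example runs without access to the MAAP API.

```python
def summarize_collections(results):
    """Return (concept-id, ShortName, VersionId) tuples for each result.

    Assumes each result has the keys shown in the printed metadata above.
    """
    summary = []
    for result in results:
        collection = result['Collection']
        summary.append((
            result['concept-id'],
            collection['ShortName'],
            collection['VersionId'],
        ))
    return summary


# A trimmed copy of the first result shown above, used as sample data.
sample_results = [
    {'Collection': {'ShortName': 'ABLVIS1B', 'VersionId': '001'},
     'concept-id': 'C1200110748-NASA_MAAP',
     'format': 'application/echo10+xml',
     'revision-id': '11'},
]

for concept_id, short_name, version in summarize_collections(sample_results):
    print(f'{concept_id}: {short_name} (version {version})')
# prints: C1200110748-NASA_MAAP: ABLVIS1B (version 001)
```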

Searching by Temporal Filter

Here we narrow down our results by using the temporal keyword in our search. The temporal keyword takes datetime information in the format YYYY-MM-DDThh:mm:ssZ, and the search criteria may be either a single date or a date range. A single date is interpreted as a start date or an end date depending on where the comma is placed. To define a start date and return all collections after that date, put a comma after the date (YYYY-MM-DDThh:mm:ssZ,). To define an end date and return all collections prior to that date, put a comma before the date (,YYYY-MM-DDThh:mm:ssZ). Lastly, to search a date range, provide the start date and end date separated by a comma (YYYY-MM-DDThh:mm:ssZ,YYYY-MM-DDThh:mm:ssZ).
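
The three forms described above can be generated from Python datetime objects rather than typed by hand. This is a minimal sketch: temporal_range and CMR_TIME_FORMAT are hypothetical names introduced for illustration, and the datetimes are assumed to already be in UTC.

```python
from datetime import datetime

# Format described above: YYYY-MM-DDThh:mm:ssZ (hypothetical constant name)
CMR_TIME_FORMAT = '%Y-%m-%dT%H:%M:%SZ'


def temporal_range(start=None, end=None):
    """Build a temporal search string from optional datetimes (assumed UTC)."""
    if start is None and end is None:
        raise ValueError('Provide a start date, an end date, or both')
    if start is not None and end is not None and start > end:
        raise ValueError('start must not be later than end')
    start_text = start.strftime(CMR_TIME_FORMAT) if start is not None else ''
    end_text = end.strftime(CMR_TIME_FORMAT) if end is not None else ''
    # The comma position tells the search which side of the range is open.
    return f'{start_text},{end_text}'


print(temporal_range(datetime(2000, 1, 1), datetime(2000, 1, 31, 23, 59, 59)))
# prints: 2000-01-01T00:00:00Z,2000-01-31T23:59:59Z
print(temporal_range(start=datetime(2000, 1, 1)))
# prints: 2000-01-01T00:00:00Z,
print(temporal_range(end=datetime(2000, 1, 31, 23, 59, 59)))
# prints: ,2000-01-31T23:59:59Z
```

The returned string can be passed as the temporal keyword, for example maap.searchCollection(temporal=temporal_range(start=...)).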

In this example we will search for one month of data.

[5]:
dateRange = '2000-01-01T00:00:00Z,2000-01-31T23:59:59Z' # specify date range to search for data in January 2000

results = maap.searchCollection(temporal=dateRange)
pprint(f'Got {len(results)} results')
'Got 3 results'
[6]:
collectionShortName = results[0]['Collection']['ShortName'] # get the collection short name
collectionDate = results[0]['Collection']['Temporal']['RangeDateTime']['BeginningDateTime'] # get the collection start time

pprint(f'Collection {collectionShortName} was acquired starting at {collectionDate}', width=100)
'Collection Global_Forest_Change_2000-2017 was acquired starting at 2000-01-01T00:00:00.000Z'

The first result begins on 2000-01-01, which falls within the date range we searched for. Keep in mind that results are limited to 100 by default, so a search that matches more collections than the limit returns only a subset of them.
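
The same check can be made in code instead of by eye. The sketch below tests whether a collection's BeginningDateTime lies inside a closed date range. The helpers parse_cmr_time and starts_within are hypothetical, the range is assumed to have both a start and an end date, and the sample result is a trimmed stand-in built from the values printed above.

```python
from datetime import datetime


def parse_cmr_time(text):
    """Parse a timestamp written with or without fractional seconds."""
    for fmt in ('%Y-%m-%dT%H:%M:%S.%fZ', '%Y-%m-%dT%H:%M:%SZ'):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f'Unrecognized timestamp: {text}')


def starts_within(result, date_range):
    """Check whether a collection's start time lies inside 'start,end'."""
    start_text, end_text = date_range.split(',')
    begin_text = result['Collection']['Temporal']['RangeDateTime']['BeginningDateTime']
    begin = parse_cmr_time(begin_text)
    return parse_cmr_time(start_text) <= begin <= parse_cmr_time(end_text)


# Trimmed stand-in for results[0], using the values printed above.
sample_result = {
    'Collection': {
        'ShortName': 'Global_Forest_Change_2000-2017',
        'Temporal': {'RangeDateTime': {
            'BeginningDateTime': '2000-01-01T00:00:00.000Z'}},
    },
}

print(starts_within(sample_result, '2000-01-01T00:00:00Z,2000-01-31T23:59:59Z'))
# prints: True
```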

Searching by Spatial Filter

Here we illustrate how to search for collections with a spatial filter. The CMR offers several spatial filters, including point, line, polygon, and bounding box. In this example, we filter with a bounding box, which is a sequence of four longitude and latitude values in the order [W,S,E,N]: the western and eastern bounds are longitudes, and the southern and northern bounds are latitudes.
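
Because the order of the four values is easy to get wrong, a small helper can assemble and sanity-check the string. This is a sketch only: bounding_box is a hypothetical function, and the range checks reflect the usual longitude and latitude limits rather than anything specific to the MAAP API. No west-less-than-east check is made.

```python
def bounding_box(west, south, east, north):
    """Build a 'W,S,E,N' bounding box string after basic range checks."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError('Longitudes must be between -180 and 180')
    if not (-90 <= south <= north <= 90):
        raise ValueError('Latitudes must satisfy -90 <= south <= north <= 90')
    return ','.join(str(value) for value in (west, south, east, north))


print(bounding_box(-42, 10, 42, 20))
# prints: -42,10,42,20
```

The returned string can be passed as the bounding_box keyword of maap.searchCollection.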

[7]:
collectionDomain = '-42,10,42,20' # specify bounding box to search by

results = maap.searchCollection(bounding_box=collectionDomain)
pprint(f'Got {len(results)} results')
'Got 17 results'
[8]:
collectionShortName = results[3]['Collection']['ShortName'] # get a collection short name
collectionGeometry = results[3]['Collection']['Spatial']['HorizontalSpatialDomain']['Geometry'] # grab the spatial information from the collection

pprint(f'Collection {collectionShortName} was acquired within the following geometry: ', width=100)
pprint(collectionGeometry)
'Collection GEDI01_B was acquired within the following geometry: '
{'BoundingRectangle': {'EastBoundingCoordinate': '180.0',
                       'NorthBoundingCoordinate': '55.0',
                       'SouthBoundingCoordinate': '-55.0',
                       'WestBoundingCoordinate': '-180.0'},
 'CoordinateSystem': 'CARTESIAN'}

We can see that the spatial coordinates of this collection (the fourth in the list, results[3]) intersect our search box. Also note that spatial filtering yields more refined search results, with only 17 collections returned compared with 61 for the unfiltered search.
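
The intersection can also be confirmed programmatically. The sketch below compares a collection's BoundingRectangle with the search box. The helper intersects_bounding_box is hypothetical; it assumes the geometry contains a BoundingRectangle like the one printed above (other collections may describe their extent differently) and that neither rectangle crosses the antimeridian.

```python
def intersects_bounding_box(geometry, search_box):
    """Check whether a BoundingRectangle overlaps a 'W,S,E,N' search box."""
    west, south, east, north = (float(v) for v in search_box.split(','))
    rect = geometry['BoundingRectangle']
    rect_west = float(rect['WestBoundingCoordinate'])
    rect_south = float(rect['SouthBoundingCoordinate'])
    rect_east = float(rect['EastBoundingCoordinate'])
    rect_north = float(rect['NorthBoundingCoordinate'])
    # Two rectangles overlap when they overlap on both axes.
    return (rect_west <= east and rect_east >= west
            and rect_south <= north and rect_north >= south)


# Geometry copied from the GEDI01_B output above.
sample_geometry = {
    'BoundingRectangle': {'EastBoundingCoordinate': '180.0',
                          'NorthBoundingCoordinate': '55.0',
                          'SouthBoundingCoordinate': '-55.0',
                          'WestBoundingCoordinate': '-180.0'},
    'CoordinateSystem': 'CARTESIAN',
}

print(intersects_bounding_box(sample_geometry, '-42,10,42,20'))
# prints: True
```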

Searching by Additional Attributes

MAAP adds extra metadata, known as additional attributes, to the collection metadata to support the unique search needs of the aboveground terrestrial carbon research community. The available additional attributes are:

variable name           additional attribute name      data type
site_name               Site Name                      string
data_format             Data Format                    string
track_number            Track Number                   float
polarization            Polarization                   string
dataset_status          Dataset Status                 string
spat_res                Spatial Resolution             float
samp_freq               Sampling Frequency             float
band_ctr_freq           Band Center Frequency          float
freq_band_name          Frequency Band Name            string
swath_width             Swath Width                    float
field_view              Field of View                  float
laser_foot_diam         Laser Footprint Diameter       float
pass_number             Pass Number                    int
revisit_time            Revisit Time                   float
flt_number              Flight Number                  int
number_plots            Number of Plots                int
plot_area               Plot Area                      float
br_ht                   Breast Height                  float
ret_per_pulse           Returns Per Pulse              string
min_diam_meas           Minimum Diameter Measured      float
flt_alt                 Flight Altitude                float
hdg                     Heading                        float
swath_slant_rg_st_ang   Swath Slant Range Start Angle  float
azm_rg_px_spacing       Azimuth Range Pixel Spacing    float
slant_rg_px_spacing     Slant Range Pixel Spacing      float
acq_type                Acquisition Type               string
orbit_dir               Orbit Direction                string
band_ctr_wavelength     Band Center Wavelength         float
swath_slant_rg_end_ang  Swath Slant Range End Angle    float

For example, if we are only interested in data from the Lope National Park Gabon site, we can pass the site_name variable as a search parameter:

[9]:
results = maap.searchCollection(site_name="Lope National Park Gabon")
pprint(f'Got {len(results)} results')
'Got 5 results'
[10]:
pprint(results[0],depth=2)
{'Collection': {'AdditionalAttributes': {...},
                'ArchiveCenter': 'MAAP Data Management Team',
                'Campaigns': {...},
                'CollectionDataType': 'SCIENCE_QUALITY',
                'CollectionState': 'COMPLETE',
                'Contacts': {...},
                'DataSetId': 'AfriSAR UAVSAR Geocoded Covariance Matrix '
                             'product Generated Using NISAR Tools',
                'Description': 'The Geocoded Covariance Matrix dataset is the '
                               '4x4 Native Covariance Matrix geocoded to a '
                               'spatial resolution of 25m using cubic '
                               'interpolation methods.  These covariance '
                               'matrix datasets provides the variability '
                               'between retrievals for each co-registered '
                               'single-look-complex (SLC) polarization and '
                               'provide inferences on the scattering '
                               'characteristics of a target.  The SLC data was '
                               'collected from repeat-pass flights of the '
                               'Uninhabited Aerial Vehicle Synthetic Aperture '
                               'Radar (UAVSAR) instrument during the AfriSAR '
                               'field campaign over the Gabonese forest in '
                               'February-March 2016.  The AfriSAR campaign was '
                               'a collaborative airborne mission between the '
                               'National Aeronautics and Space Administration, '
                               'the European Space Agency and the Gabonese '
                               'Space Agency collecting radar, lidar and field '
                               'measurements in support of future space borne '
                               'missions for biomass research.',
                'InsertTime': '2020-10-17T20:32:38.676Z',
                'LastUpdate': '2020-10-17T20:32:38.676Z',
                'LongName': 'Not provided',
                'OnlineAccessURLs': {...},
                'OnlineResources': {...},
                'Orderable': 'false',
                'Platforms': {...},
                'ProcessingLevelDescription': 'Geocoded and mapped to uniform '
                                              'spatial grid scales',
                'ProcessingLevelId': '3',
                'RevisionDate': '2019-04-08T21:02:00.000Z',
                'ScienceKeywords': {...},
                'ShortName': 'AfriSAR_UAVSAR_Geocoded_Covariance',
                'Spatial': {...},
                'Temporal': {...},
                'VersionId': '1',
                'Visible': 'true'},
 'concept-id': 'C1200109238-NASA_MAAP',
 'format': 'application/echo10+xml',
 'revision-id': '5'}

The returned results include only collections that have been tagged as part of the Lope National Park Gabon research site.