GEDI S3 Bucket Access at LPDAAC (BETA)

Authors: Alex Mandel (Development Seed), Brian Freitag (NASA MSFC), Jamison French (Development Seed)

Description: In this tutorial, we demonstrate how to transform HTTPS links into their corresponding S3 links to retrieve GEDI data hosted by the Land Processes Distributed Active Archive Center (LP DAAC).

This tutorial demonstrates a temporary workaround with the expectation that direct access links for LPDAAC GEDI data will eventually be available through NASA CMR.

Run This Notebook

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the “Getting started with the MAAP” section of our documentation.

Disclaimer: It is highly recommended to run this tutorial within MAAP’s ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.

Importing Packages

If the packages below are not already installed, uncomment and run the following cell.

[ ]:
# %pip install h5py fsspec s3fs --quiet
[2]:
import h5py
import boto3
import botocore
import fsspec
from maap.maap import MAAP

maap = MAAP(maap_host="api.maap-project.org")

Searching the Data

We’ll start by gathering a sample list of granules from the GEDI L2A collection. The HTTPS links we’re after are nested within the granule object.

[3]:
results = maap.searchGranule(
    concept_id="C1908348134-LPDAAC_ECS",  # GEDI-L2A
    cmr_host="cmr.earthdata.nasa.gov",
    limit=10,
)

# Download URL of GEDI L2A product
print(results[0].getDownloadUrl())
https://e4ftl01.cr.usgs.gov//GEDI_L1_L2/GEDI/GEDI02_A.002/2019.04.18/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5
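
Because the search returns several granules, it can be convenient to gather every HTTPS link before converting any of them. The following is a minimal sketch, assuming each result exposes `getDownloadUrl()` as in the cell above; `collect_download_urls` is a hypothetical helper name and is not part of maap-py.

```python
def collect_download_urls(granules, suffix=".h5"):
    """Return the download URL of every granule whose link ends with `suffix`.

    `granules` is any iterable of objects with a getDownloadUrl() method
    (an assumption based on the search results used in this tutorial).
    """
    urls = []
    for granule in granules:
        url = granule.getDownloadUrl()
        # Skip granules with no link or with a link to a different file type.
        if url and url.endswith(suffix):
            urls.append(url)
    return urls
```

In the notebook this could be called as `collect_download_urls(results)`.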

Converting the Paths

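We’ll create a helper function that converts an LP DAAC HTTPS link into the corresponding AWS S3 link. The S3 key is built from the collection directory (for example, GEDI02_A.002), the granule name without its .h5 extension, and the file name itself.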

[4]:
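def lpdaac_gedi_https_to_s3(url):
    dir_comps = url.split("/")
    # Index from the end so the result does not depend on the number of leading slashes.
    collection, filename = dir_comps[-3], dir_comps[-1]
    granule_id = filename[:-3] if filename.endswith(".h5") else filename  # drop the literal ".h5" suffix
    return f"s3://lp-prod-protected/{collection}/{granule_id}/{filename}"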


# Sample
lpdaac_gedi_https_to_s3(results[0].getDownloadUrl())
[4]:
's3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5'
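
The mapping from HTTPS path to S3 key can also be expressed with the standard library's URL and path utilities. This is a sketch under stated assumptions rather than an official conversion: the bucket name `lp-prod-protected` and the `<collection>/<granule>/<file>` key layout are taken from the sample output above, and `gedi_https_to_s3_sketch` is a hypothetical name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def gedi_https_to_s3_sketch(url, bucket="lp-prod-protected"):
    """Map an LP DAAC GEDI HTTPS link to s3://<bucket>/<collection>/<granule>/<file>."""
    # Drop empty segments so a doubled slash after the host does not shift positions.
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(parts) < 3:
        raise ValueError(f"Unexpected GEDI URL layout: {url}")
    # Expected tail of the path: <collection>/<date>/<file>
    collection, filename = parts[-3], parts[-1]
    # stem removes only the final extension (".h5"), leaving the granule name.
    granule_id = PurePosixPath(filename).stem
    return f"s3://{bucket}/{collection}/{granule_id}/{filename}"
```

Raising a `ValueError` on an unexpected layout makes a malformed link fail at conversion time instead of producing an S3 path that does not exist.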

Accessing the Data

We’ll fetch temporary S3 credentials for LPDAAC data and then view the BEAM groups of the first GEDI link in our results.

[5]:
credentials = maap.aws.earthdata_s3_credentials(
    'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'
)

s3 = fsspec.filesystem(
    "s3",
    key=credentials['accessKeyId'],
    secret=credentials['secretAccessKey'],
    token=credentials['sessionToken']
)
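
Because the credentials are temporary, a long-running session may need to request new ones. The sketch below checks whether a credentials dictionary is close to expiring. It assumes the response also carries an `expiration` field holding an ISO 8601 timestamp with a UTC offset; that field and its format are assumptions not shown in this tutorial, and `credentials_expired` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def credentials_expired(credentials, margin_minutes=5, now=None):
    """Return True if the credentials expire within `margin_minutes` of `now`."""
    # ASSUMPTION: 'expiration' is an ISO 8601 string such as '2024-01-01 12:00:00+00:00'.
    expires_at = datetime.fromisoformat(credentials["expiration"])
    if expires_at.tzinfo is None:
        # Treat a timestamp without an offset as UTC (also an assumption).
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return expires_at - now <= timedelta(minutes=margin_minutes)
```

If the check returns `True`, the credentials request and the `fsspec.filesystem` call from the cell above can simply be repeated.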
[6]:
s3_url = lpdaac_gedi_https_to_s3(results[0].getDownloadUrl())
with s3.open(s3_url, "rb") as f, h5py.File(f, "r") as gedi_data:
    print(gedi_data.keys())
<KeysViewHDF5 ['BEAM0000', 'BEAM0001', 'BEAM0010', 'BEAM0011', 'BEAM0101', 'BEAM0110', 'BEAM1000', 'BEAM1011', 'METADATA']>
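
The top-level keys mix the beam groups with a `METADATA` group. A small sketch for separating out the beam names, assuming beam groups are exactly those whose names start with `BEAM` as in the output above (`beam_group_names` is a hypothetical helper):

```python
def beam_group_names(keys):
    """Return the sorted names of the beam groups among top-level HDF5 keys."""
    # Keep only groups named like 'BEAM0000'; this leaves out 'METADATA'.
    return sorted(name for name in keys if name.startswith("BEAM"))
```

Inside the `with` block above, `beam_group_names(gedi_data.keys())` would give the list of groups to iterate over.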