Introduction to the GEDI02_B Product

The Global Ecosystem Dynamics Investigation (GEDI) mission aims to characterize ecosystem structure and dynamics to enable radically improved quantification and understanding of the Earth’s carbon cycle and biodiversity. The GEDI instrument produces high resolution laser ranging observations of the 3-dimensional structure of the Earth. GEDI is attached to the International Space Station and collects data globally between 51.6° N and 51.6° S latitudes at the highest resolution and densest sampling of any light detection and ranging (lidar) instrument in orbit to date.

The purpose of the GEDI Level 2B Canopy Cover and Vertical Profile Metrics product (GEDI02_B) is to extract biophysical metrics from each GEDI waveform. These metrics are based on the directional gap probability profile derived from the L1B waveform. Metrics provided include canopy cover, Plant Area Index (PAI), Plant Area Volume Density (PAVD), and Foliage Height Diversity (FHD). The GEDI02_B product is provided in HDF5 format and has a spatial resolution (average footprint) of 25 meters.
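
To make one of these metrics concrete: FHD is commonly described as a Shannon diversity index over the vertical plant area profile, FHD = -sum(p_i * ln(p_i)), where p_i is the share of plant area in height bin i. The sketch below assumes that definition. The function name and the idea of passing a plain list of per-bin values are illustrative assumptions, not the product's actual processing code; the GEDI L2B algorithm documentation remains the authoritative description.

```python
import math


def foliage_height_diversity(profile):
    """Shannon diversity of a vertical plant-area profile (hypothetical helper).

    `profile` is a sequence of non-negative per-height-bin values,
    for example a PAVD profile for a single shot.
    """
    total = sum(profile)
    if total <= 0:
        # No vegetation signal at all: define diversity as zero
        return 0.0
    # Skip empty bins, since p * ln(p) tends to 0 as p tends to 0
    proportions = [value / total for value in profile if value > 0]
    return -sum(p * math.log(p) for p in proportions)
```

Under this definition a profile concentrated in a single height bin has an FHD of zero, and a profile spread evenly over n bins has an FHD of ln(n).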

The GEDI02_B data product contains 105 layers for each of the eight beam ground transects (the tracks of laser footprints located on the land surface). Datasets provided include precise latitude, longitude, elevation, height, canopy cover, and vertical profile metrics. Additional information for the layers can be found in the GEDI Level 2B Data Dictionary (https://lpdaac.usgs.gov/documents/587/gedi_l2b_dictionary_P001_v1.html).

Opening a GEDI02_B file and exploring its datasets

This tutorial opens and examines a GEDI02_B file and demonstrates how to plot a subset of its points in order to understand the spatial extent of the file.

  • Author: Aimee Barciauskas
  • Date: December 16, 2022

Resources used:

  • Getting Started with GEDI L2B Version 2 Data in Python (LPDAAC Tutorial)

[1]:
import boto3
import folium
import h5py
from os.path import exists
from pystac_client import Client
import pandas as pd
[2]:
# STAC API root URL
URL = 'https://stac.maap-project.org/'
client = Client.open(URL)
[3]:
collection = 'GEDI02_B'
search = client.search(
    max_items = 1,
    collections = collection,
)
[4]:
item = search.get_all_items()[0]
item

Access and open the GEDI02_B HDF5 file

Below, we break down the S3 URL into its bucket and key components so that the file can be downloaded from S3. Note that this only works on platforms configured with access to the GEDI directory of the NASA MAAP Data Store (s3://nasa-maap-data-store/GEDI*). To minimize data transfer costs, it is strongly preferred that any transfer of data is done on AWS services in AWS region us-west-2.
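
The bucket and key split described above can also be done with the standard library's `urllib.parse` instead of manual string slicing. This is a minimal sketch that assumes the asset href uses the `s3://bucket/key` form; the helper name `split_s3_url` is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_url(href):
    """Split an s3://bucket/key URL into (bucket, key, filename)."""
    parsed = urlparse(href)
    if parsed.scheme != "s3":
        raise ValueError(f"expected an s3:// URL, got: {href}")
    bucket = parsed.netloc              # text between 's3://' and the next '/'
    key = parsed.path.lstrip("/")       # object key without the leading slash
    filename = key.rsplit("/", 1)[-1]   # last path segment
    return bucket, key, filename
```

Rejecting any scheme other than `s3` makes the failure explicit if the catalog ever returns an HTTPS href.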

[5]:
href = item.assets['data'].href
path_parts = href.split('/')
bucket = path_parts[2]
key = href.split(bucket)[1][1:]
filename = path_parts[-1]
bucket, key, filename
[5]:
('nasa-maap-data-store',
 'file-staging/nasa-map/GEDI02_B___002/2021.10.26/GEDI02_B_2021299195557_O16268_02_T06679_02_003_01_V002.h5',
 'GEDI02_B_2021299195557_O16268_02_T06679_02_003_01_V002.h5')
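
The filename itself carries metadata. The sketch below assumes the usual GEDI granule naming layout, in which the third underscore-separated field is the acquisition time as year, day of year, and HHMMSS, and the fourth field is the orbit number prefixed with `O`. That layout is an assumption here rather than something this notebook states, although the parsed date is consistent with the `2021.10.26` directory in the key above. The helper name is hypothetical.

```python
from datetime import datetime


def parse_gedi_filename(filename):
    """Return (acquisition datetime, orbit number) from a GEDI granule name.

    Assumes names shaped like
    GEDI02_B_<YYYYDDDHHMMSS>_O<orbit>_..._V002.h5
    """
    parts = filename.split("_")
    # parts[2] holds year, day of year, and time of day, e.g. '2021299195557'
    acquired = datetime.strptime(parts[2], "%Y%j%H%M%S")
    # parts[3] holds the orbit number with a leading 'O', e.g. 'O16268'
    orbit = int(parts[3][1:])
    return acquired, orbit
```

For example, `parse_gedi_filename(filename)` can be used to label plots with the acquisition date instead of the raw granule name.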
[14]:
# Download the file only if it is not already present locally
s3 = boto3.client('s3')

if exists(filename):
    print("file already downloaded")
else:
    s3.download_file(bucket, key, filename)
    print("finished downloading")
file already downloaded
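
The check-then-download pattern above can be wrapped in a small helper so that it is safe to rerun. This is a sketch only: `download_once` and its `fetch` parameter are hypothetical names, and the actual transfer is passed in as a callable so that the helper does not depend on any particular client.

```python
from pathlib import Path


def download_once(filename, fetch):
    """Call fetch(filename) only when filename is not already on disk.

    Returns True if a download was triggered, False if the file existed.
    """
    if Path(filename).exists():
        return False
    fetch(filename)
    return True
```

With the variables defined earlier, a call could look like `download_once(filename, lambda name: s3.download_file(bucket, key, name))`.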
[15]:
gediL2B = h5py.File(filename, 'r')
[16]:
beamNames = [g for g in gediL2B.keys() if g.startswith('BEAM')]
beamNames
[16]:
['BEAM0000',
 'BEAM0001',
 'BEAM0010',
 'BEAM0011',
 'BEAM0101',
 'BEAM0110',
 'BEAM1000',
 'BEAM1011']

List all the science data sets

Listing the science datasets (SDS) gives a better understanding of what type of data is stored in this file.

[ ]:
beam = beamNames[0]
gediL2B_objs = []
gediL2B.visit(gediL2B_objs.append)                                           # Retrieve list of datasets
gediSDS = [o for o in gediL2B_objs if isinstance(gediL2B[o], h5py.Dataset)]  # Search for relevant SDS inside data file
[i for i in gediSDS if beam in i][0:20]
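
The filter above keeps any path that contains the beam name anywhere in the string. A slightly stricter variant, sketched below, matches only paths that start with the beam group. The helper name and the sample paths in the usage note are illustrative assumptions.

```python
def datasets_for_beam(paths, beam):
    """Return the dataset paths that sit under the given beam group."""
    prefix = beam + "/"
    # Prefix match avoids picking up paths that merely mention the beam name
    return [path for path in paths if path.startswith(prefix)]
```

For example, `datasets_for_beam(gediSDS, beam)[0:20]` would list the first 20 datasets of the selected beam.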

Plot a subset of the points

First, we create a subset of points to facilitate plotting.

[19]:
lonSample, latSample, shotSample, qualitySample, beamSample = [], [], [], [], []  # Set up lists to store data

# Open the SDS
lats = gediL2B[f'{beamNames[0]}/geolocation/lat_lowestmode'][()]
lons = gediL2B[f'{beamNames[0]}/geolocation/lon_lowestmode'][()]
shots = gediL2B[f'{beamNames[0]}/geolocation/shot_number'][()]
quality = gediL2B[f'{beamNames[0]}/l2b_quality_flag'][()]

# Take every 1000th shot and append to list
for i in range(len(shots)):
    if i % 1000 == 0:
        shotSample.append(str(shots[i]))
        lonSample.append(lons[i])
        latSample.append(lats[i])
        qualitySample.append(quality[i])
        beamSample.append(beamNames[0])

# Write all of the sample shots to a dataframe
latslons = pd.DataFrame({'Beam': beamSample, 'Shot Number': shotSample, 'Longitude': lonSample, 'Latitude': latSample,
                         'Quality Flag': qualitySample})
latslons
[19]:
Beam Shot Number Longitude Latitude Quality Flag
0 BEAM0000 162680000200086772 71.176248 -0.652634 0
1 BEAM0000 162680000200087772 71.475272 -0.228396 0
2 BEAM0000 162680000200088772 71.772920 0.194259 0
3 BEAM0000 162680000200089772 72.070465 0.616924 0
4 BEAM0000 162680000200090772 72.369240 1.040828 0
... ... ... ... ... ...
89 BEAM0000 162680000200175772 102.130037 35.159365 0
90 BEAM0000 162680000200176772 102.593719 35.509484 0
91 BEAM0000 162680000200177772 103.061891 35.857398 0
92 BEAM0000 162680000200178772 103.522337 36.194505 0
93 BEAM0000 162680000200179772 103.755363 36.363258 0

94 rows × 5 columns
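
Taking every 1000th shot with an index loop is equivalent to a strided slice. The sketch below shows the idea on a plain Python sequence; the helper name `every_nth` is hypothetical.

```python
def every_nth(values, step):
    """Return every `step`-th element of a sequence, starting at index 0."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # A strided slice keeps indices 0, step, 2*step, ...
    return values[::step]
```

The same slice syntax works directly on the NumPy arrays read from the file, for example `lats[::1000]`, which selects the same elements as the `i % 1000 == 0` test in the loop.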

[22]:
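# Center the map on the mean location of the sampled shots
# (named gedi_map so that the built-in `map` is not shadowed)
gedi_map = folium.Map(
    location=[
        latslons.Latitude.mean(),
        latslons.Longitude.mean()
    ], zoom_start=3, control_scale=True
)

# Add one circle per sampled shot, with the shot number as a popup
for index, location_info in latslons.iterrows():
    folium.Circle(
        [location_info["Latitude"], location_info["Longitude"]],
        popup=f"Shot Number: {location_info['Shot Number']}"
    ).add_to(gedi_map)

gedi_map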