GEDI_L2B Search and Visualize

Authors: Samuel Ayers (UAH), Sumant Jha (MSFC/USRA), Anish Bhusal (UAH), Alex Mandel (DevSeed), Aimee Barciauskas (DevSeed)

Date: December 19, 2022

Description: In this tutorial, we will use the integrated Earthdata search function in MAAP Algorithm Development Environment (ADE) to search for GEDI L2B data for an area of interest. We will then download some of this data and read its attributes in preparation for some analysis. We will perform a spatial subset on the data to reduce data volumes, and then make some basic plots of our data.

Run This Notebook

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the “Getting started with the MAAP” section of our documentation.

Disclaimer: it is highly recommended to run a tutorial within MAAP’s ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.

About the Data

GEDI L2B Canopy Cover and Vertical Profile Metrics Data Global Footprint Level V002

This dataset provides Global Ecosystem Dynamics Investigation (GEDI) Level 2 (L2) data, which has the purpose of extracting biophysical metrics and consists of Canopy Cover and Vertical Profile Metrics. These metrics are derived from the L1B waveform, and include canopy cover, Plant Area Index (PAI), Plant Area Volume Density (PAVD), and Foliage Height Diversity (FHD). GEDI is attached to the International Space Station (ISS) and collects data globally between 51.6° N and 51.6° S latitudes at the highest resolution and densest sampling of any light detection and ranging (lidar) instrument in orbit to date; specifically, GEDI L2B data has a spatial resolution of 25m. (Source: GEDI L2B CMR Search)

Additional Resources

Importing and Installing Packages

We will begin by installing any packages we need and importing the packages that we will use.


  • geopandas

  • folium

[ ]:
# Uncomment the following lines to install these packages if you haven't already.
# !pip install geopandas
# !pip install folium
from maap.maap import MAAP
maap = MAAP(maap_host='')
import geopandas as gpd
import folium
import h5py
import pandas
import matplotlib
import matplotlib.pyplot as plt
import shapely
import os

Search Data Using an AOI

We will search and download GEDI L2B data using the bounding box of a vector AOI. Firstly, an AOI over the Shenandoah National Park will be created and then we will plot its location on a map.

# Using bounding coordinates to create a polygon around Shenandoah National Park
lon_coords = [-78.32129105072025, -78.04618813890727, -78.72985973163064, -79.0158578082679, -78.32129105072025]
lat_coords = [38.88703610703791, 38.74909216350823, 37.88789051477522, 38.03177640342157, 38.88703610703791]

polygon_geom = shapely.geometry.polygon.Polygon(zip(lon_coords, lat_coords))
crs = 'epsg:4326'
AOI = gpd.GeoDataFrame(index=[0], crs=crs, geometry=[polygon_geom])

We can get the bounding box of the AOI so we can use it as a spatial limit on our data search. GeoPandas has a function for returning the spatial coordinates of a bounding box:

# Get the bounding box of the shp
bbox = AOI.bounds
# print the bounding box coords
print('minx = ', bbox['minx'][0])
print('miny = ', bbox['miny'][0])
print('maxx = ', bbox['maxx'][0])
print('maxy = ', bbox['maxy'][0])
minx =  -79.0158578082679
miny =  37.88789051477522
maxx =  -78.04618813890727
maxy =  38.88703610703791

Let’s look at our AOI on an interactive map using folium.

m = folium.Map(location=[38.5, -78], zoom_start=9, tiles='CartoDB positron')
geo_j = folium.GeoJson(data=AOI, style_function=lambda x: {'fillColor': 'orange'})
Make this Notebook Trusted to load map: File -> Trust Notebook

Search using the EarthData Search Integration

Search Data: To search GEDI data we can use the EarthData search integration in the MAAP ADE. Open the Earthdata search toolbar by clicking on the following:


This will open up the EarthData search interface in a new tab. We can use the search bar to search GEDI L2B data. By entering “L2B” in the search bar, we can see the relevant GEDI data is filtered. Click on the dataset to get more details.


While we have been searching for data, the MAAP ADE has been keeping a track of our search parameters. This means that we can easily insert our search for GEDI data straight into our notebook.


Inspect and Filter through the Data

This gives us our search parameters in our notebook using the GEDI dataset ID. The default limit for the number of returned results is 1000. Running this will produce 1000 results, but we can look at the first one by indexing the list of returned results. This is what the data entry looks like. You can see a large amount of metadata for the file along with the URL for where this specific file is stored.

data = maap.searchGranule(cmr_host='', concept_id="C1908350066-LPDAAC_ECS", limit=1000)[0]
{'concept-id': 'G2011069580-LPDAAC_ECS',
 'collection-concept-id': 'C1908350066-LPDAAC_ECS',
 'revision-id': '10',
 'format': 'application/echo10+xml',
 'Granule': {'GranuleUR': 'SC:GEDI02_B.002:2433465280',
  'InsertTime': '2021-02-22T19:16:30.675Z',
  'LastUpdate': '2021-09-16T13:36:07.965Z',
  'Collection': {'DataSetId': 'GEDI L2B Canopy Cover and Vertical Profile Metrics Data Global Footprint Level V002'},
  'DataGranule': {'SizeMBDataGranule': '16.5413',
   'ProducerGranuleId': 'GEDI02_B_2019108002012_O01959_01_T03909_02_003_01_V002.h5',
   'DayNightFlag': 'UNSPECIFIED',
   'ProductionDateTime': '2021-02-21T14:45:08Z'},
  'PGEVersionClass': {'PGEVersion': '003'},
  'Temporal': {'RangeDateTime': {'BeginningDateTime': '2019-04-18T00:20:12.000000Z',
    'EndingDateTime': '2019-04-18T01:52:53.000000Z'}},
  'Spatial': {'HorizontalSpatialDomain': {'Geometry': {'GPolygon': {'Boundary': {'Point': [{'PointLongitude': '80.2890335089',
         'PointLatitude': '-4.6168623465'},
        {'PointLongitude': '82.4542052313', 'PointLatitude': '-1.5568021223'},
        {'PointLongitude': '83.6402279084', 'PointLatitude': '0.1277767863'},
        {'PointLongitude': '83.6814118126', 'PointLatitude': '0.0987901632'},
        {'PointLongitude': '82.4953794073', 'PointLatitude': '-1.5858244175'},
        {'PointLongitude': '80.3302671214',
         'PointLatitude': '-4.6459691487'}]}}}}},
  'OrbitCalculatedSpatialDomains': {'OrbitCalculatedSpatialDomain': {'StartOrbitNumber': '1959',
    'StopOrbitNumber': '1959'}},
  'MeasuredParameters': {'MeasuredParameter': {'ParameterName': 'Level2B'}},
  'Platforms': {'Platform': {'ShortName': 'ISS',
    'Instruments': {'Instrument': {'ShortName': 'GEDI',
      'Sensors': {'Sensor': {'ShortName': 'LIDAR'}}}}}},
  'Campaigns': {'Campaign': {'ShortName': 'GEDI'}},
  'AdditionalAttributes': {'AdditionalAttribute': [{'Name': 'identifier_product_doi',
     'Values': {'Value': '10.5067/GEDI/GEDI02_B.002'}},
    {'Name': 'identifier_product_doi_authority',
     'Values': {'Value': ''}},
    {'Name': 'Reference_Ground_Track', 'Values': {'Value': '3909'}},
    {'Name': 'SEGMENT_NUMBER', 'Values': {'Value': '01'}}]},
  'InputGranules': {'InputGranule': ['GEDI01_B_2019108002012_O01959_01_T03909_02_005_01_V002.h5',
  'OnlineAccessURLs': {'OnlineAccessURL': {'URL': '',
    'URLDescription': 'GEDI02_B_2019108002012_O01959_01_T03909_02_003_01_V002.h5. MimeType: application/x-hdfeos',
    'MimeType': 'application/x-hdfeos'}},
  'OnlineResources': {'OnlineResource': [{'URL': ' ',
     'Description': 'The Landing Page for this file may be accessed directly from this link',
     'Type': 'DOI',
     'MimeType': 'text/html'},
    {'URL': '',
     'Description': 'This Browse file may be downloaded directly from this link',
     'Type': 'BROWSE',
     'MimeType': 'image/jpeg'},
    {'URL': '',
     'Description': 'This Metadata file may be downloaded directly from this link',
     'MimeType': 'text/xml'}]},
  'Orderable': 'true',
  'DataFormat': 'HDF5',
  'Visible': 'true'}}

So far, this search function requests all of the GEDI data but we can add a spatial subset filter using our AOI from above to limit the results. Adding a spatial filter returns 259 GEDI L2B files that intersect with our AOI.

data_aoi = maap.searchGranule(cmr_host='', concept_id="C1908350066-LPDAAC_ECS", bounding_box="-79.0158578082679,37.88789051477522,-78.04618813890727,38.887036107037915", limit=1000)

This is more data than we need, so let’s look at the contents of a single GEDI file. Firstly we need to bring it from the server side (S3) to our local side. This can be done using the MAAP function getData. We’ll create a new data directory, and then we will pull the 7th file in our search results.

# set data directory
dataDir = './data'

# check if directory exists -> if directory doesn't exist, directory is created
if not os.path.exists(dataDir):
# pulling the 7th file into the new directory
gedi_data = data_aoi[6].getData(dataDir)

GEDI data has 8 beams. So, we will check that all beams are in our file and print a list of the available beams.

gedi_h5_file = h5py.File(gedi_data, 'r')
gedi_keys = list(gedi_h5_file.keys())
gedi_beams = ['BEAM0000', 'BEAM0001', 'BEAM0010', 'BEAM0011', 'BEAM0101', 'BEAM0110', 'BEAM1000', 'BEAM1011']
gedi_beams_lst = []
for gedi_beam_name in gedi_keys:
    if gedi_beam_name in gedi_beams:
['BEAM0000', 'BEAM0001', 'BEAM0010', 'BEAM0011', 'BEAM0101', 'BEAM0110', 'BEAM1000', 'BEAM1011']

For each beam, we need to get all of the data and metrics associated with it. For this, we have a function that will gather all of the metrics we want and put them into a geopandas dataframe:

def get_gedi_df(gedi_h5_file, gedi_beam_name):
    gedi_beam = gedi_h5_file[gedi_beam_name]

    # Get location info.
    gedi_beam_geoloc = gedi_beam['geolocation']
    # Get land cover data.
    gedi_beam_landcover = gedi_beam['land_cover_data']

    gedi_beam_df = pandas.DataFrame(
            {'elevation_bin0'         : gedi_beam_geoloc['elevation_bin0'],
             'elevation_lastbin'      : gedi_beam_geoloc['elevation_lastbin'],
             'height_bin0'            : gedi_beam_geoloc['height_bin0'],
             'height_lastbin'         : gedi_beam_geoloc['height_lastbin'],
             'shot_number'            : gedi_beam_geoloc['shot_number'],
             'solar_azimuth'          : gedi_beam_geoloc['solar_azimuth'],
             'solar_elevation'        : gedi_beam_geoloc['solar_elevation'],
             'latitude_bin0'          : gedi_beam_geoloc['latitude_bin0'],
             'latitude_lastbin'       : gedi_beam_geoloc['latitude_lastbin'],
             'longitude_bin0'         : gedi_beam_geoloc['longitude_bin0'],
             'longitude_lastbin'      : gedi_beam_geoloc['longitude_lastbin'],
             'degrade_flag'           : gedi_beam_geoloc['degrade_flag'],
             'digital_elevation_model': gedi_beam_geoloc['digital_elevation_model'],
             'landsat_treecover'      : gedi_beam_landcover['landsat_treecover'],
             'modis_nonvegetated'     : gedi_beam_landcover['modis_nonvegetated'],
             'modis_nonvegetated_sd'  : gedi_beam_landcover['modis_nonvegetated_sd'],
             'modis_treecover'        : gedi_beam_landcover['modis_treecover'],
             'modis_treecover_sd'     : gedi_beam_landcover['modis_treecover_sd'],
             'beam'                   : gedi_beam['beam'],
             'cover'                  : gedi_beam['cover'],
             'master_frac'            : gedi_beam['master_frac'],
             'master_int'             : gedi_beam['master_int'],
             'num_detectedmodes'      : gedi_beam['num_detectedmodes'],
             'omega'                  : gedi_beam['omega'],
             'pai'                    : gedi_beam['pai'],
             'pgap_theta'             : gedi_beam['pgap_theta'],
             'pgap_theta_error'       : gedi_beam['pgap_theta_error'],
             'rg'                     : gedi_beam['rg'],
             'rh100'                  : gedi_beam['rh100'],
             'rhog'                   : gedi_beam['rhog'],
             'rhog_error'             : gedi_beam['rhog_error'],
             'rhov'                   : gedi_beam['rhov'],
             'rhov_error'             : gedi_beam['rhov_error'],
             'rossg'                  : gedi_beam['rossg'],
             'rv'                     : gedi_beam['rv'],
             'sensitivity'            : gedi_beam['sensitivity'],
             'stale_return_flag'      : gedi_beam['stale_return_flag'],
             'surface_flag'           : gedi_beam['surface_flag'],
             'l2a_quality_flag'       : gedi_beam['l2a_quality_flag'],
             'l2b_quality_flag'       : gedi_beam['l2b_quality_flag']})

    gedi_beam_gdf = gpd.GeoDataFrame(gedi_beam_df, crs='EPSG:4326',
    return gedi_beam_gdf

To access the data with this function, we can call the function for each beam id that we have:

BEAM0000 = get_gedi_df(gedi_h5_file, 'BEAM0000')
BEAM0001 = get_gedi_df(gedi_h5_file, 'BEAM0001')
BEAM0010 = get_gedi_df(gedi_h5_file, 'BEAM0010')
BEAM0011 = get_gedi_df(gedi_h5_file, 'BEAM0011')
BEAM0101 = get_gedi_df(gedi_h5_file, 'BEAM0101')
BEAM0110 = get_gedi_df(gedi_h5_file, 'BEAM0110')
BEAM1000 = get_gedi_df(gedi_h5_file, 'BEAM1000')
BEAM1011 = get_gedi_df(gedi_h5_file, 'BEAM1011')

Now we can look at the data in one of the dataframes (beams). We can see that there are 108583 GEDI shots (108583 entries in each column).

print('number of rows = ', len(BEAM0000))
   elevation_bin0  elevation_lastbin  height_bin0  height_lastbin  \
0      533.320205         532.870899 -9999.000000    -9999.000000
1      533.610735         533.161429 -9999.000000    -9999.000000
2      835.586133         790.954687    42.284134       -2.373195
3      822.361914         780.875651    38.280968       -3.229356
4      788.459662         760.602475    25.957445       -1.915897

         shot_number  solar_azimuth  solar_elevation  latitude_bin0  \
0  25370000300165164     -34.511044       -10.500773      51.821288
1  25370000300165165     -34.510292       -10.501078      51.821282
2  25370000300165166     -34.509689       -10.501341      51.821258
3  25370000300165167     -34.508926       -10.501648      51.821253
4  25370000300165168     -34.508163       -10.501955      51.821249

   latitude_lastbin  longitude_bin0  ...  rhov  rhov_error  rossg  \
0         51.821288     -127.096372  ...   0.6     -9999.0    0.5
1         51.821282     -127.095550  ...   0.6     -9999.0    0.5
2         51.821260     -127.094876  ...   0.6     -9999.0    0.5
3         51.821255     -127.094048  ...   0.6     -9999.0    0.5
4         51.821250     -127.093210  ...   0.6     -9999.0    0.5

            rv  sensitivity  stale_return_flag  surface_flag  \
0 -9999.000000   -14.502214                  1             1
1 -9999.000000    -8.689309                  1             1
2  3615.732178     0.965246                  0             1
3  4290.229980     0.969620                  0             1
4  2557.771240     0.943713                  0             1

   l2a_quality_flag  l2b_quality_flag                     geometry
0                 0                 0  POINT (-127.09637 51.82129)
1                 0                 0  POINT (-127.09555 51.82128)
2                 1                 1  POINT (-127.09485 51.82126)
3                 1                 1  POINT (-127.09403 51.82126)
4                 1                 1  POINT (-127.09320 51.82125)

[5 rows x 41 columns]
number of rows =  108583

Create the Subset and Display the Data

Before displaying this data we can do a spatial subset to remove the data outside of our AOI by doing a spatial subset. We use the Geopandas clip function to clip out the GEDI data based on the extent of our AOI. This reduces the size of the dataframe from 108583 rows to 508.

BEAM0000_sub = gpd.clip(BEAM0000, AOI)

We can display this subset of data over our AOI extent.

m = folium.Map(location=[38.5, -78], zoom_start=9, tiles='CartoDB positron')

geo_j = folium.GeoJson(data=AOI, style_function=lambda x: {'fillColor': 'orange'})

geo_g = folium.GeoJson(data=BEAM0000_sub,marker = folium.CircleMarker(radius = 1, # Radius in metres
                                           weight = 0, #outline weight
                                           fill_color = '#FF0000',
                                           fill_opacity = 1))
Make this Notebook Trusted to load map: File -> Trust Notebook

Now we have this subset of data, we can look at some of the GEDI metrics over our area of interest. The ‘rh100’ metric should give us the top of the canopy heights. What does this look like over Shenandoah National Park? We will look at the ground height and the ‘rh100’ above ground. The DEM metric is in meters (m) and the ‘rh100’ metric is in centimeters (cm) above the ground height. Therefore we must add the ‘rh100’ value to the ground height to get the total height of the tree for display purposes. The ‘rh100’ metric is also converted to meters (m) to normalize the units

latitude = BEAM0000_sub['latitude_lastbin']
rh100 = BEAM0000_sub['rh100']
ground = BEAM0000_sub['digital_elevation_model']

TreeHeight = ground + rh100/100

Finally, we can make a plot of the ground surface and the canopy heights above the ground surface. We can see the forest canopies in green above the topographically complex ground in brown.

plt.figure(figsize=(20, 5))
plt.scatter(latitude, TreeHeight, c='green')
plt.plot(latitude, ground, c='brown')
[<matplotlib.lines.Line2D at 0x7fa48b7eb7d0>]

Now you have this basic structure you can investigate some of the other metrics and GEDI beams to understand more about the data.