GEDI_L4A Subset and Visualize

Authors: Chuck Daniels (DevSeed), Jamison French (DevSeed), Anish Bhusal (UAH), Sumant Jha (MSFC/USRA), Alex Mandel (DevSeed)

Date: November 14, 2022

Description: In this tutorial, we will use a GeoJSON to create an area of interest (AOI) and use it in MAAP’s GEDI Subsetter. We will then visualize the output file created by the subsetter by plotting elevation contours.

Run This Notebook

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the “Getting started with the MAAP” section of our documentation.

Disclaimer: it is highly recommended to run a tutorial within MAAP’s ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.

About the Data

GEDI L4A Footprint Level Aboveground Biomass Density, Version 2

This dataset provides Global Ecosystem Dynamics Investigation (GEDI) Level 4 (L4) data, which has the purpose of providing mean aboveground biomass density (AGBD) and consists of the GEDI_L4A and GEDI_L4B collections. GEDI L4A contains predictions of AGBD and estimates of the prediction standard error. GEDI is attached to the International Space Station (ISS) and collects data globally between 51.6° N and 51.6° S latitudes at the highest resolution and densest sampling of any light detection and ranging (lidar) instrument in orbit to date; specifically, GEDI L4A data has a spatial resolution of 25m. Source: GEDI_L4A Version 2 User Guide

Additional Resources

GEDI_L4A Version 2 Data Set Landing Page
The GEDI Website
Earthdata Search
GEDI Subsetter DOI: 10.5281/zenodo.10019412

[Optional] Install Python Packages

This notebook contains some cells marked as optional, meaning that you can use this notebook without necessarily running such cells.

However, if you do wish to run the optional cells, you must install the following Python packages, which might not already be installed in your environment:

geopandas: for reading your AOI (GeoJson file), as well as for reading the job output (GeoPackage file containing the subset)
contextily: for visually verifying your AOI
backoff: for repeatedly polling the job status (after submission) until the job has been completed (either successfully or not)
folium: for visualizing your data on a Leaflet map
geojsoncontour: for converting your matplotlib contour plots to geojson

[ ]:

# Uncomment the following lines to install these packages if you haven't already.
# !pip install geopandas
# !pip install contextily
# !pip install backoff
# !pip install folium
# !pip install geojsoncontour

A job can be submitted without these packages, but installing them in order to run the optional cells may make it more convenient for you to visually verify both your AOI and the subset output produced by your job.

Obtain Username

[18]:

from maap.maap import MAAP

maap = MAAP(maap_host="api.maap-project.org")
username = maap.profile.account_info()["username"]
username

WARNING:maap.maap:Unable to load config file from source maap.cfg
WARNING:maap.maap:Unable to load config file from source ./maap.cfg
WARNING:maap.maap:Unable to load config file from source /projects/maap.cfg

[18]:

'smk0033'

Define the Area of Interest

You may use either a publicly available GeoJSON file for your AOI, such as those available at geoBoundaries, or you may create a custom GeoJSON file for your AOI. The following 2 subsections cover both cases.

Using a geoBoundary GeoJSON File

If your AOI is a publicly available geoBoundary, you can obtain the URL for the GeoJSON file using the function below. You simply need to supply an ISO3 value and a level. To find the appropriate ISO3 and level values, see the table on the geoBoundaries site.

[19]:

import requests


def get_geo_boundary_url(iso3: str, level: int) -> str:
    response = requests.get(
        f"https://www.geoboundaries.org/api/current/gbOpen/{iso3}/ADM{level}"
    )
    response.raise_for_status()
    return response.json()["gjDownloadURL"]


# If using a geoBoundary, uncomment the following assignment, supply
# appropriate values for `<iso3>` and `<level>`, then run this cell.

# Example (Gabon level 0): get_geo_boundary("GAB", 0)

# aoi = get_geo_boundary_url("<iso3>", <level>)

Using a Custom GeoJSON File

Alternatively, you can make your own GeoJSON file for your AOI and place it within your my-public-bucket folder within the ADE.

Based upon where you place your GeoJSON file under my-public-bucket, you can construct the URL for a job’s aoi input value.

For example, if the relative path of your AOI GeoJSON file under my-public-bucket is path/to/my-aoi.geojson (avoid using whitespace in the path and filename), the URL you would supply as the value of a job’s aoi input would be the following (where {username} is replaced with your username as output from the previous section):

f"https://maap-ops-workspace.s3.amazonaws.com/shared/{username}/path/to/my-aoi.geojson"`

If this is the case, use the cell below.

[20]:

#aoi = f"https://maap-ops-workspace.s3.amazonaws.com/shared/{username}/langtang_np.geojson"

#for your convenience you can use this geoJSON file but if you have your own geojson, use the commented link as example format
aoi = f"https://maap-ops-workspace.s3.amazonaws.com/shared/anisbhsl/langtang_np.geojson"

This example uses the AOI of Gosaikunda Lake region inside Langtang National Park. You can also create your own GeoJSON file for your AOI using sites like geojson.io

Gosaikunda

[Optional] Visually Verify your AOI

If you want to visually verify your AOI before proceeding, you may run the following cell, if you have the geopandas and contextily Python packages installed.

[21]:

try:
    import geopandas as gpd
    import contextily as ctx
except:
    print(
        "If you wish to visually verify your AOI, "
        "you must install the `geopandas` and `contextily` packages."
    )
else:
    aoi_gdf = gpd.read_file(aoi)
    aoi_epsg4326 = aoi_gdf.to_crs(epsg=4326)
    ax = aoi_epsg4326.plot(figsize=(10, 5), alpha=0.3, edgecolor="red")
    ctx.add_basemap(ax, crs=4326)

../../_images/science_GEDI_GEDI_L4A_15_0.png

Submit a Job

When supplying input values for a GEDI subsetting job, to use the default value for a field (where indicated), use a dash ("-") as the input value.

aoi (required): URL to a GeoJSON file representing your area of interest, as explained above.
doi: Digital Object Identifier (DOI) of the GEDI collection to subset, or a logical name representing such a DOI. Valid logical names: L1B, L2A, L2B, L4A
columns: Comma-separated list of column names to include in the output file.
query: Query expression for subsetting the rows in the output file.
limit: Maximum number of GEDI granule data files to download (among those that intersect the specified AOI). (Default: 10000)

It is recommended to use maap-dps-worker-32gb queues when submitting a job with a large aoi.

[22]:

inputs = dict(
    aoi=aoi,
    doi="L4A",
    lat="lat_lowestmode",
    lon="lon_lowestmode",
    beams="coverage",
    columns="agbd, agbd_se, sensitivity, geolocation/sensitivity_a2, elev_lowestmode",
    query="l2_quality_flag == 1 and l4_quality_flag == 1 and sensitivity > 0.95 and `geolocation/sensitivity_a2` > 0.95",
    limit=10,
    temporal="-",
    output="gedi_subset.gpkg"
)

result = maap.submitJob(
    identifier="gedi-subset",
    algo_id="gedi-subset",
    version="0.6.0",
    queue="maap-dps-worker-32gb",
    username=username,
    **inputs,
)

job_id = result.id
job_id or result

[22]:

'72fca5ba-935a-49a2-802f-1dcfd3a5628c'

Get the Job’s Output File

Now that the job has been submitted, we can use the job_id to check the job status until the job has been completed.

[23]:

from urllib.parse import urlparse

def job_status_for(job_id: str) -> str:
    return maap.getJobStatus(job_id)

def job_result_for(job_id: str) -> str:
    return maap.getJobResult(job_id)[0]

def to_job_output_dir(job_result_url: str) -> str:
    return f"/projects/my-private-bucket/{job_result_url.split(f'/{username}/')[1]}"

If you have installed the backoff Python package, running the following cell will automatically repeatedly check your job’s status until the job has been completed. Otherwise, you will have to manually repeatedly rerun the following cell until the output is either 'Succeeded' or 'Failed'.

[24]:

try:
    import backoff
except:
    job_status = job_status_for(job_id)
else:
    # Check job status every 2 minutes
    @backoff.on_predicate(
        backoff.constant,
        lambda status: status not in ["Deleted", "Succeeded", "Failed"],
        interval=120,
    )
    def wait_for_job(job_id: str) -> str:
        return job_status_for(job_id)

    job_status = wait_for_job(job_id)

job_status

INFO:backoff:Backing off wait_for_job(...) for 0.9s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 18.1s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 49.5s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 6.8s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 42.4s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 26.7s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 86.6s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 117.0s (Accepted)
INFO:backoff:Backing off wait_for_job(...) for 17.9s (Running)
INFO:backoff:Backing off wait_for_job(...) for 95.7s (Running)

[24]:

'Succeeded'

[25]:

assert job_status == "Succeeded", (
    job_result_for(job_id)
    if job_status == "Failed"
    else f"Job {job_id} has not yet completed ({job_status}). Rerun the prior cell."
)

output_url = job_result_for(job_id)
output_dir = to_job_output_dir(output_url)
output_file = f"{output_dir}/gedi_subset.gpkg"
print(f"Your subset results are in the file {output_file}")

Your subset results are in the file /projects/my-private-bucket/dps_output/gedi-subset/0.6.0/2023/06/27/20/05/21/764642/gedi_subset.gpkg

[Optional] Visually Verify the Results

If you installed the geopandas Python package, you can visually verify the output file by running the following cell.

[26]:

try:
    import geopandas as gpd
    import matplotlib.pyplot as plt
except:
    print(
        "If you wish to visually verify your output file, "
        "you must install the `geopandas` package."
    )
else:
    gedi_gdf = gpd.read_file(output_file)
    print(gedi_gdf.head())
    sensitivity_colors = plt.cm.get_cmap("viridis_r")
    gedi_gdf.plot(markersize = 0.1)

                                            filename  \
0  GEDI04_A_2020064181434_O06951_02_T04323_02_002...
1  GEDI04_A_2020064181434_O06951_02_T04323_02_002...
2  GEDI04_A_2020064181434_O06951_02_T04323_02_002...
3  GEDI04_A_2020064181434_O06951_02_T04323_02_002...
4  GEDI04_A_2020064181434_O06951_02_T04323_02_002...

   geolocation/sensitivity_a2  sensitivity  elev_lowestmode    agbd_se  \
0                    0.959091     0.959091      3256.992432  11.047880
1                    0.968629     0.968629      3282.666748  11.057747
2                    0.962079     0.962079      3314.994141  11.052886
3                    0.962610     0.962610      3351.686035  11.045983
4                    0.968436     0.968436      3436.938721  11.047858

         agbd                   geometry
0  169.741974  POINT (85.34628 28.06473)
1  235.977890  POINT (85.34667 28.06512)
2  189.141037  POINT (85.34705 28.06550)
3  180.132187  POINT (85.34742 28.06589)
4  191.731232  POINT (85.34817 28.06666)

../../_images/science_GEDI_GEDI_L4A_24_1.png

Generate Contour Lines

Create a lat, lon mesh grid with elevation as a depth parameter. As shown in the plot above, the lines don’t seem smooth. So we can apply linear or ‘cubic` interpolation to smoothen those missing points.

[27]:

geometry = gedi_gdf["geometry"]
elevation=gedi_gdf["elev_lowestmode"]

[28]:

lon = geometry.x
lat = geometry.y

[29]:

import numpy as np

x=np.linspace(min(lon), max(lon), 1000)
y=np.linspace(min(lat), max(lat), 1000)

[30]:

from scipy.interpolate import griddata

x_mesh, y_mesh = np.meshgrid(x,y)

You may experiment with nearest, linear, and cubic interpolation methods to see which gives more smooth results.

[31]:

#grid the elevation
z_mesh = griddata((lon, lat), elevation, (x_mesh, y_mesh), method='linear')

[32]:

colors=['blue','royalblue', 'navy','pink',  'mediumpurple',  'darkorchid',  'plum',  'm', 'mediumvioletred', 'palevioletred', 'crimson',
         'magenta','pink','red','yellow','orange', 'brown','green', 'darkgreen']
levels=len(colors)
contourf = plt.contourf(x_mesh, y_mesh, z_mesh, levels, alpha=0.5,  colors=colors, linestyles='None', vmin=elevation.min(), vmax=elevation.max())

../../_images/science_GEDI_GEDI_L4A_32_0.png

ContourPlot

Now we need to plot this contour into an interactive map for better visualization.

Plot the Contour Lines in Folium

You may need to install geojsoncontour, mapclassify, and folium, if you don’t already have them installed. We need to convert this contourf into geoJSON format.

[33]:

import folium
from folium import plugins
import branca
import geojsoncontour

[34]:

geojson = geojsoncontour.contourf_to_geojson(
    contourf=contourf,
    min_angle_deg=3.0,
    ndigits=5,
    stroke_width=1,
    unit='ft',
    fill_opacity=0.1,
)

[35]:

#create map view
m = folium.Map([lat.mean(), lon.mean()], zoom_start=12, tiles="OpenStreetMap")

folium.GeoJson(
    geojson,
    style_function=lambda x:{
        'color': x['properties']['stroke'],
        'weight':    x['properties']['stroke-width'],
        'fillColor': x['properties']['fill'],
        'opacity':   0.5,
    }
).add_to(m)

cm = branca.colormap.LinearColormap(colors, vmin=elevation.min(), vmax=elevation.max()).to_step(levels)
cm.caption='Elevation (in m)'
m.add_child(cm)

#legend
plugins.Fullscreen(position='topright', force_separate_button=True).add_to(m)

[35]:

<folium.plugins.fullscreen.Fullscreen at 0x7f356182c4c0>

[36]:

[36]:

Make this Notebook Trusted to load map: File -> Trust Notebook

Now you have an interactive visualization of a contour plot.

ContourMAP