{ "cells": [ { "cell_type": "markdown", "id": "c3b82581-05bd-43fa-b8f7-7097fc5ca90f", "metadata": {}, "source": [ "# GEDI S3 Bucket Access at LPDAAC (BETA)\n", "Authors: Alex Mandel (Development Seed), Brian Freitag (NASA MSFC), Jamison French (Development Seed)\n", "\n", "Description: In this tutorial, we demonstrate how to use transform HTTPS links into their corresponding S3 links to retrieve GEDI data hosted by the Land Processes Distributed Active Archive Center (LP DAAC).\n", "\n", "***This tutorial demonstrates a temporary workaround with the expectation that direct access links for LPDAAC GEDI data will eventually be available through NASA CMR***." ] }, { "cell_type": "markdown", "id": "550185b0-4eb0-4d6b-8b3c-23e15551ccc2", "metadata": {}, "source": [ "## Run This Notebook\n", "To access and run this tutorial within MAAP's Algorithm Development Environment (ADE), please refer to the [\"Getting started with the MAAP\"](https://docs.maap-project.org/en/latest/getting_started/getting_started.html) section of our documentation.\n", "\n", "Disclaimer: it is highly recommended to run a tutorial within MAAP's ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors." ] }, { "cell_type": "markdown", "id": "71cfb64c-7b13-404d-a1c2-c559953bf5db", "metadata": {}, "source": [ "## Additional Resources\n", "- [Searching Granules in CMR](../search/granules.ipynb)\n", "- [Searching Collections in CMR](../search/granules.ipynb)" ] }, { "cell_type": "markdown", "id": "5b2ea4b0-afc3-4955-8662-779be27374fe", "metadata": {}, "source": [ "## Importing Packages\n", "If the packages below are not installed already, uncomment the following cell" ] }, { "cell_type": "code", "execution_count": null, "id": "0a3e255a-40e1-425c-8e16-c637b8600708", "metadata": { "tags": [] }, "outputs": [], "source": [ "# %pip install h5py fsspec s3fs --quiet" ] }, { "cell_type": "code", "execution_count": 2, "id": "d1cb3d61-8aa0-4b3e-b865-24519a6eb315", "metadata": { "tags": [] }, "outputs": [], "source": [ "import h5py\n", "import boto3\n", "import botocore\n", "import fsspec\n", "from maap.maap import MAAP\n", "\n", "maap = MAAP(maap_host=\"api.maap-project.org\")" ] }, { "cell_type": "markdown", "id": "be5e4cad-4140-4a04-9eca-78f7cafe3fa0", "metadata": {}, "source": [ "## Searching the Data\n", "\n", "We'll start by gathering a sample list of granules from the GEDI L2A collection. The HTTPS links we're after are nested within the granule object." ] }, { "cell_type": "code", "execution_count": 3, "id": "9a54ac1a-05d8-4e38-b42c-01b4853d467e", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "https://e4ftl01.cr.usgs.gov//GEDI_L1_L2/GEDI/GEDI02_A.002/2019.04.18/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5\n" ] } ], "source": [ "results = maap.searchGranule(\n", " concept_id=\"C1908348134-LPDAAC_ECS\", # GEDI-L2A\n", " cmr_host=\"cmr.earthdata.nasa.gov\",\n", " limit=10,\n", ")\n", "\n", "# Download URL of GEDI L2A product\n", "print(results[0].getDownloadUrl())" ] }, { "cell_type": "markdown", "id": "f137bfc2-84c2-4f3f-ae2e-1716deaaaea0", "metadata": {}, "source": [ "## Converting the Paths\n", "We'll create a helper function to handle the link conversions to AWS S3 links." ] }, { "cell_type": "code", "execution_count": 4, "id": "77fbc001-d7ba-4145-b54d-5d996eb0b745", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "'s3://lp-prod-protected/GEDI02_A.002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002/GEDI02_A_2019108002012_O01959_01_T03909_02_003_01_V002.h5'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def lpdaac_gedi_https_to_s3(url):\n", " dir_comps = url.split(\"/\")\n", " return f\"s3://lp-prod-protected/{dir_comps[6]}/{dir_comps[8].strip('.h5')}/{dir_comps[8]}\"\n", "\n", "\n", "# Sample\n", "lpdaac_gedi_https_to_s3(results[0].getDownloadUrl())" ] }, { "cell_type": "markdown", "id": "3dfe7c62-4031-4437-9faa-d6c99f5a1e76", "metadata": {}, "source": [ "## Accessing the Data\n", "We'll fetch temporary S3 credentials for LPDAAC data and then view the BEAM groups of the first GEDI link in our results." ] }, { "cell_type": "code", "execution_count": 5, "id": "a7fc175b-2694-4fb4-aeba-48674b5f57de", "metadata": { "tags": [] }, "outputs": [], "source": [ "credentials = maap.aws.earthdata_s3_credentials(\n", " 'https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials'\n", ")\n", "\n", "s3 = fsspec.filesystem(\n", " \"s3\",\n", " key=credentials['accessKeyId'],\n", " secret=credentials['secretAccessKey'],\n", " token=credentials['sessionToken']\n", ")" ] }, { "cell_type": "code", "execution_count": 6, "id": "789932d7-05ba-4fa2-a356-7a52ca270b93", "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "with s3.open(lpdaac_gedi_https_to_s3(results[0]._location), \"rb\") as f:\n", " gedi_data = h5py.File(f, \"r\")\n", " print(gedi_data.keys())" ] }, { "cell_type": "code", "execution_count": null, "id": "0cdfb29b-b2ca-4f7b-9feb-442b0d9b5733", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 5 }