Adding Cloud-Optimized GeoTIFFs to the MAAP Biomass Earthdata Dashboard

Author(s): Aimee Barciauskas (Development Seed)

Date: Oct 14, 2021

Description: The following notebook steps through how to add a dataset to the MAAP Dashboard.

Note, there are 2 scenarios:

  1. Adding a single Cloud-Optimized GeoTIFF (COG), and

  2. Adding many distinct COGs as a “mosaic” with mosaicJSON.

At a high level, the steps are:

  1. Inspect your Cloud-Optimized GeoTIFF(s) (COGs) to determine suitable rescale and colormap_name parameters. Optionally create a mosaic.

  2. Define a colormap. Colormaps provide mappings of data values to RGB values.

  3. Create a PR to the datasets repo to add or update your dataset.

The MAAP dashboard has 3 environments:

  1. Developer-in-test (DIT): https://biomass.dit.maap-project.org

  2. Staging: https://biomass.staging.maap-project.org

  3. Production: https://biomass.maap-project.org

These instructions will guide you towards adding your dataset to biomass.dit.maap-project.org. The MAAP Dashboard team will “promote” changes to staging and production periodically (release schedule forthcoming).

Run This Notebook

To access and run this tutorial within MAAP’s Algorithm Development Environment (ADE), please refer to the “Getting started with the MAAP” section of our documentation.

Disclaimer: it is highly recommended to run a tutorial within MAAP’s ADE, which already includes packages specific to MAAP, such as maap-py. Running the tutorial outside of the MAAP ADE may lead to errors.

Importing and Installing Packages

To run this notebook, you’ll need the following packages:

  - rasterio

  - rio-cogeo

  - requests

  - cogeo-mosaic

  - folium

If the packages below are not installed already, uncomment the following cell:

[1]:
# %pip install rasterio
# %pip install rio-cogeo
# %pip install requests
# %pip install cogeo-mosaic
# %pip install folium
[2]:
import glob
import json
import os
import matplotlib

import requests
from pprint import pprint
from cogeo_mosaic.mosaic import MosaicJSON
from cogeo_mosaic.backends import MosaicBackend
from folium import Map, TileLayer, WmsTileLayer

titiler_endpoint = "https://titiler.maap-project.org"

Step 1: Inspect Cloud-Optimized GeoTIFF(s)

In this step, we ensure that our data is valid, accessible, and looks as expected. We’ve included a helper function to translate MAAP ADE local paths to their respective S3 urls.

Accessing Files

[3]:
# Keep the trailing slash: later cells append file names directly to this path
project_dir = "/projects/shared-buckets/<your_name>/<project_dir>/"

# e.g.
project_dir = "/projects/shared-buckets/alexdevseed/landsat8/viz/"
[4]:
# Search for files to include, use recursive if nested folders (common in DPS output)
files = glob.glob(os.path.join(project_dir, "Landsat8*.tif"), recursive=False)
files = [os.path.basename(f) for f in files]
pprint(files[:10])

# Use the first product
_tif = files[0]


# Helper function
def local_to_s3(url):
    """A Function to convert local paths to s3 urls"""
    return url.replace("/projects/shared-buckets", "s3://maap-ops-workspace/shared")
['Landsat8_30542_comp_cog_2015-2020_dps.tif',
 'Landsat8_30543_comp_cog_2015-2020_dps.tif',
 'Landsat8_30822_comp_cog_2015-2020_dps.tif',
 'Landsat8_30823_comp_cog_2015-2020_dps.tif']
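
The cells that follow build S3 URLs by concatenating `local_to_s3(project_dir)` with a file name, which only yields a valid URL when `project_dir` ends with a slash. The sketch below shows one way to make that join tolerant of either form. `local_file_to_s3` is a hypothetical helper name introduced here for illustration; it is not part of this notebook or of maap-py.

```python
import posixpath

LOCAL_PREFIX = "/projects/shared-buckets"
S3_PREFIX = "s3://maap-ops-workspace/shared"


def local_file_to_s3(directory, filename):
    """Hypothetical helper: map an ADE directory plus a file name to an S3 URL."""
    # posixpath.join inserts exactly one "/" whether or not `directory` ends with one
    local_path = posixpath.join(directory, filename)
    # Swap only the leading shared-bucket prefix for its S3 equivalent
    return local_path.replace(LOCAL_PREFIX, S3_PREFIX, 1)


name = "Landsat8_30542_comp_cog_2015-2020_dps.tif"
with_slash = local_file_to_s3("/projects/shared-buckets/alexdevseed/landsat8/viz/", name)
without_slash = local_file_to_s3("/projects/shared-buckets/alexdevseed/landsat8/viz", name)
print(with_slash)
print(with_slash == without_slash)
```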
[5]:
%%bash -s "$project_dir" "$_tif"
rio cogeo validate $1/$2
/projects/shared-buckets/alexdevseed/landsat8/viz/Landsat8_30542_comp_cog_2015-2020_dps.tif is a valid cloud optimized GeoTIFF
[6]:
# Getting COG information
cog_info = requests.get(
    f"{titiler_endpoint}/cog/info",
    params={
        "url": f"{local_to_s3(project_dir)}{_tif}",
    },
).json()

bounds = cog_info["bounds"]
pprint(cog_info)
{'band_descriptions': [['b1', 'Blue'],
                       ['b2', 'Green'],
                       ['b3', 'Red'],
                       ['b4', 'NIR'],
                       ['b5', 'SWIR'],
                       ['b6', 'NDVI'],
                       ['b7', 'SAVI'],
                       ['b8', 'MSAVI'],
                       ['b9', 'NDMI'],
                       ['b10', 'EVI'],
                       ['b11', 'NBR'],
                       ['b12', 'NBR2'],
                       ['b13', 'TCB'],
                       ['b14', 'TCG'],
                       ['b15', 'TCW'],
                       ['b16', 'ValidMask'],
                       ['b17', 'Xgeo'],
                       ['b18', 'Ygeo']],
 'band_metadata': [['b1', {}],
                   ['b2', {}],
                   ['b3', {}],
                   ['b4', {}],
                   ['b5', {}],
                   ['b6', {}],
                   ['b7', {}],
                   ['b8', {}],
                   ['b9', {}],
                   ['b10', {}],
                   ['b11', {}],
                   ['b12', {}],
                   ['b13', {}],
                   ['b14', {}],
                   ['b15', {}],
                   ['b16', {}],
                   ['b17', {}],
                   ['b18', {}]],
 'bounds': [-117.10749852280769,
            50.78795362739066,
            -116.50936927974429,
            51.16389512140189],
 'colorinterp': ['gray',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined',
                 'undefined'],
 'count': 18,
 'driver': 'GTiff',
 'dtype': 'float32',
 'height': 1000,
 'maxzoom': 12,
 'minzoom': 9,
 'nodata_type': 'None',
 'overviews': [2, 4],
 'width': 1000}
[7]:
# Getting band information
cog_stats = requests.get(
    f"{titiler_endpoint}/cog/statistics",
    params={
        "url": f"{local_to_s3(project_dir)}{_tif}",
    },
).json()
pprint(cog_stats["b1"])
{'count': 1000000.0,
 'histogram': [[128480.0,
                807237.0,
                53374.0,
                7274.0,
                2056.0,
                743.0,
                370.0,
                299.0,
                129.0,
                38.0],
               [0.0,
                4814.5,
                9629.0,
                14443.5,
                19258.0,
                24072.5,
                28887.0,
                33701.5,
                38516.0,
                43330.5,
                48145.0]],
 'majority': 0.0,
 'masked_pixels': 0.0,
 'max': 48145.0,
 'mean': 7467.137024,
 'median': 7989.0,
 'min': 0.0,
 'minority': 11663.0,
 'percentile_2': 0.0,
 'percentile_98': 12191.0,
 'std': 2803.86812414882,
 'sum': 7467137024.0,
 'unique': 20318.0,
 'valid_percent': 100.0,
 'valid_pixels': 1000000.0}

Create Parameters for the TiTiler

These parameters will be passed to titiler_endpoint for visualization.

Note the values below: We’re setting the rescale equal to the selected band’s min,max values and selecting the gist_earth_r colormap. You should modify the colormap_name as makes sense for your dataset. This notebook includes a section on what colormaps are available and how to configure different types of colormaps and legends.

[8]:
band = "b1"
bidx = 1
rescale = f"{cog_stats[band]['min']},{cog_stats[band]['max']}"

params = {
    "tile_format": "png",
    "tile_scale": "1",
    "TileMatrixSetId": "WebMercatorQuad",
    "url": f"{local_to_s3(project_dir)}{_tif}",
    "bidx": bidx,  # Select which band to use
    "resampling": "nearest",
    "rescale": rescale,
    "return_mask": "true",
    "colormap_name": "gist_earth_r",
}
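
In the b1 statistics printed earlier, roughly 99% of pixels fall in the first three histogram bins (below 14443.5), while the maximum is 48145.0, so a min,max rescale leaves most of the image in a narrow slice of the colormap. As an alternative worth trying, the sketch below builds the rescale string from the 2nd and 98th percentiles instead. `rescale_from_stats` is a hypothetical helper, and `sample_stats` simply copies a few values from the output above.

```python
# Values copied from the b1 statistics output shown earlier in this notebook
sample_stats = {
    "min": 0.0,
    "max": 48145.0,
    "percentile_2": 0.0,
    "percentile_98": 12191.0,
}


def rescale_from_stats(stats, low_key="percentile_2", high_key="percentile_98"):
    """Hypothetical helper: format a 'low,high' rescale string from band statistics."""
    return f"{stats[low_key]},{stats[high_key]}"


full_range = rescale_from_stats(sample_stats, "min", "max")
clipped = rescale_from_stats(sample_stats)
print(full_range)  # same value the cell above computes
print(clipped)     # percentile-based alternative
```

Either string can be assigned to the `rescale` entry of `params`; which one looks better depends on the dataset.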

Scenario 1: Adding a Single COG

Upload File

Only use the following steps if you only have one COG to share to the dashboard. If you want to create a mosaic from multiple COGs, skip Scenario 1 and go to Scenario 2.

If you haven’t already, upload the file to S3 and make note of the location. In this tutorial, we’re using a landsat8/visualization TIF already in S3 for the url parameter value.

Test COG with TiTiler and Folium

[9]:
response = requests.get(f"{titiler_endpoint}/cog/tilejson.json", params=params).json()
[10]:
m = Map(
    tiles="OpenStreetMap",
    location=((bounds[1] + bounds[3]) / 2, (bounds[0] + bounds[2]) / 2),
    zoom_start=cog_info["minzoom"],
)

tiles = TileLayer(tiles=response["tiles"][0], opacity=1, attr="USGS")

tiles.add_to(m)
m

Scenario 2: Adding Data from Multiple COGs by Creating a Mosaic

Many datasets consist of many tiles distributed spatially over the globe. To visualize them all together, we can use mosaicJSON to create a mosaic for the dynamic tiler API. The dynamic tiler reads this mosaicJSON and, for each requested zoom, x, and y coordinate, selects which of the spatially distinct COGs to render.

[11]:
tiles = [f"{local_to_s3(project_dir)}{file}" for file in files]
[12]:
mosaicdata = MosaicJSON.from_urls(tiles, minzoom=1, maxzoom=16)
mosaicdata
[12]:
MosaicJSON(mosaicjson='0.0.3', name=None, description=None, version='1.0.0', attribution=None, minzoom=1, maxzoom=16, quadkey_zoom=1, bounds=(-117.19773367251135, 50.19386902261471, -116.26013039328576, 51.16389512140189), center=(-116.72893203289856, 50.67888207200831, 1), tiles={'0': ['s3://maap-ops-workspace/shared/alexdevseed/landsat8/viz/Landsat8_30542_comp_cog_2015-2020_dps.tif', 's3://maap-ops-workspace/shared/alexdevseed/landsat8/viz/Landsat8_30543_comp_cog_2015-2020_dps.tif', 's3://maap-ops-workspace/shared/alexdevseed/landsat8/viz/Landsat8_30822_comp_cog_2015-2020_dps.tif', 's3://maap-ops-workspace/shared/alexdevseed/landsat8/viz/Landsat8_30823_comp_cog_2015-2020_dps.tif']}, tilematrixset=None, asset_type=None, asset_prefix=None, data_type=None, colormap=None, layers=None)
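
The `tiles` entry in the output above is keyed by quadkey (here `'0'`, at `quadkey_zoom=1`). To illustrate how a tile request can be matched to such a key, here is a sketch of the standard Web Mercator tile and quadkey arithmetic. It is not cogeo-mosaic's own implementation, and the function names are made up for this example.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) Web Mercator tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_to_quadkey(x, y, zoom):
    """Interleave the bits of x and y into a base-4 quadkey string."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


# Mosaic center and quadkey_zoom taken from the MosaicJSON output above
center_lon, center_lat = -116.72893203289856, 50.67888207200831
quadkey_zoom = 1
x, y = lonlat_to_tile(center_lon, center_lat, quadkey_zoom)
quadkey = tile_to_quadkey(x, y, quadkey_zoom)
print(x, y, quadkey)
```

The resulting quadkey matches the single key in the mosaic's `tiles` dictionary, which is why all four COGs are listed under it.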

Using MosaicJSON with TiTiler

There are 2 options for using mosaicJSON with titiler:

  1. (Preferred) Post mosaicJSON to titiler mosaics endpoint and use the mosaicjson/mosaics endpoint for dynamic tiling.

  2. Upload mosaicJSON to S3 and pass the S3 url to the titiler mosaicjson/tiles endpoint.

Post MosaicJSON to TiTiler

[13]:
mosaic_links = requests.post(
    url=f"{titiler_endpoint}/mosaics",
    headers={
        "Content-Type": "application/vnd.titiler.mosaicjson+json",
    },
    json=mosaicdata.model_dump(exclude_none=True),
).json()

pprint(mosaic_links)

mosaic_id = mosaic_links["id"]
{'id': '1cecf064-1f2c-4adf-b7b3-7121fdc8f97d',
 'links': [{'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d',
            'rel': 'self',
            'title': 'Self',
            'type': 'application/json'},
           {'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d/mosaicjson',
            'rel': 'mosaicjson',
            'title': 'MosaicJSON',
            'type': 'application/json'},
           {'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d/tilejson.json',
            'rel': 'tilejson',
            'title': 'TileJSON',
            'type': 'application/json'},
           {'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d/tiles/{z}/{x}/{y}',
            'rel': 'tiles',
            'title': 'Tiles',
            'type': 'application/json'},
           {'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d/WMTSCapabilities.xml',
            'rel': 'wmts',
            'title': 'WMTS',
            'type': 'application/json'}]}
[14]:
tilejson_endpoint = list(
    filter(lambda x: x.get("rel") == "tilejson", dict(mosaic_links)["links"])
)
tilejson_endpoint
[14]:
[{'href': 'https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d/tilejson.json',
  'rel': 'tilejson',
  'type': 'application/json',
  'title': 'TileJSON'}]
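
The filter above returns a list, so later cells index it with `[0]`. As a small alternative sketch, the hypothetical helper below returns the `href` for a given `rel` directly and fails loudly when no such link exists. `sample_links` is an abbreviated copy of the response printed above.

```python
def link_href(links, rel):
    """Hypothetical helper: return the href of the first link with the given rel."""
    for link in links:
        if link.get("rel") == rel:
            return link["href"]
    raise KeyError(f"no link with rel={rel!r}")


# Abbreviated copy of the 'links' list from the mosaic response above
base = "https://titiler.maap-project.org/mosaics/1cecf064-1f2c-4adf-b7b3-7121fdc8f97d"
sample_links = [
    {"href": base, "rel": "self"},
    {"href": f"{base}/mosaicjson", "rel": "mosaicjson"},
    {"href": f"{base}/tilejson.json", "rel": "tilejson"},
]

tilejson_url = link_href(sample_links, "tilejson")
print(tilejson_url)
```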

Test Mosaic with TiTiler and Folium

[15]:
params = {
    "tile_format": "png",
    "bidx": bidx,
    "resampling": "nearest",
    "rescale": rescale,
    "return_mask": "true",
    "colormap_name": "viridis",
    "pixel_selection": "first",
}

r_te = requests.get(tilejson_endpoint[0]["href"], params=params).json()

tiles = TileLayer(tiles=f"{r_te['tiles'][0]}", opacity=1, attr="USGS")

tiles.add_to(m)
m

Step 2: Define a Color Map

By default, the image will be displayed in greyscale if no colormap_name parameter is passed to the titiler API. Guidance below is provided to help determine what a valid colormap_name might be and how to create a legend for the dashboard.

Dashboard ColorRamps & Legends

When using the dashboard, there are 2 components for implementing a color scheme for your map: the map render and the legend.

TiTiler, which renders the Cloud-Optimized GeoTIFFs (COGs), accepts any color scheme from the Python matplotlib library as well as custom color formulas.

Available colormap_name values for titiler: above, accent, accent_r, afmhot, afmhot_r, autumn, autumn_r, binary, binary_r, blues, blues_r, bone, bone_r, brbg, brbg_r, brg, brg_r, bugn, bugn_r, bupu, bupu_r, bwr, bwr_r, cfastie, cividis, cividis_r, cmrmap, cmrmap_r, cool, cool_r, coolwarm, coolwarm_r, copper, copper_r, cubehelix, cubehelix_r, dark2, dark2_r, flag, flag_r, gist_earth, gist_earth_r, gist_gray, gist_gray_r, gist_heat, gist_heat_r, gist_ncar, gist_ncar_r, gist_rainbow, gist_rainbow_r, gist_stern, gist_stern_r, gist_yarg, gist_yarg_r, gnbu, gnbu_r, gnuplot, gnuplot2, gnuplot2_r, gnuplot_r, gray, gray_r, greens, greens_r, greys, greys_r, hot, hot_r, hsv, hsv_r, inferno, inferno_r, jet, jet_r, magma, magma_r, nipy_spectral, nipy_spectral_r, ocean, ocean_r, oranges, oranges_r, orrd, orrd_r, paired, paired_r, pastel1, pastel1_r, pastel2, pastel2_r, pink, pink_r, piyg, piyg_r, plasma, plasma_r, prgn, prgn_r, prism, prism_r, pubu, pubu_r, pubugn, pubugn_r, puor, puor_r, purd, purd_r, purples, purples_r, rainbow, rainbow_r, rdbu, rdbu_r, rdgy, rdgy_r, rdpu, rdpu_r, rdylbu, rdylbu_r, rdylgn, rdylgn_r, reds, reds_r, rplumbo, schwarzwald, seismic, seismic_r, set1, set1_r, set2, set2_r, set3, set3_r, spectral, spectral_r, spring, spring_r, summer, summer_r, tab10, tab10_r, tab20, tab20_r, tab20b, tab20b_r, tab20c, tab20c_r, terrain, terrain_r, twilight, twilight_r, twilight_shifted, twilight_shifted_r, viridis, viridis_r, winter, winter_r, wistia, wistia_r, ylgn, ylgn_r, ylgnbu, ylgnbu_r, ylorbr, ylorbr_r, ylorrd, ylorrd_r

Example 1: Class Based Known Colors

In this example, the raster represents classes of forest with 11 possible values. There are specific colors selected to correspond to each class. We combine the list of colors and the list of classes and format them for the legend parameter the dashboard needs.

https://github.com/MAAP-Project/dashboard-datasets-maap/blob/main/datasets/taiga-forest-classification.json

[16]:
colors = [
    "#5255A3",
    "#1796A3",
    "#FDBF6F",
    "#FF7F00",
    "#FFFFBF",
    "#D9EF8B",
    "#91CF60",
    "#1A9850",
    "#C4C4C4",
    "#FF0000",
    "#0000FF",
]
labels = [
    "Sparse & Uniform",
    "Sparse & Diffuse-gradual",
    "Sparse & Diffuse-rapid",
    "Sparse & Abrupt ",
    "Open & Uniform ",
    "Open & Diffuse-gradual",
    "Open & Diffuse-rapid",
    "Open & Abrupt",
    "Intermediate & Closed",
    "Non-forest edge (dry)",
    "Non-forest edge (wet)",
]

legend = [dict(color=colors[i], label=labels[i]) for i in range(0, len(colors))]
print(json.dumps(legend, indent=2))

# Copy and Paste the output below to your dashboard config.
[
  {
    "color": "#5255A3",
    "label": "Sparse & Uniform"
  },
  {
    "color": "#1796A3",
    "label": "Sparse & Diffuse-gradual"
  },
  {
    "color": "#FDBF6F",
    "label": "Sparse & Diffuse-rapid"
  },
  {
    "color": "#FF7F00",
    "label": "Sparse & Abrupt "
  },
  {
    "color": "#FFFFBF",
    "label": "Open & Uniform "
  },
  {
    "color": "#D9EF8B",
    "label": "Open & Diffuse-gradual"
  },
  {
    "color": "#91CF60",
    "label": "Open & Diffuse-rapid"
  },
  {
    "color": "#1A9850",
    "label": "Open & Abrupt"
  },
  {
    "color": "#C4C4C4",
    "label": "Intermediate & Closed"
  },
  {
    "color": "#FF0000",
    "label": "Non-forest edge (dry)"
  },
  {
    "color": "#0000FF",
    "label": "Non-forest edge (wet)"
  }
]

Example 2: Discrete ColorRamp

In this example, the range of values is known, but the color scale has many non-sequential colors. Starting with the premade color list, we create a continuous color ramp that uses the known colors as stop points. We then split the ramp into 12 discrete colors, an arbitrary number of breaks that looked decent in the dashboard legend, and combine the list of values and colors into the correct JSON syntax.

https://github.com/MAAP-Project/dashboard-datasets-maap/blob/main/datasets/ATL08.json

[17]:
forest_ht = matplotlib.colors.LinearSegmentedColormap.from_list(
    "forest_ht",
    [
        "#636363",
        "#FC8D59",
        "#FEE08B",
        "#FFFFBF",
        "#D9EF8B",
        "#91CF60",
        "#1A9850",
        "#005A32",
    ],
    12,
)
cols = [matplotlib.colors.to_hex(forest_ht(i)) for i in range(forest_ht.N)]

cats = range(0, 25, (25 // len(cols)))
legend = [[cats[i], cols[i]] for i in range(0, len(cols))]
text = json.dumps(legend, separators=(",", ": "))

print(text.replace("],[", "],\n["))

# Copy and Paste the output below to your dashboard config.
[[0,"#636363"],
[2,"#c47e5d"],
[4,"#fda467"],
[6,"#fed886"],
[8,"#fff1a7"],
[10,"#f8fcb6"],
[12,"#e0f294"],
[14,"#b8e077"],
[16,"#86ca5f"],
[18,"#3aa754"],
[20,"#118145"],
[22,"#005a32"]]
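
To make the "known colors as stop points" idea concrete, the sketch below samples a ramp by plain linear interpolation between evenly spaced hex stops, using only the standard library. It illustrates the technique and is not matplotlib's implementation; the function names are invented for this example. For the stops used above, the values checked by hand (the first three and the last) agree with the matplotlib output.

```python
def hex_to_rgb(hex_color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints in 0..255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of floats in 0..255 to lowercase '#rrggbb'."""
    return "#" + "".join(f"{round(c):02x}" for c in rgb)


def sample_ramp(stops, n):
    """Sample n evenly spaced colors from a ramp through evenly spaced stops."""
    rgb_stops = [hex_to_rgb(s) for s in stops]
    segments = len(rgb_stops) - 1
    colors = []
    for i in range(n):
        pos = i / (n - 1) * segments      # position measured in stop intervals
        idx = min(int(pos), segments - 1)  # clamp so the last sample uses the final segment
        frac = pos - idx
        lo, hi = rgb_stops[idx], rgb_stops[idx + 1]
        colors.append(rgb_to_hex(tuple(a + (b - a) * frac for a, b in zip(lo, hi))))
    return colors


stops = ["#636363", "#FC8D59", "#FEE08B", "#FFFFBF",
         "#D9EF8B", "#91CF60", "#1A9850", "#005A32"]
ramp = sample_ramp(stops, 12)
print(ramp)
```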

Example 3: Continuous ColorRamp

In this example, we are using a built-in ColorRamp from matplotlib, so we just need to extract enough colors to fill the legend adequately and convert the colors to hex codes.

https://github.com/MAAP-Project/dashboard-datasets-maap/blob/main/datasets/topo.json

[18]:
cmap_name = "gist_earth_r"
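# matplotlib.cm.get_cmap was deprecated in matplotlib 3.7 and removed in 3.9;
# the colormap registry (matplotlib >= 3.6) gives the same 12-color resampling.
cmap = matplotlib.colormaps[cmap_name].resampled(12)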
cols = [matplotlib.colors.to_hex(cmap(i)) for i in range(cmap.N)]
print(cols)

# Copy and Paste the output below to your dashboard config.
['#fdfbfb', '#e3c3b5', '#c9a87a', '#bab060', '#9db059', '#76a652', '#45994a', '#3a8c66', '#2e7c7f', '#1f567b', '#0f2577', '#000000']

Step 3: Create and Submit Pull Request to Add Dashboard Dataset

[19]:
# This example is for a continuous color ramp
dataset_type = "raster"
dataset_id = "paraguay-estimated-biomass"
dataset_name = "Estimated Biomass in Paraguay"

stops = cols
legend_type = "gradient-adjustable"
info = "Estimated biomass within 6km grids."

sample_bidx = 1
sample_band_min = 0
sample_band_max = 4000
parameters = (
    f"colormap_name={cmap_name}&rescale={sample_band_min},{sample_band_max}&bidx={sample_bidx}"
)
[20]:
# Single COG
tiles_link = f"{titiler_endpoint}/cog/tiles/{{z}}/{{x}}/{{y}}.png?url=s3://example-bucket/path/to/object/example.tif&{parameters}"

# Mosaic
mosaic_link = (
    f"{titiler_endpoint}/mosaics/{mosaic_id}/tiles/{{z}}/{{x}}/{{y}}?{parameters}"
)
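
The query string above is assembled by hand. As a sketch of an alternative, `urllib.parse.urlencode` from the standard library can build it from a dictionary, which helps when values might contain characters that need escaping. The `safe` argument here is a choice made for this example so that the S3 URL and the comma in `rescale` stay readable; the bucket path is the same placeholder used above.

```python
from urllib.parse import urlencode

titiler_endpoint = "https://titiler.maap-project.org"

query = urlencode(
    {
        "url": "s3://example-bucket/path/to/object/example.tif",  # placeholder path
        "colormap_name": "gist_earth_r",
        "rescale": "0,4000",
        "bidx": 1,
    },
    safe=":/,",  # leave ':', '/' and ',' unescaped
)

# Doubled braces keep {z}/{x}/{y} as literal template placeholders
tiles_link = f"{titiler_endpoint}/cog/tiles/{{z}}/{{x}}/{{y}}.png?{query}"
print(tiles_link)
```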
[21]:
dataset_dict = {
    "id": dataset_id,
    "name": dataset_name,
    "type": dataset_type,
    "swatch": {"color": "#6976d7", "name": "Moody Blue"},
    "source": {"type": dataset_type, "tiles": [tiles_link]},
    "legend": {
        "type": legend_type,
        "min": sample_band_min,
        "max": sample_band_max,
        "stops": stops,
    },
    "info": info,
}
print(json.dumps(dataset_dict, indent=4))
{
    "id": "paraguay-estimated-biomass",
    "name": "Estimated Biomass in Paraguay",
    "type": "raster",
    "swatch": {
        "color": "#6976d7",
        "name": "Moody Blue"
    },
    "source": {
        "type": "raster",
        "tiles": [
            "https://titiler.maap-project.org/cog/tiles/{z}/{x}/{y}.png?url=s3://example-bucket/path/to/object/example.tif&colormap_name=gist_earth_r&rescale=0,4000&bidx=1"
        ]
    },
    "legend": {
        "type": "gradient-adjustable",
        "min": 0,
        "max": 4000,
        "stops": [
            "#fdfbfb",
            "#e3c3b5",
            "#c9a87a",
            "#bab060",
            "#9db059",
            "#76a652",
            "#45994a",
            "#3a8c66",
            "#2e7c7f",
            "#1f567b",
            "#0f2577",
            "#000000"
        ]
    },
    "info": "Estimated biomass within 6km grids."
}

Clone the Datasets Repository

git clone git@github.com:MAAP-Project/biomass-dashboard-datasets.git
cd biomass-dashboard-datasets
git checkout -b feature/dataset-name
# select and copy json above
echo '<copied_json>' >> datasets/paraguay-estimated-biomass.json
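
Instead of copying and pasting the JSON, the dataset dictionary can be written to a file from Python. The sketch below assumes the file should be named after the dataset `id` inside a `datasets` directory, as in the commands above. `write_dataset` is a hypothetical helper, the dictionary is a shortened stand-in for `dataset_dict`, and a temporary directory stands in for the cloned repository.

```python
import json
import tempfile
from pathlib import Path


def write_dataset(dataset, datasets_dir):
    """Hypothetical helper: write a dataset config to <datasets_dir>/<id>.json."""
    path = Path(datasets_dir) / f"{dataset['id']}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(dataset, indent=4) + "\n")
    return path


# Shortened stand-in for the dataset_dict built in the notebook
sample_dataset = {
    "id": "paraguay-estimated-biomass",
    "name": "Estimated Biomass in Paraguay",
    "type": "raster",
}

# A temporary directory stands in for the cloned repository root
repo_root = tempfile.mkdtemp()
written = write_dataset(sample_dataset, Path(repo_root) / "datasets")
print(written.name)
```

In the notebook, the same call could take `dataset_dict` and the `datasets` directory of the cloned repository.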

Add JSON to Product or Country Pilot

In country_pilots/paraguay/country_pilot.json:

{
    "id": "paraguay",
    "label": "Paraguay",
    //...
    "datasets": [
        {
            "id": "paraguay-forest-mask"
        },
        {
            "id": "paraguay-tree-cover"
        },
        {
            "id": "paraguay-estimated-biomass"
        }
    ]
}

Add Content to summary.html

There should be a summary.html file corresponding to the product or country pilot you are working on, for example: country_pilots/paraguay/summary.html. Add or modify content in that file as appropriate.

Add Dataset(s) to config.yml

In config.yml:

DATASETS:
- paraguay-estimated-biomass.json

Create Pull Request

Once you have added the dataset JSON file and summary content, submit a PR to https://github.com/MAAP-Project/biomass-dashboard-datasets. A member of the data team will review the PR, and once it is merged, your content will appear in biomass.dit.maap-project.org.