Navigating the GEDI Ecosystem: Products & Structures

In this section:

Short descriptions of each GEDI product and its available metrics or derived products.
A breakdown of how the GEDI data products (HDF5, gridded TIFs, and others) are searched, organized, and formatted as hosted by the NASA LP DAAC and ORNL DAAC.

All free and open source GEDI data are published to the NASA DAACs. Documentation found in the EarthData catalog covers theory, methods, calibration, validation, formatting, and related resources. Ancillary data from other remote sensing or spatial datasets are also included within each product to help researchers interpret the resulting GEDI metrics or estimates. Users should check for the latest version of the data product they wish to use, since many products continue to be updated iteratively as research and development progress. Updates may include improved accuracy, additional quality metric parameters, and extended or new results for additional or aggregated time periods. New data products also continue to be published. Information about updates is tracked on the GEDI Mission webpage and in the corresponding LP DAAC, ORNL DAAC, and EarthData catalogs, where all versions are hosted.

Choosing a Data Product to Work With

Earth’s ecosystems are incredibly diverse. The GEDI mission is a first of, hopefully, many spaceborne missions attempting to capture the complexity of vegetation heights and structures across the globe. The challenge lies not only in evaluating the experimental value of vegetation-optimized spaceborne lidar across global ecosystems, but also in the fact that 3-dimensional datasets (airborne, UAV, terrestrial, or other lidar) and suitable field data are not abundantly available across the globe. Iteratively revising the products while navigating the limitations of GEDI’s sampling design, orbital path, and capabilities is a balancing act between the mission’s science requirements and application needs, especially where vegetated areas are sparsely or infrequently surveyed.

Despite these challenges, active development and user feedback continue to improve the data products and their contributions to society. GEDI has been used to calibrate and validate other satellite missions, such as NISAR. It is suitable for validating other datasets and can be used directly as training data for modelling and mapping. Users should carefully consider their application context and the capabilities of GEDI over local areas. This involves weighing the influence of systematic, environmental, and surface qualities alongside the advantages and limitations of each product level.

Spatial extent and observational time period:

First and foremost, the study area must fall between latitudes 51.6° N and 51.6° S. GEDI’s operational periods are April 4, 2019 - March 16, 2023 and April 26, 2024 - present, and the mission is planned to operate through 2030. No data exist for March 16, 2023 - April 26, 2024, when the sensor was in hibernation. Additionally, consider regions or forest types with large coverage gaps due to GEDI’s orbital pattern on the ISS and its sampling design. Footprints are sampled about every 60m along track, and tracks are around 600m apart. Large reductions in spatial coverage could also lead to underrepresentation, especially depending on the impact of any error mitigation techniques applied.
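As a first screening step, the latitude limits and operational periods quoted above can be encoded in a small check. This is only a sketch: the function and constant names are hypothetical, and the open-ended second period assumes the instrument is still operating on the date being queried.

```python
from datetime import date

# Values copied from the text above. The second period is left open-ended,
# which assumes the instrument is still operating on the queried date.
MAX_ABS_LATITUDE = 51.6
OPERATIONAL_PERIODS = [
    (date(2019, 4, 4), date(2023, 3, 16)),
    (date(2024, 4, 26), None),  # None = ongoing
]


def within_gedi_latitudes(lat_min, lat_max):
    """True if the whole latitude range lies between 51.6 S and 51.6 N."""
    return -MAX_ABS_LATITUDE <= lat_min <= lat_max <= MAX_ABS_LATITUDE


def is_operational(day):
    """True if `day` falls inside one of the operational periods."""
    for start, end in OPERATIONAL_PERIODS:
        if day >= start and (end is None or day <= end):
            return True
    return False
```

A study area spanning latitudes 48 to 55 would fail the first check, and a date such as 2023-09-01 would fail the second because it falls in the hibernation gap.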

Temporal resolution:

GEDI repeat coverage is variable and inexact. It is unlikely that individual footprints will precisely overlap over the same area, though the likelihood increases over time. The sensor is mounted on the International Space Station, whose non-sun-synchronous orbit leaves it without global coverage or fixed orbital paths. Data are collected at different times of day and without a standard schedule, so repeat observations of a specific location are irregular. The number of observations tends to decrease toward the equator.

Spatial resolution:

GEDI’s 25m footprint spatial resolution plays a critical role in its ability to resolve vegetation heights, structure, and biomass. Non-spaceborne lidar systems can typically achieve anywhere from a few centimeters to 1-meter scales. The higher resolution footprint data does not necessarily imply higher accuracy. Several derived products based on the footprint data present valuable products at 100m to 1km scales. These products aggregate the GEDI footprints within each grid cell and infer the average metric value for that cell. This extrapolates information from GEDI to the extent of the grid, but does not provide wall-to-wall coverage and can over-generalize estimates, depending on the user’s objective. To generate wall-to-wall coverage products, GEDI samples have to be combined with other data, a step that requires its own review of best methods for data fusion.
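The aggregation idea can be illustrated with a deliberately simple sketch. It assumes footprint coordinates are already in a projected coordinate system with metre units and takes a plain arithmetic mean per cell; the published gridded products use their own filtering and statistical procedures, so this is not their algorithm. The function name and sample values are made up.

```python
import math
from collections import defaultdict


def grid_mean(points, cell_size):
    """Average footprint values per square grid cell.

    points: iterable of (x, y, value), with x and y in metres.
    cell_size: grid cell edge length in metres (e.g. 1000 for 1 km).
    Returns {(col, row): (mean_value, n_footprints)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        sums[cell] += value
        counts[cell] += 1
    return {cell: (sums[cell] / counts[cell], counts[cell]) for cell in sums}


# Three made-up footprints: two fall in cell (0, 0), one in cell (1, 0).
footprints = [(120.0, 80.0, 10.0), (900.0, 400.0, 20.0), (1500.0, 200.0, 30.0)]
cells = grid_mean(footprints, 1000)
```

Cells that contain no footprints never appear in the result, which is the sense in which a gridded product is not wall-to-wall. The per-cell count is worth keeping, since a mean based on one or two footprints says little about the whole cell.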

Types of GEDI data products

Generally, footprint level products are beneficial for custom and localized data retrievals while higher level products are more readily usable ecological products, each with their own assumptions and processing catered to the original algorithm during product generation and its intended application.

L1B geolocated waveform data may require more advanced expertise in utilizing waveform data, and understanding the influence of certain processing parameters when calculating metrics from the waveform itself.
- Useful for custom canopy profiling, waveform based algorithms, and calibration with other lidar systems.
Footprint level metrics (L2A, L2B, L4A, L4C) may be more susceptible to geolocation error and to inconsistent sample density, either spatially or temporally. When the datasets are retrieved directly, many invalid observations are included, so the user may need to prepare the data for analysis and carry out a preliminary investigation of quality controls suited to the behavior of the data and metric over the specific study area.
- Footprints provide samples that can be used as inputs for wall-to-wall models or calibration and validation at ~25m resolution. These samples are highly customizable and temporally specific. The data are either calculated metrics, indices, or other estimations.
Gridded and higher level derived products have coarser resolutions and may therefore overlook finer disturbances or surface detail. These products are based on the footprint level data and may adopt a strict set of quality filters. Additionally, many of these products are temporally aggregated. The assumptions made when deriving these products may not be appropriate for certain applications.
- Gridded products are readily available for regional, continental, or global trend analysis and some can offer time series inputs.
Derived products (compared to direct calculation of metrics) are higher level products where GEDI is used in tandem with other data to expand upon desired vegetation or height information.
- These products can be useful for rapid mapping and policy reporting for large-scale ecosystem and carbon monitoring.

Errors and uncertainty:

Uncertainty from the footprint level data is largely mitigated in higher level and derived products. Gridding, pixel-based, and modelling errors are also assessed. Decisions about how to handle uncertainty are made for the final published products and relate strongly to the purpose of the dataset (global coverage, generalized estimations, cumulative time period observations). These decisions may not be entirely appropriate for more localized study areas, because of geographic or ecosystem-based assumptions, or where time-specific data are needed. With the right foundational knowledge, the user can test and make the best use of either the footprint or the gridded products, using the uncertainty metrics and processing parameters provided to facilitate evaluation and selection for a given application. Currently, there is no “one-size-fits-all” approach for selecting optimal results globally or locally; such decisions depend heavily on the metric being used and the land cover being observed. Beam type, beam sensitivity, and the expected structural complexity are some of the strongest indicators of valid waveform signal interpretation and adequate ground detection. Additional quality considerations may need to be applied, and combining multiple GEDI products can improve outputs depending on the vegetation type. Integrating robust models (Machine Learning or Deep Learning) or producing uncertainty maps can provide valuable insights for reducing errors in the existing data.

Consider combining GEDI with other lidar sources:

Combining GEDI with other lidar data can be particularly useful for calibrating the waveforms to reduce systematic errors. However, the availability of other lidar sources varies greatly across regions and time periods. Airborne, UAV, and terrestrial lidar are expensive to collect and often produce heavy datasets requiring large storage and computation capacities. It can be difficult to find other lidar sources, especially free and open source ones. Additional skills are required to process these data further if they do not already contain metrics corresponding to GEDI for comparison. This presents a challenge when considering integrating GEDI with these data. Studies on ecosystems using lidar-based methods may allow for at most a 2-year gap between lidar datasets, depending on the topic being studied and the expected spatial and temporal changes over the landscape.

What are the Data Products and How are Each Formatted?

HDF5 File Formats

What is an HDF5 file?

GEDI data is distributed in Hierarchical Data Format version 5 (file extension ‘.h5’). This is a versatile file format designed for storing and organizing large amounts of numerical data. It efficiently handles complex, multi-dimensional datasets like those found in scientific computing, machine learning, and data analysis.

HDF5 files are organized like a file system with a hierarchical structure. Arrays of different types, shapes, and sizes are stored with descriptive metadata.

Groups: Similar to directories/folders, used to organize data
Datasets: The actual data arrays (similar to files)
Attributes: Metadata attached to groups or datasets
Root group: The top-level group (like the root directory)

File naming conventions for the HDF5 product files

The file names for the footprint level datasets (GEDI01_B, GEDI02_A, GEDI02_B, GEDI04_A, GEDI04_C) all follow the same convention.

Example from the L2A user guide:

GEDI02_A_2019108185228_O01971_03_T00922_02_003_01_V002.h5 indicates:

GEDI02_A = Product Short Name
2019108 = Julian Date of Acquisition in YYYYDDD
185228 = Hours, Minutes and Seconds of Acquisition (HHMMSS)
O01971 = O = Orbit, 01971 = Orbit Number
03 = Sub-Orbit Granule Number (1-4)
T00922 = T = Track, 00922 = Track Number
02 = Positioning and Pointing Determination System (PPDS) type (00 is predict, 01 rapid, 02 and higher is final.)
003 = PGE Version Number
01 = Granule Production Version
V002 = LP DAAC Release Number
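The naming convention above can be split into its fields with a regular expression. The sketch below is written against the example name from the L2A user guide only; the pattern, the field names, and the function name are illustrative choices, and other products or releases may need adjustments.

```python
import re

# One named group per field of the naming convention listed above.
GEDI_NAME = re.compile(
    r"^(?P<product>GEDI\d{2}_[A-Z])_"
    r"(?P<date>\d{7})(?P<time>\d{6})_"      # YYYYDDD followed by HHMMSS
    r"O(?P<orbit>\d{5})_"
    r"(?P<sub_orbit>\d{2})_"
    r"T(?P<track>\d{5})_"
    r"(?P<ppds>\d{2})_"
    r"(?P<pge>\d{3})_"
    r"(?P<granule_version>\d{2})_"
    r"V(?P<release>\d{3})\.h5$"
)


def parse_gedi_filename(name):
    """Split a footprint level file name into a dict of string fields."""
    match = GEDI_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised GEDI footprint file name: {name}")
    return match.groupdict()


fields = parse_gedi_filename(
    "GEDI02_A_2019108185228_O01971_03_T00922_02_003_01_V002.h5"
)
```

Fields are kept as strings so that leading zeros survive; convert them to integers only where a numeric comparison is needed.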

Notes on extracting dates:

The date of acquisition is stored in the file name itself. As with any geospatial data, access platforms, APIs, and other tools allow the user to provide explicit spatial and temporal parameters to subset the desired data. While you may know that the files you accessed or downloaded are from the desired time period, the GEDI datasets do not store the date for each shot. Therefore, the user has to extract the date and assign it to each observation to facilitate temporal comparisons when compiling all available data into an analysis ready format.

In all of the GEDI data acquisition options highlighted in this section, the user has to add a step to the workflow that extracts the date for each observation and adds it to the desired format. This typically involves parsing the acquisition date/time information (YYYYDDDHHMMSS) from the file name, converting it from a string to a date (from the day-of-year, or "julian", form to a gregorian month, day, and year), and then separating the date portion from the acquisition time.

For HDF5 files, add a dataset holding the date for each shot observation in the HDF5 file itself, or add an attribute or column while converting the file into other formats such as a dataframe, csv, or shapefile in R or Python, for example.
For Google Earth Engine, both the vector table and monthly raster data require a similar date extraction and conversion process, where the date information is accessed from the system:time_start and system:time_end metadata properties. The monthly raster product, however, does not preserve the date of each individual observation. Instead, it aggregates all observations for each month into a single raster, to which it assigns the given month and year.
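For files downloaded as HDF5, the date step described above can be sketched with the standard library alone. The sketch assumes the file name follows the footprint level convention shown earlier and that it is acceptable to assign the time stamp in the file name to every shot in that file. The record keys (shot_number, rh98) and function names are hypothetical.

```python
from datetime import datetime


def acquisition_datetime(file_name):
    """Return the acquisition time encoded as YYYYDDDHHMMSS in the file name."""
    stamp = file_name.split("_")[2]  # third underscore-separated field
    # %j is the day of year, so strptime handles the conversion to month/day.
    return datetime.strptime(stamp, "%Y%j%H%M%S")


def add_date(rows, file_name):
    """Attach the file's acquisition date (ISO string) to every shot record."""
    day = acquisition_datetime(file_name).date().isoformat()
    return [{**row, "date": day} for row in rows]


name = "GEDI02_A_2019108185228_O01971_03_T00922_02_003_01_V002.h5"
shots = [{"shot_number": 1, "rh98": 12.3}, {"shot_number": 2, "rh98": 4.1}]
dated = add_date(shots, name)
```

Day 108 of 2019 is 18 April 2019, so both records receive the date 2019-04-18. The same function can be applied once per file before the records from many files are concatenated.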

Notes on versioning:

Be sure to choose the latest version of the GEDI datasets, as these are the most updated/improved results, like V002 here. NASA EarthData and the DAACs tend to keep all versions of the products open source and accessible in the search but allow for explicit selection. Detailed updates to the datasets are tracked by the GEDI Mission team on the mission webpage and associated User Guides.

If you are accessing GEDI data from a source not directly connected to the NASA data centers (like from Google Earth Engine data catalog, or other applications or tools serving the data not directly from publication source), there may be lag in updating the data to the latest version.

How are the GEDI HDF5 files organized?

For GEDI, the top of the hierarchy is the product level, such as ‘GEDI01_B’, ‘GEDI02_A’, ‘GEDI03’, ‘GEDI04_A’, etc.

The datasets for the selected product level are organized under the /BEAMXXXX groups. There are 8 beams, and therefore 8 groups of datasets and attributes, with XXXX corresponding to the respective beam number.

Each /BEAMXXXX group contains the core datasets or other relevant variables for the product level. Additional processing, algorithmic, waveform, sensor, model, or ancillary data used to generate the datasets are found within the subgroups of /BEAMXXXX. The “XXXX” indicates that information is separated into “folders” for each of the eight beams: /BEAM1000, /BEAM0001, etc. The science dataset definitions and metadata information are linked in the table.
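The beam layout can be explored with two small helpers that rely only on dictionary-style access (keys() and indexing), which is how opened HDF5 groups are typically exposed in Python. To keep the sketch runnable without any data, a nested dict with made-up values stands in for an open file; the helper names are hypothetical.

```python
import re

BEAM_NAME = re.compile(r"^BEAM[01]{4}$")


def beam_names(h5file):
    """Names of the beam groups found at the top level of an open file."""
    return sorted(name for name in h5file.keys() if BEAM_NAME.match(name))


def get_node(h5file, beam, relative_path):
    """Walk from a beam group down to a subgroup or dataset, one level at a time."""
    node = h5file[beam]
    for part in relative_path.strip("/").split("/"):
        node = node[part]
    return node


# A nested dict stands in for an open HDF5 file; the values are made up.
fake_file = {
    "METADATA": {},
    "BEAM0000": {"geolocation": {"lat_lowestmode": [1.5, 1.6]}},
    "BEAM0101": {"geolocation": {"lat_lowestmode": [2.5]}},
}
beams = beam_names(fake_file)
lats = get_node(fake_file, "BEAM0101", "geolocation/lat_lowestmode")
```

With the h5py package installed, the same helpers should work on an object opened with h5py.File(path, "r"), in which case get_node returns a dataset whose values can be read with [:].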

Example structure:

Root Group/

Group /
- ‘Dataset’
- ‘Dataset’
Group /
- ‘Dataset’
Group /Metadata/
- ‘Dataset’

The hierarchy and organization of each product level served in HDF5 format are shown here. Housed under the highest order Group (e.g. GEDI L1B) are several groups with their datasets and associated attributes. Definitions for each of these can be found in each product’s user guide and SDS Data Dictionary.

Root Group: GEDI L1B /

Group: /METADATA/DatasetIdentification
Group: /BEAMXXXX
Group: /BEAMXXXX/ancillary
Group: /BEAMXXXX/geolocation
Group: /BEAMXXXX/geophys_corr

File size ~7 GB, Number of Science Dataset (SDS) Layers: 92 per beam.

Root Group: GEDI L2A /

Group: /METADATA/DatasetIdentification
Group: /BEAMXXXX
Group: /BEAMXXXX/ancillary
Group: /BEAMXXXX/geolocation
Group: /BEAMXXXX/land_cover_data
Group: /BEAMXXXX/rx_1gaussfit
Group: /BEAMXXXX/rx_1gaussfit/ancillary
Group: /BEAMXXXX/rx_assess
Group: /BEAMXXXX/rx_assess/ancillary
Group: /BEAMXXXX/rx_processing_aN
Group: /BEAMXXXX/rx_processing_aN/ancillary

File size ~5 GB, Number of Science Dataset (SDS) Layers: 530 per beam.

Root Group: GEDI L2B /

Group: /METADATA/DatasetIdentification
Group: /BEAMXXXX
Group: /BEAMXXXX/ancillary
Group: /BEAMXXXX/geolocation
Group: /BEAMXXXX/land_cover_data
Group: /BEAMXXXX/rx_processing

File size ~1 GB, Number of Science Dataset (SDS) Layers: 186 x 8 beams.

Root Group: GEDI L4A /

Group: /METADATA/DatasetIdentification
Group: /BEAMXXXX
Group: /BEAMXXXX/geolocation
Group: /BEAMXXXX/agbd_prediction
Group: /BEAMXXXX/land_cover_data
Compound dataset: /ANCILLARY/model_data
Compound dataset: /ANCILLARY/pft_lut
Compound dataset: /ANCILLARY/region_lut

Root Group: GEDI L4C /

Group: /METADATA/DatasetIdentification
Group: /BEAMXXXX
Group: /BEAMXXXX/geolocation
Group: /BEAMXXXX/land_cover_data
Group: /BEAMXXXX/wsci_prediction

Notably, the shared or inherited datasets per shot may not be housed within the same hierarchical group or subgroup across each product level. For example, /BEAMXXXX/lat_lowestmode, an original dataset from L2A, is inherited in L2B and located by this path /BEAMXXXX/geolocation/lat_lowestmode. Knowing the specific paths identifying the locations of each dataset within each product is important when subsetting each file for the selected variables, especially when trying to grab datasets from multiple products (matching across files) for a given shot. The identifier ‘shot_number’ or the lat/lon of the lowest mode for each shot are good options for matching across product levels, given that higher level products may have removed unqualified shots during development, leading to possible differences in the total number of shots available over a given area and time period between each product.
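Matching across product levels amounts to an inner join on shot_number. The sketch below uses plain dictionaries for the per-shot records. The two lat_lowestmode paths are the ones quoted above, while the other column names (rh98, pai) and the function name are hypothetical.

```python
# Location of the same variable inside a beam group, per product (from the text).
LAT_LOWESTMODE_PATH = {
    "L2A": "lat_lowestmode",
    "L2B": "geolocation/lat_lowestmode",
}


def match_shots(l2a_rows, l2b_rows, key="shot_number"):
    """Inner-join two lists of per-shot records on the shared identifier."""
    l2b_by_key = {row[key]: row for row in l2b_rows}
    matched = []
    for row in l2a_rows:
        other = l2b_by_key.get(row[key])
        if other is not None:  # keep only shots present in both products
            matched.append({**other, **row})
    return matched


l2a = [{"shot_number": 1, "rh98": 20.0}, {"shot_number": 2, "rh98": 5.0}]
l2b = [{"shot_number": 2, "pai": 1.1}, {"shot_number": 3, "pai": 0.2}]
joined = match_shots(l2a, l2b)
```

Only shot 2 appears in both inputs, so it is the only record returned. Comparing the number of matched records with the input sizes is a quick way to see how many shots one product dropped relative to the other.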

Source: Example subset Waveform Structural Complexity Index (WSCI) over the Eastern Amazon (De Conto et al., 2024).

Gridded Product Level Formats

The gridded products GEDI03, GEDI04_B are derived from the footprint level data. L3 is based on the latest versions of the L2 geolocated footprint level profile metrics, and L4B is derived from L4A. Each is in TIF image format.

GEDI Level 3 Gridded Land Surface Metrics

The L3 product is the “gridded mean canopy height, standard deviation of canopy height, mean ground elevation, standard deviation of ground elevation, and counts of laser footprints per 1-km x 1-km grid cells globally…[and] can be used to characterize important carbon and water cycling processes, biodiversity, habitat and can also be of immense value for climate modeling, forest management, snow and glacier monitoring, and the generation of digital elevation models” (Dubayah et al., 2021). These data provide one-time estimates for each variable at 1km spatial resolution from 2019-04-18 to 2023-03-22, to date. See the L3 ATBD for detailed information on how these data were produced.

Source: (Dubayah et al., 2021).

File Naming Convention for L3

Time periods: Each of the 5 variables has a file for each of 5 mission time periods over which footprints were collected to compute the grids, making 25 files for this product. Temporal coverage will be expanded in future versions. To date, these time periods include:

19th through the 223rd mission weeks (2019-04-18 to 2023-03-22)
19th through the 143rd mission weeks (2019-04-18 to 2022-01-19)
19th through the 138th mission weeks (2019-04-18 to 2021-08-04)
19th through the 122nd mission weeks (2019-04-18 to 2021-04-14)
19th through the 96th mission weeks (2019-04-18 to 2020-10-13)

How are the L3 files organized?

GEDI L3 (counts and mean and standard deviation of both elev_lowestmode and rh100)
GEDI03_counts_<start_date>_<end_date>_<release>_<version>.tif
GEDI03_elev_lowestmode_mean_<start_date>_<end_date>_<release>_<version>.tif
GEDI03_elev_lowestmode_stddev_<start_date>_<end_date>_<release>_<version>.tif
GEDI03_rh100_mean_<start_date>_<end_date>_<release>_<version>.tif
GEDI03_rh100_stddev_<start_date>_<end_date>_<release>_<version>.tif
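Because the variable part of an L3 name can itself contain underscores, it is easier to split such a name from the right. The sketch below assumes only the template shown above, treating the last four underscore-separated tokens as start date, end date, release, and version. The example file name is made up to fit the template, and its date format in particular is an assumption.

```python
def parse_l3_filename(name):
    """Split an L3 file name into variable, dates, release, and version."""
    stem = name[: -len(".tif")] if name.endswith(".tif") else name
    parts = stem.split("_")
    if parts[0] != "GEDI03" or len(parts) < 6:
        raise ValueError(f"not a recognised GEDI L3 file name: {name}")
    start, end, release, version = parts[-4:]
    return {
        "variable": "_".join(parts[1:-4]),  # e.g. "rh100_mean" or "counts"
        "start_date": start,
        "end_date": end,
        "release": release,
        "version": version,
    }


# Hypothetical name that follows the template; the date tokens are assumptions.
info = parse_l3_filename("GEDI03_rh100_mean_2019108_2021104_002_02.tif")
```

The date tokens are returned as strings without interpretation, so the function does not depend on how the dates are actually encoded.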

GEDI Level 4B Gridded Above Ground Biomass Density

The L4B product “provides 1 km x 1 km (1 km, hereafter) estimates of mean above ground biomass density (AGBD) based on observations from mission week 19 starting on 2019-04-18 to mission week 223 ending on 2023-03-16.” The gridding procedure uses the hybrid model-based mode of inference, described later in the biomass section of this training. Uncertainty is due to both GEDI’s sampling of the 1 km area (as opposed to making wall-to-wall observations) and the fact that L4A biomass values are modeled and subject to error, in contrast to a measurement process that may be assumed to be error-free (Dubayah et al., 2022).

Source: The GEDI Mission.

File naming convention for L4B

Time periods:

Each of the 10 variables covers the 19th through the 223rd mission weeks (2019-04-18 to 2023-03-22), the period over which footprints were collected to compute the grids.

How are the L4B files organized?

GEDI L4B File	Variable
GEDI04_B_MW019MW223_02_002_02_R01000M_MU.tif	Mean above ground biomass density (MU) including forest and non-forest
GEDI04_B_MW019MW223_02_002_02_R01000M_V1.tif	Variance component 1 (V1): uncertainty in the estimate of mean biomass
GEDI04_B_MW019MW223_02_002_02_R01000M_V2.tif	Variance component 2 (V2): characterising uncertainty by sampling or model method
GEDI04_B_MW019MW223_02_002_02_R01000M_SE.tif	Standard error (SE) of mean AGBD
GEDI04_B_MW019MW223_02_002_02_R01000M_PE.tif	Standard error as a fraction of estimated mean AGBD (PE)
GEDI04_B_MW019MW223_02_002_02_R01000M_NC.tif	Number of clusters (NC): number of ground tracks with quality waveforms in the grid
GEDI04_B_MW019MW223_02_002_02_R01000M_NS.tif	Number of samples (NS): total number of quality waveforms in the grid
GEDI04_B_MW019MW223_02_002_02_R01000M_QF.tif	Quality flag (QF)
GEDI04_B_MW019MW223_02_002_02_R01000M_PS.tif	Prediction stratum (PS): determined by plant functional type and continent.
GEDI04_B_MW019MW223_02_002_02_R01000M_MI.tif	Mode of inference (MI): Indicating which model was applied to the grid.
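When looping over the ten L4B layers, it can help to recover the mission weeks, grid resolution, and layer code from each file name. The sketch below is written against the names in the table above. The three numeric fields in the middle of the name are not explained in this section, so they are passed through unchanged as an opaque string; the function and dictionary names are hypothetical.

```python
import re

L4B_NAME = re.compile(
    r"^GEDI04_B_MW(?P<first_week>\d{3})MW(?P<last_week>\d{3})_"
    r"(?P<version_fields>\d{2}_\d{3}_\d{2})_"   # meaning not covered here
    r"R(?P<resolution_m>\d{5})M_(?P<layer>[A-Z0-9]{2})\.tif$"
)

# Short descriptions taken from the table above.
L4B_LAYERS = {
    "MU": "Mean above ground biomass density",
    "V1": "Variance component 1",
    "V2": "Variance component 2",
    "SE": "Standard error of mean AGBD",
    "PE": "Standard error as a fraction of estimated mean AGBD",
    "NC": "Number of clusters",
    "NS": "Number of samples",
    "QF": "Quality flag",
    "PS": "Prediction stratum",
    "MI": "Mode of inference",
}


def describe_l4b(name):
    """Return mission weeks, resolution, and layer description for an L4B file."""
    match = L4B_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised GEDI L4B file name: {name}")
    layer = match["layer"]
    return {
        "first_week": int(match["first_week"]),
        "last_week": int(match["last_week"]),
        "version_fields": match["version_fields"],
        "resolution_m": int(match["resolution_m"]),
        "layer": layer,
        "description": L4B_LAYERS.get(layer, "unknown layer"),
    }


mu = describe_l4b("GEDI04_B_MW019MW223_02_002_02_R01000M_MU.tif")
```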

Overview and Formatting of Derived Products

Several derived products hosted by the NASA ORNL DAAC facilitate the use of GEDI data with improved accuracy and error estimates, in multiple formats and at multiple spatial scales. Each offers an additional perspective on the utility of GEDI spaceborne lidar: country level analysis, improved height and biomass predictions over particular regions through fusion with InSAR and other spaceborne lidar data, and additional quality assessments.

GEDI L4B Country-level Summaries of Above Ground Biomass

Country level estimates of above ground biomass are compiled for convenience alongside respective uncertainty metrics based on L4B version 2.1. The L4B country level estimates are compared to FAO biomass estimates as a reference, and additionally compared to the L4B version 2 to evaluate results between methods deployed in each version. Differences are captured by new land cover filters applied and extended temporal coverage. These country level summaries found in the companion file “summary_country_estimates_agbd.pdf” help facilitate use of GEDI and comparisons of biomass estimates. This product is particularly useful for users investigating country-level research and analysis in comparison to FAO estimates as opposed to creating the country-level estimates from the L4B or L4A products themselves.

Source: (Armston et al., 2023).

File naming conventions for L4B country level above ground biomass estimates

Time periods:

The country level data are derived from L4B version 2.1, spanning the 2019-04-18 to 2023-03-22 mission time period, while L4B version 2 spans 2019-04-18 to 2021-08-04.

How are the files organized for L4B country level above ground biomass estimates?

Variable in the File	Description
Country	Country name
ISO3	Three-letter country codes defined in ISO 3166-1 international standard
Percent_forest	Percent area of country covered by forest defined by FAO
FAO_Forested_AGBD	FAO country estimate of mean AGBD for forests
FAO_Total_AGBD	FAO country estimates of mean AGBD
GEDI_L4B_Total_AGBD	Country estimates of mean AGBD
GEDI_L4B_AGBD_SE	Standard error of mean AGBD
GEDI_L4B_AGBD_SE_Percent	Standard error of mean AGBD percentage
FAO_AGB	FAO total AGB in petagrams
GEDI_L4B_AGB	Estimates of total AGB in petagrams
GEDI_L4B_AGB_SE	Standard error of total AGB in petagrams

Pantropical Forest Height and Biomass from GEDI and TanDEM-X Data Fusion

Canopy height and biomass maps over Mexico, Gabon, French Guiana, and the Amazon Basin are produced by fusing TanDEM-X InSAR images with GEDI L1B and L2A data. Heights inverted from the TanDEM-X InSAR coherence maps are calibrated using GEDI canopy height as a reference. The fused canopy heights were then used as the basis for generating biomass, using previous GEDI biomass datasets as training data within a generalized hierarchical model framework. Additional data included track uncertainties and forest changes due to disturbance (interpreted from global 2000-2020 Landsat-based land cover and land use change maps (Potapov et al., 2022)).

Source: (Dubayah et al., 2023).

File naming conventions for Pantropical Forest Height and Biomass from GEDI and TanDEM-X Data Fusion

Time periods:

GEDI metrics from 2019-04-18 to 2021-08-18 and TanDEM-X images from 2011-01-06 to 2020-12-31 are used.

How are the files organized for Pantropical Forest Height and Biomass from GEDI and TanDEM-X Data Fusion?

File name	Description
biomass_<loc>_<res>.tif	Mean AGB
biomass_uncertainty_<loc>_<res>.tif	Standard error of mean AGB
height_<loc>_<res>.tif	Forest canopy height
height_uncertainty_<loc>_<res>.tif	Standard error of mean forest canopy height
disturbance_<loc>_<res>.tif	Year for last forest disturbance in 2011-2020. 0 = no disturbance. Values 1 to 10 = year of disturbance - 2010.
model_selection_<loc>_<res>.tif	Model parameter indicating size of analysis window used in calibration. Values 0 to 12 correspond to windows of 2 to 50 km across, in 4-km intervals. Value of 13 indicates a study area-wide window was used.
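The disturbance and model_selection layers store coded integers rather than physical values. The decoding rules stated in the table can be written as two small functions; the function names are hypothetical, and None is an arbitrary choice for "no disturbance" and "study area-wide window".

```python
def disturbance_year(value):
    """Decode the disturbance layer: 0 = no disturbance, 1..10 = 2011..2020."""
    if value == 0:
        return None
    if 1 <= value <= 10:
        return 2010 + value
    raise ValueError(f"unexpected disturbance value: {value}")


def calibration_window_km(value):
    """Decode model_selection: 0..12 = 2..50 km in 4-km steps, 13 = area-wide."""
    if 0 <= value <= 12:
        return 2 + 4 * value
    if value == 13:
        return None  # study area-wide window
    raise ValueError(f"unexpected model_selection value: {value}")
```

For example, a disturbance value of 7 decodes to 2017, and a model_selection value of 12 decodes to a 50 km window.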

Global Vegetation Height Metrics from GEDI and ICESat2

The ICESat-2 spaceborne lidar mission has been in operation since 2018. Combining data from ICESat-2 and GEDI is an opportunity to increase the sampling density of elevation, height, and structure metrics. Combining different remotely sensed data is challenging because of inherent differences in measurement properties and spatial resolution, and because distinct orbital collection patterns affect coverage. Scientists with decades of experience advancing spaceborne lidar technologies have worked to enhance global vegetation height metrics with both missions. This gridded product is an intercalibrated dataset of global vegetation measurements aggregated at multiple resolutions (100, 200, 500, 1000m) to improve estimates by increasing spatial coverage and reducing the effect of geolocation errors. ICESat-2 data from the L3A ATL08 Land and Vegetation Height product from 2019, 2020, and 2021 (April-October) at 100m grids were intercalibrated with GEDI’s 25m relative height metrics (RH50, 75, 90, and 98). The GEDI data from 2019-2022 were filtered based on the quality and degrade flags, number of detected modes, a sensitivity threshold of 0.92 for tropical and 0.97 for non-tropical regions, reference elevation differences <75m, and use of power beams only. This filtering, among other methods, helped reduce the systematic bias observed between the two lidar systems. Gradient boosted tree models were developed for different land covers to relate the 100m ICESat-2 waveform to overlapping GEDI RH metrics. The Copernicus Landcover classification (Buchhorn et al., 2020) was used to create specific models for:

shrubs, herbaceous, moss, and croplands
evergreen needle leaf forests
deciduous needle leaf
deciduous broad leaf and other forests

Source: (Saatchi et al., 2023).
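A footprint filter along the lines of the GEDI filtering described above can be sketched as a single predicate. Several points are assumptions rather than the product's actual implementation: a quality_flag of 1 and a degrade_flag of 0 are taken to mean a usable shot, the sensitivity thresholds are applied as "greater than or equal", the record keys beam and dem_elevation are hypothetical, and the number-of-modes criterion is left out because its threshold is not given here.

```python
# Beam groups that belong to the full-power lasers.
POWER_BEAMS = {"BEAM0101", "BEAM0110", "BEAM1000", "BEAM1011"}


def keep_shot(shot, tropical):
    """Return True if a shot record passes the simplified filter."""
    min_sensitivity = 0.92 if tropical else 0.97
    elevation_difference = abs(shot["elev_lowestmode"] - shot["dem_elevation"])
    return (
        shot["quality_flag"] == 1            # assumed: 1 = usable waveform
        and shot["degrade_flag"] == 0        # assumed: 0 = not degraded
        and shot["beam"] in POWER_BEAMS
        and shot["sensitivity"] >= min_sensitivity
        and elevation_difference < 75        # metres
    )


# Made-up record for illustration.
example = {
    "beam": "BEAM1000",
    "quality_flag": 1,
    "degrade_flag": 0,
    "sensitivity": 0.95,
    "elev_lowestmode": 120.0,
    "dem_elevation": 118.0,
}
```

The example record passes with the tropical threshold but fails with the non-tropical one, which shows how much the region-specific sensitivity setting alone can change the number of retained shots.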

File naming conventions for Global Vegetation Height Metrics from GEDI and ICESat2

Time periods:

The grids are computed from GEDI data from 2019-2022 and ICESat-2 data from 2019-2021.

How are the files organized for the Global Vegetation Height Metrics from GEDI and ICESat2?

Each data file represents results from either GEDI or ICESat-2 for RH50, 75, 90, or 98, with an additional file for the count of GEDI samples found within each 100m ICESat-2 pixel. Every combination of instrument and RH metric is provided as a file at each of the 4 resolutions.

Gridded GEDI Vegetation Structure Metrics and Biomass Density at Multiple Resolutions

This product provides global, analysis-ready, multi-resolution vegetation structure metrics derived from L2 and L4A footprints. Grids at 1km, 6km, and 12km are available annually and for the entire mission duration (2019-04-17 to 2023-03-16). The gridded metrics for each combination of resolution and time period include canopy height, canopy cover, plant area index, foliage height diversity, and plant area volume density at 5 m strata. Each metric comes with eight statistics: mean, bootstrapped standard error of the mean, median, standard deviation, interquartile range, 95th percentile, Shannon’s diversity index, and shot count, plus 2 additional shot counts tallying the shots suitable for gridding by ground elevation or vegetation metrics. The more rigorous shot quality filtering method was adopted from the L4B Gridded AGBD methods: only shots acquired under leaf-on conditions with high geolocation accuracy were used, and additional flags, outlier detection, land cover thresholds, and elevation difference checks were applied. Users are encouraged to explore the data using ‘countf’ to select grid cells that used more than the minimum of 2 shots required to generate the grids, particularly in the tropics. Validation assessments and product improvements are ongoing.

Source: Mean foliage height diversity of GEDI shots acquired from April 2019 to March 2023 aggregated in 6-km grid cells (Burns et al., 2024).

File naming conventions for Gridded GEDI Vegetation Structure Metrics and Biomass Density at Multiple Resolutions

Time periods:

Each of the 26 metrics from L2A and L4A and 10 custom metrics derived from L2 were gridded at 3 spatial resolutions for each year between 2019-2023 and the full mission time period.

How are the files organized for Gridded GEDI Vegetation Structure Metrics and Biomass Density at Multiple Resolutions?

GEDI metric name/file	Description
agbd-a0-qf	Predicted above ground biomass density with the l4_quality_flag applied (Mg ha-1)
agbd-a0	Predicted above ground biomass density without the l4_quality_flag applied (Mg ha-1)
cover-a0	Total canopy cover, defined as the percent of the ground covered by the vertical projection of canopy material (unitless)
date-dec	Decimal date of acquisition (YYYY.nnnnn)
elev-lm-a0	Elevation of center of lowest mode (ground elevation) relative to WGS84 ellipsoid (meters)
even-pai-1m-a0	Evenness of the L2B 1 m vertical Plant Area Index profile (m-1). Calculated as: fhd_normal / log(ceiling(rh100))
even-pavd-5m-a0	Evenness of the L2B 5 m vertical Plant Area Volume Density (PAVD) profile (m-1). Calculated as: If (rh-100-a0 > 5) { fhd-pavd-5m-a0 / log (number nonzero PAVD bins) }
fhd-pai-1m-a0	Foliage height diversity (FHD), or Shannon entropy index, calculated from 1-m vertical bins in the foliage profile, normalized by total plant area (PAI) index (unitless)
fhd-pavd-5m-a0	FHD estimated from L2B 5 m plant area volume density (PAVD) vertical profile normalized by total PAVD (unitless)
num-modes-a0	Number of detected modes in rxwaveform (unitless)
pai-a0	Total Plant Area Index (PAI; m2 m-2)
pavd_0-5-frac	The fraction of PAVD in 0 to 5 m height bin relative to the sum of PAVD from all height bins (unitless)
pavd_X-Y	PAVD from X m to Y m (m2 m-3), where X and Y start at 0-5 and increment by 5 until pavd_75-80.
pavd-bot-frac	Fraction of PAVD in the bottom half of the canopy relative to the sum of PAVD from all height bins (unitless). The midpoint is calculated as: (round((rh-100-a0/2)/5)*5)/5
pavd-max-h	The upper height of the 5 m bin with maximum PAVD (m)
pavd-top-frac	Fraction of PAVD in the top half of the canopy relative to the sum of PAVD from all height bins (unitless). The midpoint is calculated as: (round((rh-100-a0/2)/5)*5)/5
rh-50-a0	Relative height (RH) at the 50th percentile of returned energy; height of median energy (m)
rh-95-a0	RH at the 95th percentile of returned energy; a proxy for canopy height (m)
rh-98-a0	RH at the 98th percentile of returned energy; a proxy for canopy height (m)
rhvdr-b	Bottom canopy vertical distribution ratio (VDR; unitless). Calculated as: If (rh-100-a0 > 5 & rh-50-a0 >= 0 & rh-98-a0 >= 0) { rh-50-a0 / rh-98-a0 }
rhvdr-m	Middle canopy VDR (unitless). Calculated as: If (rh-100-a0 > 5 & rh-25-a0 >= 0 & rh-75-a0 >= 0 & rh-98-a0 >= 0) { (rh-75-a0 - rh-25-a0) / rh-98-a0 }
rhvdr-t	Top canopy VDR (unitless). Calculated as: If (rh-100-a0 > 5 & rh-50-a0 >= 0 & rh-98-a0 >= 0) { (rh-98-a0 - rh-50-a0) / rh-98-a0 }
sens-a0	Maximum canopy cover that can be penetrated considering the SNR of the waveform (unitless)
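The three vertical distribution ratios in the table can be reproduced per shot from relative height values. The sketch below follows the conditions and formulas as written in the table and returns None where a condition is not met. The guard against a zero rh98 is an addition not stated in the table, made only to avoid division by zero, and the function name is hypothetical.

```python
def canopy_vdr(rh25, rh50, rh75, rh98, rh100):
    """Return (bottom, middle, top) VDR; None where the condition fails."""
    bottom = middle = top = None
    if rh98 == 0:  # extra guard, not in the table: avoid division by zero
        return bottom, middle, top
    if rh100 > 5 and rh50 >= 0 and rh98 >= 0:
        bottom = rh50 / rh98
        top = (rh98 - rh50) / rh98
    if rh100 > 5 and rh25 >= 0 and rh75 >= 0 and rh98 >= 0:
        middle = (rh75 - rh25) / rh98
    return bottom, middle, top


# Made-up relative heights in metres.
tall = canopy_vdr(rh25=5.0, rh50=10.0, rh75=15.0, rh98=20.0, rh100=22.0)
short = canopy_vdr(rh25=0.5, rh50=1.0, rh75=2.0, rh98=3.0, rh100=4.0)
```

For the taller made-up shot all three ratios equal 0.5, while the shorter one returns None throughout because rh100 does not exceed 5 m.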

Other Derived Datasets

Additional fusion based models and derivations such as for carbon stock can be found on the Earthdata catalog where all versions of the published data are available under GEDI data access and tools information.

In sum, each dataset carries capabilities and constraints that depend on systematic and methodological biases, spatial and temporal sampling patterns, and ecosystem characteristics. The metrics may vary in their performance across different biomes. Gridded products do not offer enough detail for certain contexts. Footprints and waveforms, however, can detect some disturbances or structural characteristics when quality controls are explored, data fusion is deployed, or calibration is performed. Before deciding on a product to use, review the requirements and potential limitations of footprint level or lower resolution products when using data for training, validation, fusion, or wall-to-wall mapping. Ensure the GEDI data align temporally with other datasets, and especially consider time-dependent phenology or other changes in ecosystem and forest structure. Lastly, prepare for the technical and knowledge requirements of optimizing data accuracy through extensive quality pre-processing and validation design.

Knowledge Check #1
Pick 3 application considerations over a hypothetical savanna ecosystem you are hoping to study: spatial coverage, time period at observation, error and uncertainty, wall-to-wall mapping, or producing yearly estimates. For each consideration, describe a distinct and/or shared trade-off between the footprint and gridded level products for GEDI’s elevation, height, and canopy height data.