GEDI’s Biomass Estimation Approach

Visualize GEDI and ICESat-2 Biomass Estimates Source: Video from NASA Scientific Visualization Studio.

In this section:

What biomass is and why it matters
How GEDI measures biomass
Key L4A and L4B products
How to use and interpret these products
Applications and case studies

Biomass is critical for understanding and managing environmental change, as it represents one of Earth’s largest terrestrial carbon pools (Pan et al., 2011). Changes in biomass directly affect climate through carbon cycling—increasing biomass removes CO2 from the atmosphere while decreasing biomass releases stored carbon, creating important climate feedbacks (Bonan, 2008). This makes biomass monitoring essential for carbon accounting under international climate agreements and validating emission reduction targets (Herold & Johns, 2007).

Beyond carbon, biomass distribution strongly correlates with biodiversity patterns, as higher biomass areas typically support more species and complex ecological communities, though the relationship varies by ecosystem type (Gillman et al., 2015). Biomass also underpins virtually all ecosystem services humans depend upon, including water regulation, soil stabilization, air purification, and coastal protection, with higher biomass ecosystems generally providing greater service capacity (Costanza et al., 2017). Biomass serves as an excellent indicator of overall environmental condition, revealing cumulative impacts of climate change, pollution, and land use change that might be difficult to assess individually (Pettorelli et al., 2005). While this lecture and tutorial will focus more on forest ecosystems, biomass monitoring provides critical information for many ecosystems: grasslands, prairies, agricultural systems, wetlands, shrublands, savannas, aquatic ecosystems, etc.

What is Biomass?

Understanding biomass gives us the foundation for why GEDI data is important: it’s the currency of carbon storage, ecosystem health, and climate change.

Biomass is the total mass of living organisms in a given area or ecosystem, often expressed as carbon mass (e.g., Mg C/ha).

Main biomass components:

Above Ground biomass (AGB) = dry mass of live vegetation above soil; often measured using remote sensing techniques or estimated using allometric equations

Below Ground biomass (BGB) = dry mass of live vegetation below soil; often estimated using root-to-shoot ratios

Woody/Litter Biomass = fallen/dead material which continues to contribute to carbon storage, cycling, and structure of the ecosystem.

Soil Organic Carbon = decomposed biomass, often included in carbon assessments

Higher AGB values are typically associated with denser forests, favorable growing conditions, and increased resource utilization (Ryan & Yoder, 1997). Changes in AGB over time can indicate environmental stressors such as drought, pollution, infestation, climate change, and anthropogenic impacts (Clark et al., 2001), as well as positive processes like forest regeneration, afforestation, and reforestation. BGB, in turn, indicates soil health and nutrient availability; stressed systems will generally see a decline in BGB (Brunner et al., 2015).
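To make the relationship between these components concrete, the sketch below converts an above ground biomass density into below ground biomass and live carbon. The root-to-shoot ratio and carbon fraction are assumptions chosen for illustration; neither comes from a GEDI product, and real values vary by ecosystem.

```python
# Illustrative conversion from above ground biomass to live carbon density.
# Both constants are assumptions for this sketch, not GEDI product values.
ROOT_TO_SHOOT = 0.25    # hypothetical BGB/AGB ratio; varies by ecosystem
CARBON_FRACTION = 0.47  # assumed share of dry biomass that is carbon


def live_carbon_density(agb_mg_ha, root_to_shoot=ROOT_TO_SHOOT,
                        carbon_fraction=CARBON_FRACTION):
    """Return AGB, BGB, and live carbon density, all per hectare."""
    bgb = agb_mg_ha * root_to_shoot   # below ground biomass (Mg/ha)
    total = agb_mg_ha + bgb           # total live biomass (Mg/ha)
    return {
        "agb_mg_ha": agb_mg_ha,
        "bgb_mg_ha": bgb,
        "carbon_mg_c_ha": total * carbon_fraction,
    }


result = live_carbon_density(200.0)
print(result)
```

Dead wood, litter, and soil organic carbon are left out here because they need their own measurements or models.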

Biomass in Various Ecosystems:

Forests:
- Tropical forests: Highest biomass density, rapid growth rates, strong carbon sinks (Malhi, 1999; Keith et al., 2009).
- Temperate forests: Moderate to high biomass with total carbon storage greater than boreal forests, and average biomass values higher than both tropical and boreal forests in some regions (Keith et al., 2009; Xu et al., 2013).
- Boreal forests: Lower biomass density but vast extent, with significant carbon stored in soils and permafrost where cold temperatures prevent complete breakdown of dead biomass (Malhi, 1999).
Grasslands: Lower biomass, but significant soil carbon.
Wetlands: Can store large carbon amounts in soil.
Mangroves: Exceptionally high carbon storage averaging 1,023 Mg/ha, storing 3-5 times more carbon per area than tropical forests and sequestering carbon at rates ten times greater than mature tropical forests (Donato et al., 2011; NOAA, 2024).

Biomass in Policy & Reporting:

Monitoring biomass is essential to ensure countries are meeting their climate commitments under agreements such as the Paris Climate Accord and REDD+ programs (Reducing Emissions from Deforestation and Forest Degradation). These reporting mechanisms track changes in land use and land cover along with biomass measurements to estimate forest inventories, validate carbon offset programs, and ensure climate policies are being addressed. The resulting reports provide insight into conservation efforts at the national level, identify high-risk areas, monitor disaster areas, and evaluate management strategies.

National forest inventories (NFIs) are one avenue used to document nation-wide information on forest resources and to inform policy development. NFIs primarily rely on ground-based sampling across permanent plots distributed throughout a country, where tree diameters, heights, species, and health are measured and tracked over time. These measurements are then fed into allometric equations to estimate biomass.
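As a rough illustration of that last step, the sketch below applies a generic power-law allometry to the stem diameters measured on one plot and scales the total to Mg/ha. The coefficients `A_COEF` and `B_COEF` are placeholders invented for this example; real inventories use published species- or region-specific equations, often with height and wood density as extra predictors.

```python
# Generic power-law allometry: tree AGB (kg) = a * D**b, with D in cm.
# The coefficients below are placeholders for illustration only.
A_COEF = 0.1
B_COEF = 2.5


def tree_agb_kg(diameter_cm, a=A_COEF, b=B_COEF):
    """Estimate one tree's above ground biomass in kilograms."""
    return a * diameter_cm ** b


def plot_agbd_mg_ha(diameters_cm, plot_area_m2):
    """Sum tree biomass over a plot and scale it to Mg per hectare."""
    total_kg = sum(tree_agb_kg(d) for d in diameters_cm)
    total_mg = total_kg / 1000.0             # kg -> Mg (metric tons)
    plot_area_ha = plot_area_m2 / 10_000.0   # m^2 -> ha
    return total_mg / plot_area_ha


# Hypothetical 20 m x 20 m plot with four measured stems.
agbd = plot_agbd_mg_ha([12.0, 18.5, 25.0, 31.0], plot_area_m2=400.0)
print(f"{agbd:.1f} Mg/ha")
```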

As more remotely sensed datasets become available, with methods tested and documented, governments are beginning to combine space-based information with ground data to produce more precise estimates of forest conditions.

Although lidar has traditionally been an expensive option for biomass monitoring using airborne or ground-based approaches, spaceborne lidar systems such as GEDI are making it increasingly cost-effective and offer different capabilities than non-lidar remote sensing:

Technique	Pros	Cons	Citations
Optical	Free, large-area coverage, long time series	Saturates in dense forests, cloud-sensitive, indirect	Lu (2006); Asner et al. (2010)
SAR	All-weather, structural sensitivity, can detect degradation	Saturation at high biomass, complex processing	Saatchi et al. (2011); Lucas et al. (2015)
Lidar	Direct 3D structure, accurate biomass estimates	Limited coverage, expensive, requires calibration	Dubayah et al. (2020); Lefsky et al. (2002)

How GEDI Models Biomass:

Knowing how GEDI turns laser signals into biomass estimates helps us trust the data, interpret uncertainty, and choose the right variables for our analysis.

GEDI biomass products (L4A, L4B) sit at the core of a wide range of ecological, climate, and land management applications. How you filter and process GEDI data critically shapes the outputs and determines suitability for different applications. L4A (footprint-level) provides raw high-resolution sampling, while L4B (gridded) provides regionally representative biomass estimates.

Before diving into the biomass estimates themselves, we first need to understand how those estimates were generated:

GEDI’s biomass models involve several kinds of variables: model inputs and outputs, statistical metrics, ancillary information, and quality or system tracking variables.

Equations (Pre-Launch Calibration):

Before launch, GEDI’s biomass calibration equations were built by combining lidar-derived structural metrics with standardized ground plot measurements. Airborne and terrestrial lidar were used to simulate GEDI-like waveforms, which were then calibrated against field biomass data to develop predictive equations for different forest types. These equations form the backbone of GEDI’s operational system.

From Equations to Operations:

Once in orbit, GEDI applies these pre-calibrated equations to new laser measurements from space. To do this, the system combines several types of input data with the equations, and then produces a set of outputs that describe biomass and its uncertainty:

Input:

Primary Input: GEDI laser measurements from space that measure forest height and structure
Land Cover Information: Satellite data that tells the algorithm what type of vegetation is in each location (tropical forest, grassland, etc.)
Geographic Regions: Information about which part of the world each measurement comes from, since forests behave differently in different regions
Calculation Models: Mathematical formulas specifically developed for different forest types and regions
Algorithm Selection: The system automatically chooses the best method to analyze each laser shot, though sometimes it needs to use backup methods when the primary approach doesn’t work well (like when there’s interference or noise in the signal)

Output:

Biomass Estimates: The main result - how much vegetation mass is present (measured in metric tons per hectare)
Prediction Intervals: A range showing the likely spread around each estimate (for example, “between 150 and 200 tons per hectare”)
Quality Flags: Indicators telling you whether each measurement is reliable or should be used with caution
Multiple Options: Several biomass estimates using different calculation methods, with the algorithm recommending which one is most reliable for each location
Supporting Data: Additional measurements and calculations that help explain how the biomass estimate was determined

The product data dictionary and documentation provide a comprehensive, detailed set of tables that break down the different variable groups. Here, we’ve regrouped those variables to make it clearer whether each one acts as an input or an output.

Primary Biomass Outputs

These three variables form the foundation of most GEDI L4A analyses. The agbd variable gives you the biomass estimate, agbd_se tells you its uncertainty, and l4_quality_flag helps you decide whether to trust it.

Variable Name	Units	Description	Why It Matters
agbd	Mg/ha	Selected best biomass prediction	This is your main result - the algorithm’s best estimate of forest biomass density at each location
agbd_se	Mg/ha	Standard error for selected prediction	Essential for uncertainty assessment - tells you how reliable each biomass estimate is
l4_quality_flag	0 or 1	Overall quality flag for selected prediction	Critical for data filtering - use this to identify high-quality measurements vs. potentially unreliable ones
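A minimal sketch of how these three variables work together is shown below. Each dictionary stands in for one L4A shot with made-up values; in practice the records would be read from the HDF5 granules.

```python
from statistics import mean

# Each dict stands in for one GEDI L4A shot; the values are made up.
shots = [
    {"agbd": 120.0, "agbd_se": 11.0, "l4_quality_flag": 1},
    {"agbd": 310.0, "agbd_se": 95.0, "l4_quality_flag": 0},
    {"agbd": 80.0, "agbd_se": 10.0, "l4_quality_flag": 1},
]


def keep_good(records):
    """Keep only shots whose overall L4 quality flag is set to 1."""
    return [r for r in records if r["l4_quality_flag"] == 1]


def relative_se(record):
    """Standard error as a fraction of the biomass estimate (unitless)."""
    return record["agbd_se"] / record["agbd"]


good = keep_good(shots)
mean_agbd = mean(r["agbd"] for r in good)
print(len(good), "shots kept; mean AGBD =", mean_agbd, "Mg/ha")
```

Looking at `relative_se` alongside the flag is a simple way to spot shots that pass the quality check but still carry large uncertainty.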

Essential Geolocation Data

These variables are mandatory for any spatial analysis, allowing you to map biomass estimates and integrate with other geospatial datasets.

Variable Name	Units	Description	Why It Matters
lat_lowestmode	degrees	Latitude of measurement location	Required for mapping - precise coordinates for placing biomass estimates geographically
lon_lowestmode	degrees	Longitude of measurement location	Required for mapping - essential for all spatial analyses and visualization
shot_number	-	Unique observation identifier	Data management - critical for linking to other GEDI products and avoiding duplicate measurements
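The sketch below shows two common uses of these variables: dropping repeated shots by `shot_number` and subsetting to a bounding box with the lowest-mode coordinates. The records and the box are invented for illustration.

```python
def in_bbox(shot, west, south, east, north):
    """True when the shot's lowest-mode coordinates fall inside the box."""
    return (west <= shot["lon_lowestmode"] <= east
            and south <= shot["lat_lowestmode"] <= north)


def dedupe_by_shot_number(shots):
    """Keep the first record seen for each shot_number, preserving order."""
    seen = set()
    unique = []
    for shot in shots:
        if shot["shot_number"] not in seen:
            seen.add(shot["shot_number"])
            unique.append(shot)
    return unique


# Made-up records; real shot numbers are much longer integers.
shots = [
    {"shot_number": 101, "lat_lowestmode": 33.10, "lon_lowestmode": -88.50},
    {"shot_number": 102, "lat_lowestmode": 35.90, "lon_lowestmode": -88.40},
    {"shot_number": 101, "lat_lowestmode": 33.10, "lon_lowestmode": -88.50},
]
unique_shots = dedupe_by_shot_number(shots)
subset = [s for s in unique_shots
          if in_bbox(s, west=-89.0, south=33.0, east=-88.0, north=34.0)]
print([s["shot_number"] for s in subset])
```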

Useful Supporting Variables:

These variables provide ecological context for interpreting biomass patterns. They are especially useful for ecological analysis and data validation, and they help explain why biomass varies across landscapes.

Land Cover Context:

Variable Name	Units	Description	Practical Application
landsat_treecover	percent	Tree cover in 2010 (canopy >5m)	Validation tool - helps verify that high biomass areas correspond to dense forest cover
pft_class	-	Plant Functional Type from MODIS	Ecosystem classification - distinguishes between forest types (evergreen, deciduous, mixed) for targeted analysis
region_class	-	World continental regions (0-7)	Geographic stratification - enables continent-specific analyses and comparisons

Source: Global stratification by five plant functional type (PFT) groupings from the MODIS MCD12Q1 V006 PFT dataset (subfigure A) and by world region (subfigure B), as used to produce the footprint-level AGBD models. The gray box indicates the 51.6 degrees N and S latitude range of GEDI. PFT: DBT (deciduous broadleaf trees), DNT (deciduous needleleaf trees), EBT (evergreen broadleaf trees), ENT (evergreen needleleaf trees), GSW (grasses, shrubs, and woodlands). Regions: Af (Africa), Au (Australia and Oceania), Eu (Europe), N-Am (North America north of southern Mexico), N-As (North Asia), S-Am (South America, Central America, southern Mexico, and the Caribbean), S-As (South Asia) (Kellner et al., 2021).

Extended Uncertainty Information:

Prediction intervals provide broader uncertainty ranges than standard errors, useful for risk assessment and conservative carbon accounting.

Variable Name	Units	Description	When To Use
agbd_pi_lower	Mg/ha	Lower prediction interval	Conservative estimates - use when you need the lower bound of likely biomass values
agbd_pi_upper	Mg/ha	Upper prediction interval	Maximum estimates - useful for understanding the upper range of possible biomass

Technical Variables:

These variables are primarily useful for algorithm development, detailed quality assessment, or specialized research applications.

Algorithm-Specific Outputs:

Variables ending in _aN (where N = 1-6 or 10) represent outputs from different algorithm setting groups:

`agbd_aN` - Biomass predictions from each algorithm group
`l2_quality_flag_aN` - L2 data quality for each algorithm group
`algorithm_run_flag_aN` - Algorithm execution status
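Because these variables follow a fixed naming pattern, their names can be generated rather than typed out. The sketch below assumes the setting-group numbers are 1-6 and 10, based on the groups mentioned in this section, and uses a flat dictionary as a stand-in for one shot's record.

```python
# Setting-group numbers assumed from the groups named in this section.
SETTING_GROUPS = (1, 2, 3, 4, 5, 6, 10)


def per_group_names(base):
    """Build the per-group variable names, e.g. agbd_a1 ... agbd_a10."""
    return [f"{base}_a{n}" for n in SETTING_GROUPS]


def value_for_group(record, base, group):
    """Look up one per-group value, rejecting unknown group numbers."""
    if group not in SETTING_GROUPS:
        raise ValueError(f"unknown algorithm setting group: {group}")
    return record[f"{base}_a{group}"]


# Made-up record holding two of the per-group predictions for one shot.
record = {"agbd_a1": 142.0, "agbd_a2": 137.5}
print(per_group_names("agbd"))
print(value_for_group(record, "agbd", 2))
```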

Model Parameters & Configuration:

The ANCILLARY folder contains mathematical details of biomass prediction generation, including model group classifications (`model_group`), statistical coefficients (`par`), and variance-covariance matrices (`vcov`) used in the algorithm development process. These parameters are essential for researchers developing new biomass models or conducting advanced uncertainty analyses but are rarely needed for standard forest carbon applications.

These technical variables can be safely ignored by most users focused on biomass mapping and analysis.

Biomass Algorithm structure:

GEDI’s models take the form of parametric Ordinary Least Squares (OLS) models with simulated Relative Height (RH) metrics as predictor variables. GEDI processes data using multiple algorithm setting groups, with the system automatically selecting the most appropriate algorithm for each footprint based on environmental conditions and waveform quality. The system provides AGBD predictions for each of these algorithm setting groups with biomass in natural and transformed units and associated prediction uncertainty, allowing users to evaluate and select alternative algorithm setting groups.

Model Groups and Types

Model Group 1: All predictors considered (uses full set of RH metrics)
Model Group 2: No RH metrics below RH50 (excludes lower canopy information)
Model Group 3: Forced inclusion of RH98 (ensures tall vegetation capture)
Model Group 4: Forced inclusion of RH98 and no RH metrics below RH50 (combines approaches)

The algorithm employs a stratified modeling approach where different models are applied based on:

Plant Functional Type (PFT): Different forest types (e.g., DBT_Af = Deciduous Broadleaf Tree, Africa)
Geographic Region: Continental regions (Europe, North Asia, Australasia, Africa, South Asia, South America, North America)
Environmental Conditions: Leaf-on/leaf-off periods, urban proximity, water persistence

Statistical Framework

Base Method: Ordinary Least Squares (OLS) regression
Predictors: Scaled and transformed RH (Relative Height) metrics
Transformations: Square root, logarithmic, or no transformation applied to both predictors and response variables
Bias Correction: Back-transform bias correction methods
Uncertainty Quantification: Prediction intervals and standard errors provided
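To illustrate how a transformed linear model turns RH metrics into a biomass value, the sketch below fits nothing; it only evaluates a model of the square-root form with invented coefficients. The coefficient values, the `offset` added to keep square roots defined for negative RH values, and the specific bias correction (adding the residual variance) are all assumptions for illustration, and the operational product may differ in each of them.

```python
import math


def predict_agbd_sqrt_model(rh_metrics, coefs, intercept, residual_var,
                            offset=100.0):
    """Evaluate a linear model fitted on square-root scales and back-transform.

    Transformed scale: sqrt(AGBD) = b0 + sum(b_i * sqrt(RH_i + offset)).
    """
    mu = intercept + sum(
        coefs[name] * math.sqrt(rh_metrics[name] + offset) for name in coefs
    )
    # If sqrt(y) = mu + e with Var(e) = sigma^2, then E[y] = mu^2 + sigma^2,
    # so squaring alone would underestimate the mean.
    return mu ** 2 + residual_var


# Hypothetical coefficients and RH values (metres), chosen for easy arithmetic.
rh = {"rh50": 0.0, "rh98": 44.0}
coefs = {"rh50": 1.0, "rh98": 0.5}
agbd = predict_agbd_sqrt_model(rh, coefs, intercept=-12.0, residual_var=1.5)
print(agbd, "Mg/ha")
```

The back-transform step is the reason a bias correction appears in the list above: without it, predictions made on a transformed scale are systematically low once converted back to Mg/ha.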

The system automatically selects the best algorithm setting group for each footprint based on:

Waveform quality and signal-to-noise ratio
Ground detection reliability
Environmental conditions
Model performance metrics

The comprehensive nature of this system allows users to access multiple biomass estimates per location and choose the most appropriate one for their specific application, while the stratified modeling approach ensures that predictions are tailored to local ecological conditions and geographic contexts.

Figure 1. Distribution of Selected L4 algorithm settings across the Southern Pine Beetle study area (SPB_AOI_GEDI_L4A). Frequency represents the number of GEDI shots, with the majority of shots (approximately 5,000 each) concentrated at algorithm settings of 1.0 and 2.0.

Algorithm selection distribution for a Southern Pine Beetle study area shows how GEDI’s automatic selection process favors algorithms 1 and 2 for this forest ecosystem, demonstrating ecosystem-specific algorithm performance in practice.

The dominance of algorithm setting groups 1 and 2 in this forest ecosystem suggests that:

Setting groups 1 and 2 gave the most reliable waveform interpretation (ground and canopy detection) for this vegetation type
These setting groups are distinct from the four model groups listed above, which describe which RH predictors a biomass model may use
The absence of setting groups 3-6 and 10 indicates they were not selected as the best option for any shot in this study area
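A tally like the one in Figure 1 can be reproduced with a simple counter over the `selected_algorithm` values. The list below is made up and much shorter than a real study area.

```python
from collections import Counter

# Made-up selected_algorithm values, one per shot.
selected = [1, 2, 2, 1, 1, 2, 5, 1]

counts = Counter(selected)


def shares(counter):
    """Convert counts to fractions of all shots, ordered by group number."""
    total = sum(counter.values())
    return {group: n / total for group, n in sorted(counter.items())}


fractions = shares(counts)
for group, fraction in fractions.items():
    print(f"setting group {group}: {counts[group]} shots ({fraction:.1%})")
```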

Knowledge Check #11
True or False: GEDI’s pre-launch above ground biomass calibration equations were generated from lidar data only.

Accessing GEDI’s Biomass Products

L4A and L4B are the core GEDI biomass datasets. Knowing their differences ensures you pick the right tool for your research, whether you’re zoomed in on a forest plot or mapping a whole region.

Through its L4A and L4B products, GEDI provides a calibrated, consistent global biomass dataset that is not possible with optical or radar data alone. These products sit at the top of the GEDI product suite:

L1B: raw waveforms.
L2A: ground/canopy heights, RH metrics.
L2B: canopy cover, PAI, PAVD, FHD
L3: gridded elevation/canopy height metrics
L4: biomass (AGBD)
- L4A: footprint-level (~25 m) biomass estimates
- L4B: aggregated (~1 km) gridded biomass maps
L4C: waveform structural complexity index (WSCI)

Accessing and downloading GEDI L4 data:

NASA Earthdata Portal: GEDI L4A (footprint-level biomass density) and L4B (gridded biomass estimates) data are available through NASA Earthdata.

ORNL DAAC (Oak Ridge National Laboratory): Higher level GEDI products (L3 & L4) are available from the ORNL DAAC, while lower level products (L1 & L2) are available from NASA LP DAAC.

GEDI L4A: https://doi.org/10.3334/ORNLDAAC/2056
GEDI L4B: https://doi.org/10.3334/ORNLDAAC/2299

Google Earth Engine: Available as footprint-level data, monthly raster composites, and gridded biomass, providing predictions of above ground biomass density (AGBD, in Mg/ha) and prediction standard error estimates.

L4A Footprint-level: `LARSE/GEDI/GEDI04_A_002`
L4A Monthly Raster: `LARSE/GEDI/GEDI04_A_002_MONTHLY`
L4B Global Grid: `LARSE/GEDI/GEDI04_B_002`

L4A Footprint Level Biomass Estimations

GEDI L4A Footprint Level Above Ground Biomass Density, Version 2.1

Product Summary:

Spatial Extent	N: 55.7983 S: -53 E: 180 W: -180
Coordinate System	Cartesian
Temporal Extent	2019-04-17 to 2024-11-27
NASA DAAC	Oak Ridge National Laboratory
Concept ID	C2237824918-ORNL_CLOUD
Processing Level	4
Data Format	HDF5
Spatial Resolution	Footprint: ~25m in diameter

Naming Convention:

GEDI04_A_YYYYDDDHHMMSS_O[orbit_number]_[granule_number]_T[track_number]_[PPDS_type]_[release_number]_[production_version]_V[version_number].h5

where:

GEDI04_A	product short name
YYYYDDDHHMMSS	date and time of acquisition in Julian day of year, hours, minutes, and seconds format
[orbit_number]	orbit number
[granule_number]	sub-orbit granule (or file) number
[track_number]	track number
[PPDS_type]	positioning and pointing determination system (PPDS) type (00 is “predict”, 01 is “rapid”, 02 and higher is “final”)
[release_number]	release number (002), representing the SOC SDS (software) release used to generate this L4A dataset. Granules with a release number <= ‘002’ were processed by the GEDI Science Team at the University of Maryland; those with a release number >= ‘003’ were processed by the Science Operations Center (SOC) at Goddard Space Flight Center. The SOC started processing GEDI L4A granules in mission week 163 (2022-01-20).
[production_version]	granule production version; a particular data granule (or file) may have been regenerated multiple times
[version_number]	L4A dataset production version (002), corresponding to the ORNL DAAC’s dataset version number
.h5	file extension, HDF5 format
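The naming convention above can be unpacked with a regular expression, as sketched below. The example filename is constructed to follow the pattern and is not a reference to a specific granule; the field names in the pattern are our own labels.

```python
import re
from datetime import datetime

# One named group per component of the L4A naming convention.
L4A_PATTERN = re.compile(
    r"^GEDI04_A_(?P<acquired>\d{13})_O(?P<orbit>\d+)_(?P<granule>\d+)"
    r"_T(?P<track>\d+)_(?P<ppds>\d+)_(?P<release>\d+)"
    r"_(?P<production>\d+)_V(?P<version>\d+)\.h5$"
)


def parse_l4a_name(filename):
    """Split an L4A filename into its components; raise if it does not match."""
    match = L4A_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not an L4A granule name: {filename}")
    fields = match.groupdict()
    # YYYYDDDHHMMSS: year, day of year, hours, minutes, seconds.
    fields["acquired"] = datetime.strptime(fields["acquired"], "%Y%j%H%M%S")
    return fields


# Constructed example that follows the convention.
info = parse_l4a_name("GEDI04_A_2019108002012_O01959_01_T03909_02_002_02_V002.h5")
print(info["acquired"].isoformat(), "orbit", info["orbit"])
```

Numeric components are kept as strings so that leading zeros survive; convert them with `int()` where arithmetic is needed.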

File Structure:

The data dictionary provides a detailed description of each available variable, either directly measured or derived. Each GEDI L4A data file is like a filing cabinet with several main folders:

Folder/Subfolder	Description	Contents
METADATA	Contains basic information about the dataset	Dataset overview and general information.
BEAM0000-BEAM1011	Individual folders for each of GEDI’s eight laser beams	All biomass measurements and related information for each beam.
└─ Main beam data	Core measurement data	Biomass predictions, quality flags, uncertainty estimates
└─ Geolocation subfolder	Geographic positioning data	GPS coordinates (latitude, longitude). Information is also organized by different algorithm groups
└─ Land cover subfolder	Land surface classification data	Tree cover information from Landsat satellites, land cover type, urban area classifications, seasonal vegetation information.
└─ Biomass prediction subfolder	Multiple biomass calculation methods	Multiple biomass estimates using different calculation methods, uncertainty measures for each estimate, and options to choose the best estimate for specific needs.
ANCILLARY	Reference and calculation support data	Model parameters used to calculate biomass and lookup tables that translate codes into meaningful categories.
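The filing-cabinet layout above can be walked with a few lines of code. The sketch below uses nested dictionaries as a stand-in for an opened granule so that it runs without any data; the subgroup name `land_cover_data` follows the variable paths used later in this section, and the values are invented.

```python
def iter_beams(l4a_file):
    """Yield (beam_name, beam_group) for every BEAMxxxx group in the file."""
    for name in l4a_file.keys():
        if name.startswith("BEAM"):
            yield name, l4a_file[name]


def collect(l4a_file, variable, subgroup=None):
    """Concatenate one variable across all beams into a single list."""
    values = []
    for _, beam in iter_beams(l4a_file):
        group = beam[subgroup] if subgroup else beam
        values.extend(group[variable][:])  # [:] reads the whole array
    return values


# Stand-in for an opened granule, with made-up values.
fake_file = {
    "METADATA": {},
    "BEAM0000": {"agbd": [10.0, 20.0], "land_cover_data": {"pft_class": [2, 2]}},
    "BEAM0001": {"agbd": [30.0], "land_cover_data": {"pft_class": [4]}},
    "ANCILLARY": {},
}
print(collect(fake_file, "agbd"))
print(collect(fake_file, "pft_class", subgroup="land_cover_data"))
```

With the h5py library installed, the same two functions should also accept an object opened with `h5py.File(path, "r")`, since it supports the same key lookup and `[:]` slicing.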

L4B Gridded Biomass Estimations

GEDI L4B Gridded Above Ground Biomass Density, Version 2.1

Product Summary:

Spatial Extent	N: 52 S: -52 E: 180 W: -180
Coordinate System	Cartesian
Temporal Extent	2019-04-18 to 2023-03-16*
NASA DAAC	Oak Ridge National Laboratory
Concept ID	C2792577683-ORNL_CLOUD
Processing Level	4
Data Format	Cloud Optimized GeoTIFF
Spatial Resolution	1 km x 1 km

*L4B has limited temporal coverage compared to L4A due to processing differences

Naming Convention:

GEDI04_B_[Mission_Week_Range]_[PPDS]_[SDS]_[version]_[spatial_resolution]_[variable].tif

where:

GEDI04_B	product short name representing GEDI Level 4B data (gridded above ground biomass)
[Mission_Week_Range]	Mission week range: MWxxxMWxxx (start week 19 to end week 138)
[PPDS]	Positioning and pointing determination system type
[SDS]	SOC/SDS software release number
[version]	Product version number
[spatial_resolution]	Spatial resolution indicator: R01000M (1 km = 01000 meters)
[variable]	Specific layer contained in the file, such as: MU (mean AGBD estimate); V1 / V2 (variance components: model vs. sampling error); SE (standard error of the mean); PE (percent error: standard error as a percentage of the mean); NC (number of GEDI clusters, i.e. tracks, in the grid cell); NS (number of GEDI samples, i.e. footprints); QF (quality flag); PS (prediction stratum); MI (mode of inference used)
.tif	file extension, image
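The relationship between the MU, SE, and PE layers can be checked with a small calculation. The sketch below works on plain nested lists rather than real rasters, and the `NODATA` fill value is an assumption for illustration; the actual value should be read from each file's metadata.

```python
import math

NODATA = -9999.0  # assumed fill value for this sketch


def percent_error(mu, se, nodata=NODATA):
    """Return 100 * SE / MU per cell, or NaN where a value is missing or MU is 0."""
    result = []
    for mu_row, se_row in zip(mu, se):
        row = []
        for m, s in zip(mu_row, se_row):
            if m == nodata or s == nodata or m == 0:
                row.append(math.nan)
            else:
                row.append(100.0 * s / m)
        result.append(row)
    return result


# Made-up 2 x 2 grids of mean AGBD and its standard error (Mg/ha).
mu_grid = [[100.0, NODATA], [50.0, 0.0]]
se_grid = [[20.0, NODATA], [5.0, 1.0]]
pe_grid = percent_error(mu_grid, se_grid)
print(pe_grid)
```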

GEDI L4B Visualization via Google Earth Engine Application

Key Differences between L4A and L4B Products

The GEDI L4A and L4B data products serve distinct but complementary purposes, with key differences in resolution, format, use cases, and accessibility.

GEDI L4A provides footprint-level above ground biomass density (AGBD) estimates tied to individual GEDI laser footprints (~25 m diameter), representing high-resolution samples from each GEDI shot (Dubayah et al., 2022a; Kellner, Armston, & Duncanson, 2023). In contrast, GEDI L4B reports aggregated mean AGBD on a 1 km × 1 km grid, a statistical inference of the mean for each cell based on the sampled footprints (Dubayah et al., 2022b; Healey, Yang, Patterson, & Ghazoul, 2023). The 25 m footprint resolution of L4A supports plot- or site-scale analyses, such as detailed structural studies, whereas the 1 km resolution of L4B is explicitly designed for regional-to-global mapping and trend analyses where coarser, wall-to-wall products are more appropriate (Duncanson, Kellner, Armston, et al., 2022).

The L4A product is distributed as HDF5 files (*.h5) that preserve footprint-level attributes, model inputs, multiple algorithm settings, and per-footprint uncertainty metrics, which suits programmatic workflows and detailed calibration/diagnostics (Dubayah et al., 2022a). L4B, in contrast, is distributed as cloud-optimized GeoTIFFs and is also accessible via platforms such as Google Earth Engine, providing simple raster layers (mean, standard error, quality flags) that plug directly into GIS, modeling pipelines, and gridded analyses (Dubayah et al., 2022b; NASA GEDI Science Team, 2025). This difference reflects the intended users: modelers and calibration scientists for L4A, who need many ancillary fields, versus applied analysts and model integrators for L4B, who benefit from ready-to-use rasters (Healey et al., 2023).

Because L4A contains per-footprint predictions, model inputs, and uncertainty for each shot, it is particularly valuable for calibration, validation, algorithm development, footprint-level error analysis, and linking GEDI measurements to field plots or airborne lidar (Duncanson, Kellner, Armston, et al., 2022; Kellner et al., 2023). L4B implements hybrid, model-based gridding (described in the L4B ATBD and associated publications) to infer 1 km mean AGBD and its standard error, making it the preferred product for regional carbon accounting, CEOS/UNFCCC reporting inputs, global change analyses, and applications requiring wall-to-wall gridded coverage (Healey et al., 2023; Dubayah et al., 2022b).

Finally, the tradeoff between data volume and accessibility is notable. L4A is larger and more complex, with thousands of HDF5 granules containing extensive ancillary fields, increasing storage, I/O, and preprocessing demands, but providing full access to footprint metadata and uncertainty diagnostics (Dubayah et al., 2022a). L4B is compact and user-friendly, with a small number of cloud-optimized GeoTIFFs covering the mission period, lowering storage and processing overhead and enabling easy ingestion into GIS, cloud platforms, or Earth system models (Dubayah et al., 2022b; NASA GEDI Science Team, 2025). In practice, many users run calibration and method development with L4A and then scale analyses using L4B for mapping and reporting (Duncanson, Kellner, Armston, et al., 2022).

The choice between L4A and L4B also depends on the ecosystem type and specific application. L4A is particularly useful in heterogeneous or structurally complex ecosystems, such as tropical rainforests, montane forests, or fragmented landscapes, where fine-scale variations in biomass are critical (Duncanson, Kellner, Armston, et al., 2022). It is also appropriate for high-resolution ecological studies, validation of airborne lidar datasets, and plot-level carbon stock estimation (Kellner et al., 2023). L4B is better suited for relatively homogeneous ecosystems, such as boreal forests or savannas, and for applications that prioritize broad spatial coverage over fine-scale detail, including regional carbon monitoring, ecosystem modeling, and national-scale reporting (Healey et al., 2023; Dubayah et al., 2022b). In practice, many researchers combine both products: L4A for calibration and method development at sample sites, and L4B to scale estimates across landscapes or continents (NASA GEDI Science Team, 2025).

L4A and L4B Comparison Summary Table:

Feature	L4A (Footprint-Level)	L4B (Gridded)
Resolution	~25 m laser footprints	1 km × 1 km grid cells
Format	HDF5 files with multiple folders (metadata, beams, land cover, biomass estimates, uncertainty)	Cloud-optimized GeoTIFFs with simple raster layers (MU, SE, QF, etc.)
Best For	Plot/site-level studies; linking to field plots; calibration & validation; algorithm testing	Regional/national/global mapping; policy reporting (REDD+, UNFCCC); landscape-scale analyses; Earth system modeling
Key Outputs	agbd; agbd_se; l4_quality_flag	Mean biomass (MU); standard error (SE); quality flags (QF)
Strengths	Very detailed, diagnostic; includes uncertainty per footprint; links directly to field data	Ready-to-use raster maps; easy integration in GIS/cloud; compact data size
Limitations	Complex, large files; steeper learning curve; not wall-to-wall coverage	Coarser resolution; some local detail lost; relies on L4A inputs & modeling

Recommended rule of thumb:

Use L4A when you need detail and diagnostics.
Use L4B when you need coverage and scalability.

Knowledge Check #12
**Which dataset (L4A vs L4B) would you use for plot-level forest inventory? Which would you use for national carbon accounting?**

Putting GEDI Biomass Estimations Into Action:

This section details the variables utilized across many biomass related analyses. Recommendations on preparation and quality checking are outlined so you don’t get lost in the details.

Filtering and processing choices matter a lot!

Pair plots demonstrate the impact of filtering in practice.

The matrices below display pairwise relationships between multiple lidar-derived forest structure variables, including canopy height metrics (rh25, rh50, rh75, rh95, rh98), quality flags, sensitivity measures, and various canopy cover and biomass-related parameters. Each cell contains either a scatter plot showing the relationship between two variables (off-diagonal elements) or a histogram/density plot showing the distribution of a single variable (diagonal elements, shown in pink). The plot matrix allows for visual assessment of correlations, distributions, and potential outliers across the full suite of GEDI L4A forest structure metrics, with darker blue points indicating higher data density in the scatter plots. This type of exploratory data visualization is commonly used in remote sensing applications to understand relationships between different forest structural parameters derived from spaceborne lidar measurements.

Figure 2. Comprehensive pair plot matrix of valid GEDI (Global Ecosystem Dynamics Investigation) metrics for AOI (Area of Interest) SPB_AOI_GEDI_L4A.

Figure 3. Comprehensive pair plot matrix of GEDI (Global Ecosystem Dynamics Investigation) metrics for AOI SPB_AOI_GEDI_L4A after data filtering has been applied.

Figure 2 shows all original data, while Figure 3 shows filtered data. Compared to the unfiltered data, this filtered dataset shows cleaner distributions and relationships, with reduced noise and outliers, while maintaining the core patterns between forest structural parameters. The darker blue points in scatter plots indicate higher data density. This quality-controlled visualization allows for improved assessment of correlations and distributions across the suite of GEDI L4A forest structure metrics, facilitating more reliable analysis of spaceborne lidar-derived forest characteristics.

The bimodal sensitivity distribution visible in the pair plots shows why sensitivity is a critical filtering variable: it clearly separates high-quality measurements from those with poor ground detection. The AGBD outliers present in the data demonstrate the need for uncertainty-based filtering to remove unreliable biomass estimates that could skew analyses. The distinct beam type clustering patterns support treating power beams and coverage beams separately in analysis.

Different applications may use stricter or looser filters. Uncertainty and quality control are central, emphasizing that not all GEDI shots are equally reliable. Applying quality flags, beam sensitivity, slope corrections, etc., is necessary to make biomass estimates robust. Ecosystem relevance depends on filtering and aggregation. For example, if you want biomass for “forests only,” you’d apply vegetation cover filters.
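The idea of stricter and looser filters can be sketched as a small rule object that is applied to each shot and reports why shots were dropped. The thresholds below are illustrative assumptions rather than official recommendations, and the records are made up.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterRule:
    """Thresholds are illustrative assumptions, not official recommendations."""
    min_sensitivity: float = 0.95
    min_treecover: float = 0.0  # percent; raise it to keep forested shots only


def rejection_reason(shot, rule):
    """Return the name of the first failed check, or None when the shot passes."""
    if shot["l4_quality_flag"] != 1:
        return "l4_quality_flag"
    if shot["degrade_flag"] != 0:
        return "degrade_flag"
    if shot["sensitivity"] < rule.min_sensitivity:
        return "sensitivity"
    if shot["landsat_treecover"] < rule.min_treecover:
        return "landsat_treecover"
    return None


def apply_filter(shots, rule):
    """Split shots into kept records and a tally of rejection reasons."""
    kept, rejected = [], Counter()
    for shot in shots:
        reason = rejection_reason(shot, rule)
        if reason is None:
            kept.append(shot)
        else:
            rejected[reason] += 1
    return kept, rejected


shots = [
    {"agbd": 150.0, "l4_quality_flag": 1, "degrade_flag": 0,
     "sensitivity": 0.98, "landsat_treecover": 85.0},
    {"agbd": 40.0, "l4_quality_flag": 1, "degrade_flag": 0,
     "sensitivity": 0.96, "landsat_treecover": 12.0},
    {"agbd": 900.0, "l4_quality_flag": 0, "degrade_flag": 0,
     "sensitivity": 0.91, "landsat_treecover": 70.0},
    {"agbd": 120.0, "l4_quality_flag": 1, "degrade_flag": 0,
     "sensitivity": 0.93, "landsat_treecover": 60.0},
]
kept_all, why_all = apply_filter(shots, FilterRule())
kept_forest, why_forest = apply_filter(shots, FilterRule(min_treecover=30.0))
print(len(kept_all), dict(why_all))
print(len(kept_forest), dict(why_forest))
```

Keeping the rejection tally makes it easy to see which criterion removes the most data before committing to a threshold.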

**The following are the key L4A variables we’ve identified for this lecture (centered on biomass and forest ecosystems) and the associated tutorial:**

Variable	Why it matters for forest ecosystems & biomass estimation
agbd (Above Ground Biomass Density)	Directly used to quantify carbon stocks, forest productivity, and ecosystem differences.
agbd_se (Standard error of AGBD)	Essential for filtering unreliable data, weighting estimates in aggregation, and comparing forests of different densities.
elev_lowestmode	Elevation at the ground return. Elevation influences forest type, species composition, and biomass potential. Allows linking biomass variation to topographic gradients.
l2_quality_flag	Indicates whether the underlying L2 waveform data (height, cover) are good quality.
l4_quality_flag	A primary filter for trustworthy data when mapping or analyzing forest biomass.
sensitivity	Describes how well GEDI detected the ground beneath the canopy.
shot_number	Enables traceability, merging across datasets, and identifying specific samples for validation or reprocessing.
land_cover_data/landsat_treecover	% tree canopy cover from Landsat within footprint. Helps distinguish forest vs. non-forest biomass, stratify analyses by canopy density, and study sparse vs. closed-canopy ecosystems.
lat_lowestmode, lon_lowestmode	Needed to map biomass spatially, link to ecosystems, overlay with forest inventories, and aggregate to regions.
selected_algorithm	Identifies which allometric/machine learning model was used to predict biomass (different models apply in different ecosystems). Important for understanding differences across biomes.
predict_stratum	Determines which model covariance/parameters were applied. Useful for ecosystem-specific comparisons.
land_cover_data/pft_class	Key for distinguishing biomass relationships among forest types.
land_cover_data/leaf_off_flag	Seasonal differences affect waveform structure, which can influence biomass estimates in deciduous forests.
degrade_flag	Important for excluding low-quality measurements.
solar_elevation	Relevant for seasonal/diurnal context in biomass estimation.
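
As a rough illustration of how the variables in this table combine into a quality filter, the sketch below screens a handful of footprint records. It is a minimal sketch, not the tutorial's filter: the thresholds `MIN_SENSITIVITY` and `MAX_RELATIVE_SE` are assumed values chosen for illustration, the records are made up, and plain Python dictionaries stand in for whatever table structure (for example a pandas DataFrame) an analysis would normally use.

```python
from typing import Dict, List

# Illustrative thresholds only (assumptions, not values prescribed by the lecture).
MIN_SENSITIVITY = 0.95   # assumed minimum beam sensitivity
MAX_RELATIVE_SE = 0.5    # assumed cap on agbd_se / agbd

Shot = Dict[str, float]


def is_reliable(shot: Shot) -> bool:
    """Return True when a footprint passes every illustrative quality check."""
    if shot["l4_quality_flag"] != 1:   # keep only shots flagged as good for biomass
        return False
    if shot["degrade_flag"] != 0:      # drop shots acquired in a degraded state
        return False
    if shot["sensitivity"] < MIN_SENSITIVITY:
        return False
    if shot["agbd"] <= 0:              # drops fill values (e.g. -9999) and zero biomass
        return False
    return shot["agbd_se"] / shot["agbd"] <= MAX_RELATIVE_SE


def filter_shots(shots: List[Shot]) -> List[Shot]:
    """Keep only the footprints that pass is_reliable."""
    return [s for s in shots if is_reliable(s)]


# Made-up records standing in for rows read from an L4A granule.
shots = [
    {"shot_number": 1, "agbd": 120.0, "agbd_se": 11.0,
     "l4_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.98},
    {"shot_number": 2, "agbd": 300.0, "agbd_se": 40.0,
     "l4_quality_flag": 0, "degrade_flag": 0, "sensitivity": 0.99},
    {"shot_number": 3, "agbd": 80.0, "agbd_se": 60.0,
     "l4_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.97},
    {"shot_number": 4, "agbd": -9999.0, "agbd_se": -9999.0,
     "l4_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.96},
    {"shot_number": 5, "agbd": 45.0, "agbd_se": 9.0,
     "l4_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.91},
]

kept = filter_shots(shots)
print([s["shot_number"] for s in kept])
```

Dropping zero-biomass footprints suits a forests-only question; an analysis that includes non-forest land would likely relax that check.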

The sensitivity histogram from the example AOI shows the typical distribution where most measurements cluster around optimal sensitivity values (~0-500), with some extreme values indicating poor ground detection. This distribution guides filtering thresholds: measurements with very negative sensitivity values often indicate failed ground detection and should be excluded.

Figure 4. The distribution of beam sensitivity values for the SPB_AOI_GEDI_L4A aoi. The histogram displays the frequency distribution of sensitivity measurements.

For analyses centered outside of forest ecosystems, the selection of relevant variables and their level of importance may differ. For example, in agricultural areas, `landsat_treecover` is less useful, since agriculture ≠ tree cover; `leaf_off_flag` would also not be very relevant, since crop seasonality differs from deciduous tree leaf-on/off. In mangrove ecosystems, `agbd_se` becomes key to filtering out uncertain coastal footprints, and `elev_lowestmode` is crucial because mangroves are tidal: elevation near sea level affects species distribution and flood regimes. By contrast, `pft_class` may not separate out mangroves, which can cause misclassification. Both of these examples may require additional external datasets to supplement the GEDI measurements.

Two complementary visualizations below help us understand which variables are most useful for GEDI analysis and why filtering decisions matter critically.

The filter retention heatmap (Figure 5) reveals the practical consequences of quality control decisions by showing what percentage of observations survive different filtering combinations. This analysis demonstrates that strict quality filters can dramatically reduce data availability. Some combinations retain only 15-47% of the original measurements. The heatmap shows that certain filters like “Only High Quality” or “Valid data & High Quality” preserve reasonable sample sizes (around 24-30% retention), while more restrictive combinations that add sensitivity thresholds or nighttime requirements can reduce datasets to unusable levels. Importantly, the differences between “All Beams,” “Power Beams Only,” and coverage beams reveal that beam type significantly affects data retention patterns, with power beams generally providing more robust measurements that survive quality filtering.

The Pearson’s correlation heatmap (Figure 6) complements this by revealing the statistical relationships between variables that justify these filtering strategies. The strong positive correlation (0.96) between agbd and agbd_se confirms that higher biomass areas inherently have greater prediction uncertainty, validating the use of uncertainty-based filtering. The beam-type differences show distinct correlation patterns between power and coverage beams, providing statistical evidence for why beam sensitivity matters in filtering decisions. Variables like `l2_quality_flag` and `sensitivity` show meaningful correlations with biomass estimates, supporting their use as primary filters.

Figure 5. Percentage of observations retained per filter for the SPB_AOI_GEDI_L4A aoi, comparing all beams (left panel) versus power beams only (right panel). The heatmap displays how different quality filters affect data retention across various GEDI variables, with colors ranging from dark (low retention) to bright yellow/white (high retention, approaching 100%). Each row represents a different filter criterion (such as quality flags, sensitivity thresholds, and beam characteristics), while columns show different GEDI variables including canopy height metrics (rh25, rh50, etc.), quality indicators, and forest structure parameters.

Figure 6. Pearson correlation heatmaps for the SPB_AOI_GEDI_L4A dataset, with separate matrices for all beams (left), power beams (center), and coverage beams (right). The correlation coefficients range from -1 to +1, visualized through a color scale where yellow indicates perfect positive correlation (1.0), dark blue/purple represents strong negative correlations, and intermediate colors show varying degrees of correlation strength.

Why Comparative Visualizations Matter:

Together, these figures answer two critical questions: which variables should we filter on (correlation patterns indicate relationships) and what does filtering cost us (retention rates show data loss). The correlation heatmap identifies variables that are statistically justified for filtering, while the retention heatmap quantifies the practical trade-offs of implementing those filters. This dual perspective prevents both under-filtering (which includes unreliable data) and over-filtering (which eliminates too much useful information), enabling researchers to make informed decisions about the optimal balance between data quality and sample size for their specific applications.
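
The "what does filtering cost us" question can be made concrete by counting how many footprints survive each filter combination. The following is a minimal sketch under stated assumptions: the filter names, the 0.95 sensitivity cutoff, and the five records are all hypothetical, and treating a negative `solar_elevation` as night-time is the only physical assumption used.

```python
from typing import Callable, Dict, List

Shot = Dict[str, float]

# Filter combinations to compare; names and thresholds are illustrative assumptions.
FILTERS: Dict[str, Callable[[Shot], bool]] = {
    "No filter": lambda s: True,
    "High quality": lambda s: s["l4_quality_flag"] == 1,
    "High quality + sensitivity": lambda s: (
        s["l4_quality_flag"] == 1 and s["sensitivity"] >= 0.95
    ),
    "High quality + sensitivity + night": lambda s: (
        s["l4_quality_flag"] == 1
        and s["sensitivity"] >= 0.95
        and s["solar_elevation"] < 0   # sun below the horizon
    ),
}


def retention_percent(shots: List[Shot],
                      filters: Dict[str, Callable[[Shot], bool]]) -> Dict[str, float]:
    """Percentage of shots kept by each named filter."""
    total = len(shots)
    if total == 0:
        return {name: 0.0 for name in filters}
    return {
        name: 100.0 * sum(1 for s in shots if keep(s)) / total
        for name, keep in filters.items()
    }


# Made-up footprints for illustration.
shots = [
    {"l4_quality_flag": 1, "sensitivity": 0.98, "solar_elevation": -10.0},
    {"l4_quality_flag": 1, "sensitivity": 0.96, "solar_elevation": 30.0},
    {"l4_quality_flag": 1, "sensitivity": 0.90, "solar_elevation": -5.0},
    {"l4_quality_flag": 0, "sensitivity": 0.99, "solar_elevation": -20.0},
    {"l4_quality_flag": 0, "sensitivity": 0.80, "solar_elevation": 10.0},
]

retention = retention_percent(shots, FILTERS)
for name, pct in retention.items():
    print(f"{name}: {pct:.0f}% retained")
```

Running the same function separately on power-beam and coverage-beam subsets would give the side-by-side comparison that the retention heatmap presents.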

Quick Start Recommendations:

For Forest Carbon Mapping:

Essential: agbd, agbd_se, l4_quality_flag, lat_lowestmode, lon_lowestmode
Supporting: landsat_treecover, pft_class

For Change Detection Studies:

Essential: agbd, agbd_se, l4_quality_flag, coordinates, shot_number
Supporting: delta_time, beam (for temporal analysis)

For Uncertainty Analysis:

Essential: agbd, agbd_se, agbd_pi_lower, agbd_pi_upper
Supporting: l4_quality_flag, land cover variables
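
One common way to use `agbd_se` when aggregating footprints is inverse-variance weighting, where more certain footprints count for more. The sketch below is an assumption-laden illustration, not the method used to build L4B: it treats footprint errors as independent, and the function name and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_mean(agbd: Sequence[float],
                          agbd_se: Sequence[float]) -> Tuple[float, float]:
    """Weighted mean of AGBD with weights 1 / se^2, plus its standard error.

    Assumes independent footprint errors, which is a simplification.
    """
    if len(agbd) != len(agbd_se) or len(agbd) == 0:
        raise ValueError("agbd and agbd_se must be non-empty and equal in length")
    if any(se <= 0 for se in agbd_se):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in agbd_se]
    total_weight = sum(weights)
    mean = sum(w * x for w, x in zip(weights, agbd)) / total_weight
    return mean, math.sqrt(1.0 / total_weight)


# Two made-up footprints: the second is twice as uncertain, so it counts for less.
mean_agbd, mean_se = inverse_variance_mean([100.0, 200.0], [10.0, 20.0])
print(f"weighted mean = {mean_agbd:.1f} Mg/ha, SE = {mean_se:.2f} Mg/ha")
```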

Knowledge Check #13
*What variable tells you about estimated uncertainty?*

GEDI L4A Data Analysis:

When we compare tree cover with above ground biomass density (AGBD), we begin to see why visualizations are so critical for understanding forest carbon. The maps show AGBD on the left and Landsat-derived tree cover on the right for the example study area. The AGBD map highlights where carbon is most densely stored, with values reaching over 350 Mg/ha in some areas, while the tree cover map shows the proportion of canopy closure up to 100%. The bottom panels focus on the highest biomass areas (≥ 242.2 Mg/ha) and highest tree cover areas (≥ 75%), revealing interesting spatial patterns where data is available. The scatter plot demonstrates the complexity of their relationship, showing a weak positive correlation (r = 0.237) across 1,056 measurements. While there’s a general trend of increasing biomass with tree cover, the relationship is highly variable. Dense canopy does not always guarantee high biomass, as forests may appear intact but still have lower biomass due to age, disturbance, or degradation. Identifying where tree cover and biomass align, and where they diverge, is essential for carbon accounting. This allows us to pinpoint true carbon “hotspots” and avoid overestimates that might come from relying on tree cover alone.

Figure 7. Comprehensive tree cover versus biomass analysis for the SPB_AOI_GEDI_L4A dataset. The top panels show spatial distributions of above ground biomass density and Landsat tree cover across the study area. The middle panels highlight high-value areas: high biomass areas and high tree cover areas. The bottom scatter plot reveals the relationship between Landsat tree cover (x-axis) and above ground biomass density (y-axis) for all measurements. Depending on the AOI selected in the tutorial and the tree cover in that region, not all charts may be generated.
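
The correlation reported for tree cover versus AGBD is a Pearson coefficient, which can be computed directly from paired footprint values. The sketch below writes the formula out in plain Python so each term is visible; the five value pairs are hypothetical and do not come from the example AOI.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired values: Landsat tree cover (%) and AGBD (Mg/ha).
tree_cover = [20.0, 45.0, 60.0, 80.0, 95.0]
agbd = [15.0, 140.0, 60.0, 210.0, 120.0]

r = pearson_r(tree_cover, agbd)
print(f"r = {r:.3f}")
```

A value near zero, like the weak correlation described above, signals that tree cover alone explains little of the variation in biomass.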

In addition to canopy-biomass relationships, GEDI data also allow us to explore the functional makeup of vegetation within a region. Figure 8 below shows the distribution of Plant Functional Types (PFTs) in the SPB area of interest, with a dominant peak for deciduous broadleaf trees. PFT charts like this give us insight into biodiversity and ecosystem function, highlighting which types of plants dominate the landscape.

For instance, an ecosystem dominated by evergreen needleleaf species will store carbon and cycle nutrients differently than one dominated by deciduous broadleaf trees. By tracking PFT distributions over time, researchers can detect shifts in community composition, such as the encroachment of shrubs into former forested land, which can signal ecological change or land-use pressure. This makes PFT distribution a powerful tool for ecosystem monitoring and conservation planning.

Figure 8. Distribution of Plant Functional Types (PFT) for the GEDI L4A SPB_AOI_GEDI_L4A aoi. Different AOIs may contain different PFT classes.

Seasonal variation provides yet another layer of understanding. By comparing biomass density under leaf-on versus leaf-off conditions, we can assess how forest phenology influences biomass estimates. In deciduous forests, leaf-on periods typically register higher biomass because foliage contributes to canopy structure and light interception, whereas leaf-off conditions result in lower biomass estimates. If we only relied on data collected during the leaf-off season, we might systematically underestimate carbon storage in these ecosystems. Seasonal comparisons are therefore key for monitoring forest health, disturbance recovery, and ensuring accurate integration into carbon cycle models. Even when data show only one condition, such as where no leaf-off measurements are present, the absence itself highlights the importance of carefully considering timing in data collection.

Together, these chart types for tree cover versus biomass, plant functional type distributions, and seasonal biomass comparisons demonstrate how GEDI visualizations extend far beyond raw measurements. They provide context, reveal relationships, and support applied discussions in carbon accounting, biodiversity monitoring, and forest management. By interpreting these visualizations side by side, we can move from static datasets to actionable ecological insight.

Another important visualization produced by this workflow is the leaf phenology chart, which compares above ground biomass density under leaf-on and leaf-off conditions. Figure 9 presents mean biomass for each seasonal state as a pair of bars, with error bars indicating statistical uncertainty. The green bar represents the leaf-on condition, while the golden bar represents the leaf-off state. In most deciduous forest areas, the leaf-on condition typically registers higher AGBD values because canopy foliage contributes to biomass structure and modifies the laser return signals. By contrast, leaf-off conditions often result in lower biomass detection, since much of the canopy surface area is missing.

Figure 9. AGBD (Above Ground Biomass Density) differences between leaf-on and leaf-off conditions for the SPB_AOI_GEDI_L4A AOI. Other study areas may also contain leaf-off observations.

The real value of this seasonal comparison lies in what it tells us about ecosystem dynamics. If the two bars are close together, it may suggest an evergreen-dominated ecosystem where biomass signals are relatively stable year-round. If there is a sharp contrast, we are likely observing deciduous systems where seasonal leaf loss has a measurable effect on biomass estimates. This distinction is critical for both scientific and applied reasons: failing to account for phenology could lead to systematic underestimation of carbon stocks in deciduous forests if only leaf-off data are used. Conversely, observing these seasonal differences provides an opportunity to monitor ecosystem resilience, recovery after disturbance, or climate-driven changes in leaf cycles. In some areas of interest, the chart may only display one bar—for example, only leaf-on observations. This absence is just as informative: it highlights data gaps that need to be considered when interpreting biomass estimates.
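
The two bars and their error bars can be reproduced by grouping footprints on `leaf_off_flag` and summarizing each group. This is a minimal sketch assuming that a flag of 0 means leaf-on and 1 means leaf-off, and that the error bars represent the standard error of the mean; the tutorial's chart may define them differently, and the records are made up.

```python
import math
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def summarize_by_leaf_state(
    records: Iterable[Tuple[int, float]]
) -> Dict[int, Dict[str, Optional[float]]]:
    """Group (leaf_off_flag, agbd) pairs and return n, mean, and SE per group."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for leaf_off_flag, agbd in records:
        groups[leaf_off_flag].append(agbd)

    summary: Dict[int, Dict[str, Optional[float]]] = {}
    for flag, values in groups.items():
        n = len(values)
        # The standard error needs at least two values; otherwise leave it undefined.
        se = statistics.stdev(values) / math.sqrt(n) if n > 1 else None
        summary[flag] = {"n": n, "mean": statistics.mean(values), "se": se}
    return summary


# Made-up footprints: flag 0 = leaf-on, flag 1 = leaf-off (assumed coding).
records = [(0, 100.0), (0, 120.0), (0, 140.0), (1, 80.0), (1, 100.0)]

summary = summarize_by_leaf_state(records)
for flag, stats in sorted(summary.items()):
    label = "leaf-off" if flag == 1 else "leaf-on"
    print(label, stats)
```

If an AOI has no leaf-off footprints, the returned dictionary simply has no entry for flag 1, which corresponds to the single-bar case described above.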

Beyond vegetation itself, terrain is another key factor that shapes biomass distribution and influences GEDI measurements. The elevation distribution chart, illustrated in the figure below, shows the frequency of measurements across different elevation bands within an AOI. Figure 10 provides a quick view of whether the dataset is concentrated in lowlands, uplands, or spans a wide elevational gradient. Mean and standard deviation lines are overlaid to highlight central tendencies and variability. Elevation context is vital because topography affects forest structure, species composition, and ultimately biomass density. For example, higher elevations may have stunted growth or different species mixes compared to valleys, influencing both carbon storage and biodiversity. In addition, elevation can introduce measurement challenges—slopes, beam angles, and terrain relief can all bias biomass retrievals if not accounted for. Including elevation distributions ensures that analyses consider these physical gradients and avoid misinterpreting patterns that are actually terrain-driven.

Figure 10. Elevation distribution for the SPB_AOI_GEDI_L4A aoi. The histogram displays the frequency of GEDI measurements across different elevation values.

Taken together, these chart types (tree cover versus biomass density, plant functional type distributions, seasonal leaf phenology comparisons, and elevation distributions) demonstrate how GEDI visualizations extend far beyond raw measurements. They provide context, reveal relationships, and support applied discussions in carbon accounting, biodiversity monitoring, and forest management.

Knowledge Check #14
*True or False: A dense canopy always means high biomass.*

GEDI L4B Data Analysis: Understanding Global Forest Biomass Patterns

The dataset encompasses multiple complementary layers that together paint a complete picture of forest carbon storage and its associated uncertainties. The Mean Unbiased (MU) layer serves as the primary biomass estimate, representing the expected above ground biomass density in megagrams per hectare for each grid cell. Accompanying this central measurement, the Standard Error (SE) layer provides crucial uncertainty information, allowing researchers to understand the reliability of estimates in different regions. The Quality Flag (QF) layer offers additional metadata about data processing conditions and potential limitations, while supplementary layers including Number of Samples (NS), Percent Error (PE), and various processing indicators provide deeper insights into how the gridded product was constructed from the original laser measurements.

Essential datasets from L4B:

File Name	Description
<>_MU.tif	Mean above ground biomass density (MU): estimated mean AGBD for the 1 km grid cell, including forest and non-forest (Mg ha⁻¹)
<>_SE.tif	Standard error (SE): standard error of the mean estimate, combining sampling and modeling uncertainty (Mg ha⁻¹)
<>_QF.tif	Quality flag (QF): 0 = Outside GEDI domain; 1 = Land surface; 2 = Land surface and meets GEDI mission L1 requirement (Percent SE <20% or SE <20 Mg ha⁻¹)

Advanced layers for specialized applications:

File Name	Description
<>_V1.tif	Variance component 1 (V1): uncertainty in the estimate of mean biomass due to the field-to-GEDI model used in L4A.
<>_V2.tif	Variance component 2 (V2): If Mode of Inference = 1 → uncertainty due to GEDI’s sampling of the 1 km cell. If Mode of Inference = 2 → uncertainty from the wall-to-wall model calibrated with L4A footprints.
<>_PE.tif	Percent error (PE): standard error as a fraction of the estimated mean AGBD. Values > 100% truncated to 100 (%).
<>_NC.tif	Number of clusters (NC): number of unique GEDI ground tracks with at least one high-quality waveform intersecting the cell.
<>_NS.tif	Number of samples (NS): total number of high-quality waveforms across all ground tracks within the cell.
<>_PS.tif	Prediction stratum (PS): determined by plant functional type and continent; links to L4A model parameter covariance matrix.
<>_MI.tif	Mode of inference (MI): method used for the cell; 0 = None applied; 1 = Hybrid Model-Based; 2 = Generalized Hierarchical Model-Based.
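
The relationship between the MU, SE, PE, and QF layers can be checked per cell with a few lines of arithmetic. The sketch below follows the definitions in the tables above (PE as SE over MU in percent, capped at 100, and the L1 requirement as percent SE below 20% or SE below 20 Mg ha⁻¹). The function names and cell values are hypothetical, and reading the actual GeoTIFFs (for example with a raster library) is deliberately left out.

```python
from typing import List, Optional, Tuple


def percent_error(mu: float, se: float) -> Optional[float]:
    """Standard error as a percentage of mean AGBD, truncated to 100."""
    if mu <= 0:
        return None   # undefined when the mean is zero or a negative fill value
    return min(100.0 * se / mu, 100.0)


def meets_l1_requirement(mu: float, se: float) -> bool:
    """True when percent SE < 20% or SE < 20 Mg/ha, as described for QF = 2."""
    pe = percent_error(mu, se)
    return se < 20.0 or (pe is not None and pe < 20.0)


# Made-up (MU, SE) pairs in Mg/ha for four grid cells.
cells: List[Tuple[float, float]] = [
    (150.0, 15.0),
    (50.0, 25.0),
    (30.0, 12.0),
    (10.0, 30.0),
]

results = [(percent_error(mu, se), meets_l1_requirement(mu, se)) for mu, se in cells]
for (mu, se), (pe, ok) in zip(cells, results):
    print(f"MU={mu:6.1f}  SE={se:5.1f}  PE={pe:5.1f}%  meets L1: {ok}")
```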

Revealing Global Biomass Patterns Through Visualization

When we examine the latitudinal distribution of mean unbiased biomass density across the globe, a striking and ecologically meaningful pattern emerges that fundamentally reflects Earth’s climate zones and vegetation distributions. Figure 11 reveals that tropical latitudes, particularly the zone between approximately 10 degrees north and 10 degrees south of the equator, consistently exhibit the highest biomass densities. This tropical biomass peak is not merely a statistical artifact but represents one of the most important carbon storage regions on our planet, where dense tropical rainforests of the Amazon, Congo Basin, and Southeast Asian archipelagos concentrate vast amounts of carbon in their towering canopies and complex forest structures.

Figure 11. This figure displays the GEDI L4B Mean Unbiased Biomass (MU) summary by latitude, showing the global distribution of above ground biomass density (AGBD) in Mg/ha across different latitudinal zones.

Moving away from the equator, the data tells a compelling story of how climate constraints shape forest biomass accumulation. The mid-latitude regions, spanning roughly from 20 to 40 degrees in both hemispheres, show markedly lower biomass densities that reflect the transition from dense tropical forests to more open woodland systems, temperate deciduous forests, and agricultural landscapes. These regions, while still supporting significant forest cover in many areas, demonstrate how seasonal climate variations, different precipitation patterns, and human land use activities create more heterogeneous biomass distributions compared to the consistently high values found in tropical zones.

Perhaps most striking in the latitudinal analysis is the dramatic decline in biomass density toward high latitudes, around 50 degrees and beyond, where cold temperatures and short growing seasons fundamentally constrain forest productivity. Here, the boreal forests that ring the northern continents, while extensive in area, support much lower biomass densities due to slower growth rates, smaller tree sizes, and the dominance of coniferous species adapted to harsh winter conditions. Beyond the tree line, tundra ecosystems contribute minimal biomass to the global carbon budget. Because GEDI operates from the International Space Station, its coverage extends only to about 51.6 degrees north and south, so the profile captures just the southern edge of the boreal zone, and higher latitudes rely on complementary missions such as ICESat-2.
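
A latitudinal summary like the one described above amounts to binning grid cells by latitude and averaging the biomass in each band. The following is a minimal sketch, assuming 10-degree bands and a simple unweighted mean; it ignores the way cell area changes with latitude, and the cell values are invented for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def mean_agbd_by_latitude(
    cells: Iterable[Tuple[float, Optional[float]]],
    band_width: float = 10.0,
) -> Dict[float, float]:
    """Average AGBD per latitude band, keyed by the band's southern edge."""
    bands: Dict[float, List[float]] = defaultdict(list)
    for lat, agbd in cells:
        if agbd is None or math.isnan(agbd):
            continue   # skip cells without a valid estimate
        band_start = math.floor(lat / band_width) * band_width
        bands[band_start].append(agbd)
    return {start: sum(v) / len(v) for start, v in sorted(bands.items())}


# Invented (latitude, AGBD in Mg/ha) pairs for illustration.
cells = [
    (-5.0, 300.0), (5.0, 280.0), (3.0, 320.0),
    (25.0, 90.0), (35.0, 110.0), (48.0, 60.0),
    (12.0, None),
]

profile = mean_agbd_by_latitude(cells)
for start, mean in profile.items():
    print(f"{start:+.0f} to {start + 10:+.0f} deg: {mean:.1f} Mg/ha")
```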

Spatial Analysis and Regional Focusing

Data Processing and Quality Considerations

The GEDI L4B workflow incorporates sophisticated data quality management through multiple complementary layers that provide essential context for interpreting biomass estimates. The Standard Error (SE) layer provides quantitative estimates of uncertainty that vary spatially based on factors such as terrain complexity, vegetation structure, and the density of underlying GEDI footprint measurements. Regions with steep topography or sparse GEDI coverage naturally exhibit higher uncertainty, while areas with dense measurement coverage and relatively simple terrain show more reliable estimates. Figures 12 and 13 visualize mean unbiased biomass density and the corresponding standard error at a national scale and for a smaller AOI.

The Quality Flag (QF) layer adds another dimension to uncertainty assessment by identifying pixels that may be affected by specific processing limitations or environmental conditions. Understanding these quality indicators is crucial for proper interpretation of biomass patterns, as apparent spatial variations in biomass density might sometimes reflect differences in measurement quality rather than actual ecological patterns. This emphasis on uncertainty quantification represents a significant advancement over earlier biomass mapping efforts and reflects the growing recognition that policy-relevant carbon assessments must include realistic estimates of measurement precision.

Figure 12. National scale example (over Belize) of GEDI L4B biomass, showing two complementary maps. The left panel displays Mean Unbiased (MU) biomass density in Mg/ha, the right panel shows the corresponding Standard Error (SE) values in Mg/ha.

Figure 13. SPB_AOI_GEDI_L4A aoi example of GEDI L4B biomass, showing two complementary maps. The left panel displays Mean Unbiased (MU) biomass density in Mg/ha, the right panel shows the corresponding Standard Error (SE) values in Mg/ha.

Implications for Carbon Cycle Science and Policy

The global biomass patterns revealed through GEDI L4B analysis carry profound implications for understanding Earth’s carbon cycle and informing climate policy decisions. The concentration of biomass in tropical regions highlighted by the latitudinal analysis underscores why tropical deforestation has such outsized impacts on global carbon emissions and why international efforts like REDD+ (Reducing Emissions from Deforestation and forest Degradation) focus heavily on tropical forest conservation.

The detailed spatial resolution of GEDI L4B enables more precise carbon accounting than was previously possible, supporting improved greenhouse gas inventories and more accurate projections of how land use changes might affect atmospheric carbon concentrations. For forest managers and conservation planners, the dataset provides an unprecedented baseline for assessing current carbon stocks and monitoring changes over time. The ability to examine biomass distributions at the landscape scale supports more strategic placement of conservation efforts and more informed decisions about sustainable forest management practices.

Looking forward, the temporal dimension that will emerge as GEDI continues its mission will enable researchers to directly observe changes in forest biomass over time, providing crucial validation data for ecosystem models and early warning systems for forest degradation. This combination of spatial detail, global coverage, and temporal monitoring capability positions GEDI L4B as a transformative dataset for carbon cycle science and environmental policy in the coming decades.

Dataset Name	Description
GEDI L4B Country-Level Summaries of Above Ground Biomass	Country-level aggregated AGBD, total AGB stocks, and standard errors from GEDI L4B.
Pantropical Forest Height and Biomass (GEDI + TanDEM-X Fusion)	25 m & 100 m canopy height and biomass maps across pantropical regions (Amazon, Gabon, Mexico, French Guiana), including uncertainty and disturbance layers.
Gridded GEDI Vegetation Structure Metrics & Biomass Density (Multi-Resolution)	Global gridded vegetation structure and biomass metrics (1 km, 6 km, 12 km), with summary statistics (mean, median, SE, quartiles, Shannon index, etc.).
GEDI L3 Gridded Land Surface Metrics (V2)	Canopy height and surface structure metrics derived from GEDI L2 data in a gridded product.
GEDI–FIA Fusion: Training Lidar Models to Estimate Forest Attributes	Fusion of GEDI lidar footprints with U.S. Forest Inventory and Analysis (FIA) plots for model calibration and validation.
GEDI L4C Footprint-Level Structural Complexity Index (V2)	Provides waveform-based indices of canopy structural complexity at the footprint level.
Global Forest Above Ground Carbon Stocks and Fluxes (GEDI + ICESat-2, 2018–2021)	Gridded products of forest carbon stocks and fluxes combining GEDI and ICESat-2.
Above Ground Biomass Density for High-Latitude Forests (ICESat-2, 2020)	Biomass estimates for boreal and Arctic forests, complementing GEDI’s lower-latitude coverage.

Take-Home Messages

GEDI provides a global, consistent lidar biomass dataset.

Unlike optical or radar sensors, GEDI’s laser pulses directly capture forest height and structure, which are essential for calculating above ground biomass. This global consistency means researchers, policymakers, and conservationists can now compare biomass across ecosystems and continents using the same reference dataset.

L4 products enable scaling from local to global applications.

The GEDI L4A and L4B products complement one another: L4A provides detailed, footprint-level (~25 m) biomass estimates suitable for plot-scale studies, calibration with field data, and fine-scale ecological analyses. L4B aggregates these measurements into 1 km gridded maps, enabling regional-to-global biomass assessments, national carbon inventories, and integration into climate models. Together, they allow users to zoom seamlessly from a single forest stand to global carbon accounting frameworks.

Always interpret with uncertainties + limitations.

Every biomass estimate comes with uncertainty — GEDI provides standard errors, prediction intervals, and quality flags precisely so users know which data are reliable.

Essential stepping stone toward integrated biomass monitoring.

GEDI alone does not provide wall-to-wall coverage, but it anchors global biomass monitoring by delivering calibrated, lidar-based measurements at scale. In combination with other satellite missions (such as ICESat-2, TanDEM-X, Landsat, and Sentinel), one can begin to build a comprehensive understanding of biomass.

Knowledge Check #15
*What makes GEDI unique compared to optical or radar-based biomass datasets?*

SERVIR Carbon Pilot (S-CAP): Global Biomass Product Investigation

Global biomass monitoring isn’t just about one dataset; it’s about comparing and combining many. GEDI plays a critical role in anchoring these ensemble approaches to ground-truth-like lidar observations.

The SERVIR Carbon Pilot (S-CAP) was a USAID & NASA initiative that provides an accessible platform for comparing, analyzing, and applying multiple global, regional, and national above ground biomass (AGB) and land-cover datasets in order to support national decision-making on carbon monitoring. Through its web-based interface, S-CAP bundles an ensemble of biomass products, harmonizes them to common formats, and enables users to calculate carbon stocks and emissions using transparent and IPCC-compatible methods.

The central aim of the pilot is to understand how and why global biomass products disagree, to provide guidance on dataset selection or integration, and to improve reproducibility and usability for reporting frameworks such as REDD+ (SERVIR, 2024; Cherrington et al., 2024).

Source: S-CAP Workflow. SERVIR, 2024.

A key finding of the S-CAP ensemble analysis is the large spread among biomass maps in terms of both forest area and carbon stock estimates. These differences are driven by inconsistent forest definitions, input data sources (field plots, airborne lidar, remote sensing inputs), and the methodological choices that underpin each map. Importantly, this disagreement has real implications: national-level carbon stock and emissions estimates can shift dramatically depending on which product is used, potentially altering policy-relevant outcomes. S-CAP therefore emphasizes using multiple datasets as an ensemble to represent uncertainty rather than relying on a single “best” product. Where possible, national teams are encouraged to incorporate local or national maps and inventories, which have been shown to improve estimates and reduce uncertainty (Cherrington et al., 2024).

Source: Cherrington et al., 2024.

GEDI’s biomass products are particularly important in this context. The footprint-level GEDI L4A product (~25 m) provides highly detailed, sample-based biomass estimates that are useful for calibration, validation, and diagnosing systematic biases in global maps. The gridded GEDI L4B product (1 km) is directly comparable to other global biomass layers and is often used in S-CAP’s ensemble comparisons and country-level summaries. Together, these datasets offer both fine-scale diagnostics and large-scale compatibility, reinforcing S-CAP’s goal of transparent and scalable biomass accounting. However, GEDI’s sampling design must be accounted for carefully; because GEDI does not provide wall-to-wall coverage, S-CAP workflows recommend stratified sampling and appropriate upscaling to avoid biased national comparisons.

Ultimately, S-CAP demonstrates that ensemble approaches, harmonized definitions, and transparent uncertainty reporting are essential for global biomass monitoring. GEDI plays a dual role in this system: L4A provides the ground-truth-like validation backbone, while L4B offers a standardized, gridded layer for integration with other global maps. By situating GEDI within this ensemble framework, S-CAP underscores how sampling missions can anchor and improve wall-to-wall biomass estimates, helping countries make better-informed decisions for climate reporting, land management, and carbon accounting (SERVIR, 2024; Cherrington et al., 2024).

S-CAP Application Example:

**Estimating Historic Carbon Emissions caused by Gold Mining Development throughout Southwest Ghana and the Southern Peruvian Amazon using S-CAP (Evans et al., submitted)**

This study demonstrates how global biomass products, including GEDI’s L4B dataset, can be integrated into operational monitoring of land-use change and associated carbon emissions, with a focus on mining-driven deforestation in Ghana and Peru. The analysis uses high-resolution land-cover change maps derived from SERVIR services as a spatial foundation, and then overlays biomass data from multiple global products—Xu et al., CCI-Biomass, and GEDI L4B—to estimate above ground biomass loss and resulting carbon emissions. This directly mirrors the ensemble-style workflow promoted by the SERVIR Carbon Pilot, where multiple biomass products are compared, harmonized, and applied to policy-relevant questions.

Source: Regional Application Example Workflow (Evans et al., 2025).
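
The arithmetic behind turning biomass loss into emissions can be sketched in a few lines. This is an illustrative calculation, not the study's workflow: it assumes all above ground biomass in the cleared area is emitted, uses the IPCC default carbon fraction of 0.47 and the 44/12 ratio to convert carbon to CO2, and the product labels, AGBD values, and cleared area are hypothetical.

```python
from typing import Dict

CARBON_FRACTION = 0.47     # IPCC default carbon fraction of dry biomass
CO2_PER_C = 44.0 / 12.0    # molecular weight ratio of CO2 to carbon


def committed_emissions_mg_co2(agbd_mg_ha: float,
                               cleared_area_ha: float,
                               carbon_fraction: float = CARBON_FRACTION) -> float:
    """Emissions (Mg CO2) if all above ground biomass in the cleared area is lost."""
    agb_loss_mg = agbd_mg_ha * cleared_area_ha
    return agb_loss_mg * carbon_fraction * CO2_PER_C


# Hypothetical mean AGBD (Mg/ha) that three products report over one mining footprint.
product_agbd: Dict[str, float] = {
    "Product A (100 m)": 140.0,
    "Product B (1 km)": 110.0,
    "Product C (~10 km)": 85.0,
}
cleared_area_ha = 250.0    # hypothetical mapped mining area

emissions = {
    name: committed_emissions_mg_co2(agbd, cleared_area_ha)
    for name, agbd in product_agbd.items()
}
for name, value in emissions.items():
    print(f"{name}: {value:,.0f} Mg CO2")
```

Because the cleared area is fixed, the spread in the results comes entirely from the biomass product, which is the kind of disagreement the ensemble approach is meant to expose.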

A central finding is that biomass loss and associated emissions estimates vary substantially depending on the product used, highlighting the importance of both resolution and data source. GEDI L4B (1 km) often provides intermediate estimates between the finer-resolution CCI-Biomass (100 m) and coarser Xu (~10 km) products, demonstrating both the tradeoffs and complementarity of these datasets. The figure below is particularly powerful for teaching: side-by-side panels show how biomass maps at different resolutions align with mapped mining footprints, clearly illustrating why product choice matters for local-to-regional scale monitoring. Tables 2 and 3 extend this point quantitatively, presenting annual emissions estimates for Ghana and Peru that diverge markedly across products, reinforcing S-CAP’s emphasis on ensemble approaches and explicit uncertainty reporting.

Source: Example of data resolution differences between the AGB source products and mining footprint data, Western Region, Ghana. The CCI pixels are 100 m x 100 m, GEDI pixels are 1 km x 1 km, and the Xu et al. (2021) pixel is 0.1° x 0.1° (approximately 10 km x 12 km). This figure shows the extent of one Xu et al. (2021) pixel (Evans et al., 2025).
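
One simple way to report the kind of product-to-product divergence described above is to summarize the ensemble with its mean and range. The sketch below is an assumption-laden illustration: the function name `ensemble_summary`, the product labels, and the numbers are placeholders, not results from the study or from any real dataset.

```python
# Minimal sketch: summarizing the spread of emissions estimates across products.
# Product labels and values are placeholders for illustration only.
from statistics import mean


def ensemble_summary(estimates):
    """Summarize a dict of {product_name: estimate} as mean, min, max, and
    relative range ((max - min) / mean), a crude indicator of disagreement."""
    values = list(estimates.values())
    center = mean(values)
    low, high = min(values), max(values)
    relative_range = (high - low) / center if center else float("nan")
    return {"mean": center, "min": low, "max": high,
            "relative_range": relative_range}


# Hypothetical annual emissions (same units) from three biomass products.
toy_estimates = {"product_a": 150.0, "product_b": 120.0, "product_c": 90.0}
summary = ensemble_summary(toy_estimates)
print(summary)
```

Reporting the range alongside the mean keeps the disagreement between products visible rather than hiding it behind a single number, which is the spirit of the explicit uncertainty reporting that S-CAP emphasizes.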
