Vegetation Structural Insights From GEDI

In this section:

Descriptions of the available canopy cover, plant area index, foliage height diversity, plant area volume density, and waveform structural complexity index datasets.
Selected review of example analysis-ready and/or decision-making applications of the vegetation structure data.
Suggested considerations for pre-processing and using the vegetation structure data.

L2B Canopy Directional Gap Probability

All of the vertical profile metrics are derived from the directional gap probability (Pgap), which is the probability that a laser pulse passes through the canopy without hitting any leaves, branches, or stems. In other words, it is the chance that the laser “sees sky” when looking through the canopy along a given angle (for GEDI, a view angle of < 6°). It describes how “open” the canopy is from GEDI’s viewpoint.

Source: GEDI L2B algorithm workflow showing how the vegetation structure metrics are derived from each other, from other GEDI inputs, and from vegetation constants (Tang et al., 2019).

This probability is derived from the relative amount of laser energy that makes it through the canopy and reflects off the ground, compared to the total incoming energy. A high Pgap means many open spaces between crowns (more ground returns), while a low Pgap means a dense canopy (few ground returns). The gap probability is also measured along the vertical profile at certain heights, giving more perspective on whether there are leaves or other vegetation present at that height.
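
The energy ratio described above can be sketched in a few lines. This is a minimal illustration, not the operational L2B code: it uses a formulation common in the waveform lidar literature, in which the ground energy is scaled by the ratio of canopy to ground backscattering coefficients. The function name and the default `rho_ratio=1.5` are assumptions chosen for illustration.

```python
def total_gap_probability(canopy_energy, ground_energy, rho_ratio=1.5):
    """Estimate total directional gap probability from waveform energy.

    canopy_energy: integrated return energy from the canopy (Rv)
    ground_energy: integrated return energy from the ground (Rg)
    rho_ratio: ASSUMED ratio of canopy to ground volumetric backscattering
        coefficients (rho_v / rho_g); 1.5 is a placeholder value
    """
    if canopy_energy < 0 or ground_energy < 0:
        raise ValueError("energies must be non-negative")
    scaled_ground = rho_ratio * ground_energy
    total = canopy_energy + scaled_ground
    if total == 0:
        raise ValueError("waveform has no signal energy")
    # Equivalent to 1 - Rv / (Rv + rho_ratio * Rg)
    return scaled_ground / total


# More ground energy relative to canopy energy gives a higher Pgap.
open_canopy = total_gap_probability(canopy_energy=1.0, ground_energy=4.0)
dense_canopy = total_gap_probability(canopy_energy=4.0, ground_energy=0.5)
```

Because the result depends on the assumed backscatter ratio, a different constant shifts every Pgap value, which is one reason the global constant matters.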

Several assumptions are made when extrapolating vegetation structure with GEDI.

For example, the algorithm assumes that tree leaves and branches are randomly arranged (leaf projection coefficient) and that tree crowns aren’t clumped together in unusual patterns (set clumping index), since GEDI cannot independently retrieve clumping at the footprint scale. It also uses a constant global value for the canopy and ground volumetric backscattering coefficients.

Table 1. Both variables are found in the BEAMXXXX/ main group as shown in the L2B Data Dictionary. The Pgap along the profile is found as `pgap_theta_z`, where z is the height above the ground.

Variable	Name	Definition
`pgap_theta`	Total Pgap(theta)	Estimated Pgap(theta) for the selected L2A algorithm
`pgap_theta_z`	Directional gap probability profile	Directional gap probability profile (pgap_theta_z = DN / 10000)
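
The scaling in the table can be applied as follows. This is a small sketch that assumes the stored profile has already been read into a plain sequence of integers; the function name is hypothetical, and only the `DN / 10000` factor comes from the table.

```python
PGAP_SCALE = 10000  # from the table: pgap_theta_z = DN / 10000


def decode_pgap_profile(dn_values):
    """Convert stored digital numbers (DN) to gap probabilities."""
    return [dn / PGAP_SCALE for dn in dn_values]


# Hypothetical DN values, ordered along the profile.
profile = decode_pgap_profile([10000, 8200, 5000, 3100])
```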

Advanced researchers looking to reconstruct the vegetation structure metrics may choose to use this variable. The background is given here to provide a brief overview of the theory and assumptions made for the resulting metrics derived from probability gap calculations to help understand their advantages and limitations.

L2B Canopy Cover

Canopy cover is a way of measuring how much of the ground is shaded by the tops of trees when looking straight down from above. It only counts the leaves, branches, and stems of the main canopy layer, not the gaps between tree crowns or the smaller plants growing underneath.

More formally, the canopy cover metric describes the proportion of the ground surface that is overlain by canopy material when vertically projected from above. It is restricted to foliage, branches, and stems, and excludes both within-crown and between-crown gaps as well as vegetation under the canopy. GEDI’s L2B footprint-level canopy cover metric provides estimates of canopy cover fraction and vertical canopy structure. GEDI’s near-nadir viewing angle (< 6°) provides canopy fractional cover, which differs conceptually from canopy closure or crown cover. The derivation relates the directional canopy gap probability (Pgap) to vertical canopy structure. Canopy cover is estimated as the complement of the gap fraction in the vertical direction, that is, 1 - Pgap.

The retrieval method leverages Beer’s Law, which describes how vegetation elements attenuate laser pulses according to their projected area and distribution. GEDI waveforms record the distribution of laser energy returns from both canopy and ground surfaces, enabling canopy cover to be computed as a ratio of canopy to total returns. Additionally, the algorithms assume that only trees taller than about 4 meters count as canopy, so that it is the overstory that contributes to the canopy fraction.
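
To make the Beer’s Law relationship concrete, the sketch below runs it forward: from a plant area index to a gap probability, and then to canopy cover as 1 - Pgap. It is a simplified illustration rather than the L2B implementation. The function name is hypothetical, and the defaults `g=0.5` (projection coefficient for randomly oriented leaves) and `omega=1.0` (no clumping) are assumptions.

```python
import math


def cover_from_pai(pai, g=0.5, omega=1.0, zenith_deg=0.0):
    """Forward Beer's Law sketch: PAI -> gap probability -> canopy cover.

    g: ASSUMED leaf projection coefficient
    omega: ASSUMED clumping index (1.0 means no clumping)
    zenith_deg: view zenith angle in degrees (0 means looking straight down)
    """
    if pai < 0:
        raise ValueError("pai must be non-negative")
    cos_theta = math.cos(math.radians(zenith_deg))
    pgap = math.exp(-g * omega * pai / cos_theta)
    return 1.0 - pgap


# Cover saturates as PAI grows: each added unit of plant area closes
# fewer of the remaining gaps.
covers = [cover_from_pai(pai) for pai in (0.0, 1.0, 2.0, 4.0, 8.0)]
```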

As with other GEDI metrics, canopy cover estimations can be influenced by observation conditions from the instrument, environment, surface, or interpretation algorithms.

At 25 m, the footprint is large enough to capture the entire canopies of larger trees and to include enough gaps for ground detection. Smaller gaps within crowns are more likely to be recorded in the waveform than with airborne lidar, where gaps smaller than the footprint size may lead to overestimation of canopy cover (Li et al., 2024).
Comparing algorithm setting group alternatives can provide insight into different interpretations of ground and canopy top locations, which may influence the canopy cover fraction calculation.
Conditions like dense canopy cover, day vs night observations, and clouds can influence accuracy.
Phenology determines whether canopy foliage is present to be measured in forests with leaf-on and leaf-off seasons.
Beam sensitivity is directly related to canopy cover: GEDI must still be able to detect the ground through canopy gaps in order to estimate canopy cover.
Geolocation error from complex terrain can complicate the interpretation of energy from the ground and understory vegetation, making it challenging to separate overstory canopy cover.

This metric is an essential biophysical parameter used in ecological and climatic applications, such as quantifying rainfall interception, regulating solar radiation, modeling soil erosion, assessing wildlife habitat, and evaluating the effects of climate change and carbon storage. Canopy cover is also typically an input for deriving other vegetation structural parameters, including the Leaf Area Index (LAI) and clumping index, making it an important variable for ecosystem monitoring and modeling. Although the simplifying assumptions behind the retrieval may introduce biases in ecosystems with strong structural heterogeneity, the metric provides robust, standardized, and globally consistent estimates of canopy cover, supporting ecological studies, biodiversity assessments, and climate monitoring at unprecedented scales.

L2B Canopy Cover Profile

Total canopy cover, described previously, is a single aggregated value for the observed canopy, defined by the proportion of horizontal ground surface covered by the vertical projection of canopy materials within the laser footprint (Tang et al., 2019). The canopy cover profile metrics, on the other hand, measure the horizontally intercepted canopy materials at a given height within that same footprint. This more detailed three-dimensional view shows the canopy’s density across different heights, in vertical bins of about 5 meters. Understanding such details of light transmittance, foliage distribution, and thinning across the vertical profile can be useful for wildfire modeling and determining forest strata patterns, for example.

Table 2. Both metrics are found in the BEAMXXXX/ main group as shown in the L2B Data Dictionary. The cumulative cover profile is found as `cover_z`, where z is the height bin above the ground. `z = 1` refers to cover from 0-5 m, `z = 2` from 5-10 m, etc. `cover` is a single value, while `cover_z` is an array of values, one for each of the 30 height bins.

Variable	Name	Definition
`cover`	Total cover	Total canopy cover, defined as the percent of the ground covered by the vertical projection of canopy material
`cover_z`	Cover vertical profile	Cumulative canopy cover from height (z) to ground (z=0) with a vertical step size of dZ, where cover(z > z_max) = 0
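
The bin numbering described for Table 2 can be turned into height ranges with a small helper. This sketch assumes 1-based bin numbers, a 5 m step, and 30 bins, as stated in the table caption; the function name is hypothetical.

```python
def bin_height_range(z, dz=5.0, n_bins=30):
    """Return (lower, upper) height in meters for 1-based profile bin z."""
    if not 1 <= z <= n_bins:
        raise ValueError(f"z must be between 1 and {n_bins}")
    return ((z - 1) * dz, z * dz)


# Label every bin of a profile with its height range.
labels = [bin_height_range(z) for z in range(1, 31)]
```

Python sequences are 0-based, so if the profile has been loaded into a list or array, bin `z` would typically sit at index `z - 1`.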

L2B Plant Area Index (PAI)

PAI is defined as half of the total plant area per unit ground surface area and considers all above-ground plant components, including leaves, branches, and trunks. Notably, unlike LAI, PAI in GEDI’s algorithm design does not account for the effects of leaf clumping, and it is closely related to canopy cover. Because PAI directly quantifies physical characteristics of the stand, assessments of stand properties can be linked directly to carbon cycle studies.

As with other metrics, short vegetation can be difficult to discriminate in PAI because the ground signal may be unresolved, leading to over- or underestimation of PAI. As with other GEDI metrics, the conditions at observation, the selected algorithms, and terrain complexity can influence accuracy.
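
Since PAI is retrieved from gap probability, the inversion can be sketched with the standard Beer’s Law form. This is a hedged illustration, not the L2B code: the function name is hypothetical, and `g=0.5` and `omega=1.0` are assumed values for the leaf projection coefficient and clumping index.

```python
import math


def pai_from_pgap(pgap, g=0.5, omega=1.0, zenith_deg=0.0):
    """Invert Beer's Law: gap probability -> plant area index.

    g and omega are ASSUMED constants (random leaf orientation, no clumping).
    """
    if not 0.0 < pgap <= 1.0:
        raise ValueError("pgap must be in (0, 1]")
    cos_theta = math.cos(math.radians(zenith_deg))
    return -math.log(pgap) * cos_theta / (g * omega)


# Near-zero Pgap (unresolved ground) drives PAI toward very large values,
# which illustrates why a weak ground signal makes the estimate unstable.
stable = pai_from_pgap(0.30)
unstable = pai_from_pgap(0.001)
```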

L2B Plant Area Index Profile

The L2B product provides the vertical PAI profile at a 5 m vertical resolution, characterizing how plant area is distributed across different heights within the canopy.

Table 3. Both metrics are found in the BEAMXXXX/ main group as shown in the L2B Data Dictionary. The cumulative PAI profile is found as `pai_z`, where z is the height bin above the ground. `z = 1` refers to PAI from 0-5 m, `z = 2` from 5-10 m, etc. `pai` is a single value, while `pai_z` is an array of values, one for each of the 30 height bins.

Variable	Name	Definition
`pai`	Total Plant Area index m2/m2	Total plant area index
`pai_z`	Plant Area Index profile m2/m2	Vertical PAI profile from canopy height (z) to ground (z=0) with a vertical step size of dZ, where cover(z > z_max) = 0

Why Plant Area and not Leaf Area?

Users may be more familiar with the leaf area index (LAI) than with the plant area index. One definition of LAI is one half of the total leaf area per unit ground surface. PAI is closely related because it also captures leaves, but it treats all structural elements, including branches and trunks, as part of the one-sided area of all plant material per unit ground surface. More specifically, due to the limitations of the laser pulse design, GEDI cannot directly measure leaf clumping conditions or leaf angle distributions, and therefore assumes a random uniform angular distribution of canopy scattering and a constant leaf angle with height (Tang et al., 2019). There is also a severe lack of LAI reference data for validating conversions between PAI and LAI, which require specific references for leaf clumping and angle distributions.
However, the difference between LAI and PAI has been found to be small in dense broadleaf forests, where LAI is roughly 93% of PAI (Tang et al., 2019).

With regard to the plant area and volume density metrics and the estimates of structural complexity, biomass, and foliage height diversity derived from them, here are several foundational concepts to keep in mind:

There are different optical properties between leaf and woody targets:

Leaf contributions:

Highly seasonal (deciduous forests).
Lower reflectance and more absorption.
Contribute primarily to LAI and most PAI estimates.
More uniform spatial distribution.

Woody contributions:

Persistent year-round.
Higher reflectance and stronger backscatter.
Contribute to PAI, but not LAI.
Clumped spatial distribution (branches, trunks).

Validating accuracies may depend on seasonal considerations:

Gap probability (Pgap) models help determine the canopy’s three-dimensional structure and radiative transfer processes. Pgap is the theoretical basis for GEDI’s vegetation structure metrics. Validating the results of these models without separating the influences of leaf-on and leaf-off periods may generate misleading results:

Leaf-on season: The models are summarizing the effects of the entire plant area, leaf and wood.
Leaf-off season: The same models may show poor performance because they partially relied on leaf contributions that are no longer present.
Evergreen forests: Wood contributions systematically bias total plant area estimates.
Ancillary data could help distinguish color differences between phenological stages.

When conducting temporal analyses:

Leaf-off measurements essentially provide a wood-only baseline.
Differences between leaf-on and leaf-off can provide information on leaf contributions.
Temporal analyses help validate seasonal dynamics of the datasets and region.
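
The leaf-on versus leaf-off comparison above can be sketched as a difference of site-level averages. This is a deliberately simplified illustration: GEDI footprints do not repeat exactly, so it compares the mean PAI of all footprints over a site in each season instead of pairing individual shots. The function name, dictionary keys, and input values are hypothetical.

```python
from statistics import mean


def seasonal_leaf_contribution(leaf_on_pai, leaf_off_pai):
    """Compare mean leaf-on and leaf-off PAI over the same site.

    Treats the leaf-off mean as a wood-only baseline, which is an
    approximation (evergreen understory, for example, would violate it).
    """
    if not leaf_on_pai or not leaf_off_pai:
        raise ValueError("both seasons need at least one footprint")
    wood_baseline = mean(leaf_off_pai)
    total = mean(leaf_on_pai)
    return {
        "wood_baseline": wood_baseline,
        "leaf_contribution": total - wood_baseline,
    }


# Hypothetical footprint PAI values over a deciduous stand.
result = seasonal_leaf_contribution([4.0, 5.0], [1.0, 2.0])
```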

Many ecological applications specifically need leaf area index, not total plant area:

Including woody material in these applications can inflate the index values and reduce their biological meaning within:

Photosynthesis modeling
Evapotranspiration estimates
Carbon cycle studies
Phenological monitoring

L2B Plant Area Volume Density (PAVD) Profile

The Plant Area Volume Density (PAVD) dataset is a canopy structural index that quantifies the one-sided plant area per unit volume. It describes the vertical distribution of vegetation density within a canopy and is often used to characterize the vertical structure of forests. While PAI provides the area of plant material present across each height, PAVD helps to understand where the bulk of the plant material is. PAVD is directly derived from the vertical PAI profile of each footprint with the same leaf projection, clumping index, and other vegetation assumptions within GEDI’s algorithm. Quantifying the vertical heterogeneity of foliage is important for understanding habitat quality, fire management, carbon cycle studies, and biodiversity. Some studies have used PAVD to train models to estimate understory vegetation density under phenological differences between overstory and understory vegetation (Xi et al., 2022). Similar to other GEDI metrics, the conditions at observation, selected algorithms, and terrain complexity can influence accuracy.

Table 4. The metric is found in the BEAMXXXX/ main group as shown in the L2B Data Dictionary. The PAVD profile is found as `pavd_z`, where z is the height bin above the ground. `z = 1` refers to PAVD from 0-5 m, `z = 2` from 5-10 m, etc. It is formatted as an array of values, one for each of the 30 height bins.

Variable	Name	Definition
`pavd_z`	Plant Area Volume Density m2/m3	Vertical Plant Area Volume Density profile with a vertical step size of dZ
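
Because PAVD is derived from the vertical PAI profile, the relationship can be sketched as a finite difference. This is an illustration under stated assumptions, not the operational algorithm, which may use a different numerical derivative: the input is assumed to be cumulative PAI from the canopy top down to the bottom of each bin, ordered from the lowest bin upward, with a 5 m step. The function name is hypothetical.

```python
def pavd_from_cumulative_pai(pai_z, dz=5.0):
    """Approximate PAVD (m2/m3) per bin from a cumulative PAI profile.

    pai_z[i] is ASSUMED to hold the plant area at and above bin i, so the
    plant area inside bin i is pai_z[i] minus the value of the bin above.
    """
    pavd = []
    for i, pai_here in enumerate(pai_z):
        pai_above = pai_z[i + 1] if i + 1 < len(pai_z) else 0.0
        pavd.append((pai_here - pai_above) / dz)
    return pavd


# Hypothetical profile with most plant material in the 5-10 m bin.
density = pavd_from_cumulative_pai([3.0, 2.0, 0.5, 0.0])
densest_bin = density.index(max(density)) + 1  # 1-based bin number
```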

L2B Foliage Height Diversity (FHD)

Foliage Height Diversity (FHD) is a canopy structural index that quantifies the distribution and complexity of vegetation structure through its vertical profile (essentially Shannon’s diversity index). FHD is a biophysical metric that is particularly important for studies of global environmental change, terrestrial biodiversity, and habitat quality. Once the vertical PAI profile is generated for each footprint, FHD is directly calculated from it. High FHD values typically indicate a more complex forest structure with multiple canopy layers, which correlates with tree species diversity (for example, broadleaf species would be expected to have more complex vertical structures than coniferous species). This provides a valuable, large-scale structural parameter that can be used to assess conservation priorities.

Traditionally, FHD measurements are limited to a few plot samplings due to high labor costs. While field inventories and optical images are used for forest diversity estimation, they are often limited by area coverage, weather conditions, high costs, and acquisition time, making it challenging to develop detailed, large-scale maps. Other lidar sources with higher resolutions can offer finer detail for calculating this metric. Another limitation lies in GEDI’s capacity to capture short vegetation which may influence the FHD calculations. Similar to other GEDI metrics, the conditions at observation, selected algorithms, and terrain complexity can influence accuracy.

Table 5. The metric is found in the BEAMXXXX/ main group as shown in the L2B Data Dictionary. FHD is an index with a singular value.

Variable	Name	Definition
`fhd_normal`	Foliage Height Diversity	Foliage height diversity index calculated by vertical foliage profile normalized by total plant area index.
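
Since FHD is essentially Shannon’s diversity index applied to the vertical profile, the calculation can be sketched as follows. The sketch assumes the input is the plant area within each height bin (not the cumulative profile) and normalizes by the total, following the table definition; the function name is hypothetical and the operational product may differ in detail.

```python
import math


def foliage_height_diversity(layer_pai):
    """Shannon diversity of the per-bin plant area proportions."""
    if any(value < 0 for value in layer_pai):
        raise ValueError("layer values must be non-negative")
    total = sum(layer_pai)
    if total <= 0:
        return 0.0
    fhd = 0.0
    for value in layer_pai:
        if value > 0:  # empty bins contribute nothing
            p = value / total
            fhd -= p * math.log(p)
    return fhd


# A single-layer canopy versus plant area spread evenly over four bins.
single_layer = foliage_height_diversity([2.0, 0.0, 0.0, 0.0])
multi_layer = foliage_height_diversity([0.5, 0.5, 0.5, 0.5])
```

The even four-bin profile reaches the maximum possible value for four bins, ln(4), while the single-layer canopy scores zero.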

L4C Waveform Structural Complexity (WSCI)

The structural complexity product estimates 3D waveform structural complexity using four global models, one for each of four plant functional types (PFT). The models relate the GEDI RH percentiles of each footprint to over 800,000 airborne lidar system point cloud samples collected globally.

The quality of the L4C dataset is based on the L2A algorithm and quality flag, with a few stricter filters to ensure that the shots are over tree-covered surfaces. Filters for water and urban areas, high sensitivity thresholds, and tree cover classification from the ESA WorldCover v200 product were used to ensure a quality representation of each footprint.

Table 6. The metric is found in the BEAMXXXX/ main group as shown in the L4C Data Dictionary. WSCI is provided as a total index value and as a “z” term along the vertical profile. Similar to the other footprint data products, it includes the algorithm setting group alternative results with the suffix `_aN` for each of the variables listed in the table.

Variable	Definition
`wsci`	Predicted 3D canopy entropy from the corresponding Plant Functional Type (PFT) model
`wsci_pi_lower`	Lower prediction interval at 95% confidence
`wsci_pi_upper`	Upper prediction interval at 95% confidence
`wsci_quality_flag`	Flag simplifying selection of most useful WSCI predictions
`wsci_xy`	Predicted WSCI horizontal term over the XY plane within the footprint
`wsci_xy_pi_lower`	Lower prediction interval at 95% confidence
`wsci_xy_pi_upper`	Upper prediction interval at 95% confidence
`wsci_z`	Predicted WSCI vertical term along the Z axis within the footprint
`wsci_z_pi_lower`	Lower prediction interval at 95% confidence
`wsci_z_pi_upper`	Upper prediction interval at 95% confidence

Interpret the units and range of the WSCI estimates

Since it is an index, WSCI does not have physically meaningful units; its values simply range from low to high. What counts as a high or low value is determined by the local context or by comparison with vegetation types of similar structure. Globally, for example, values < 8 correspond to barely vegetated areas with little to no structural complexity, while values > 10.5 indicate high structural complexity in undisturbed or tropical forests. These values and their relative meaning are likely to differ for local contexts and comparisons.
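
As a rough illustration of the global ranges mentioned above, the sketch below bins WSCI values into three labels. The thresholds of 8 and 10.5 come from the text; the function name and the label strings are hypothetical, and the cutoffs would need to be re-derived for any local analysis.

```python
def describe_wsci(wsci, low=8.0, high=10.5):
    """Label a WSCI value using global example thresholds."""
    if wsci < low:
        return "low"
    if wsci > high:
        return "high"
    return "intermediate"


# Hypothetical WSCI predictions for three footprints.
labels = [describe_wsci(value) for value in (7.2, 9.4, 11.1)]
```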

Other GEDI Vegetation Structure Datasets

See the Gridded GEDI Vegetation Structure Metrics and Biomass Density at Multiple Resolutions and L4C WSCI products for more GEDI datasets measuring vegetation structure.

Knowledge Check #10
What are the three major canopy element assumptions made (input parameters) within GEDI’s globally applied algorithm for calculating vegetation structure metrics? What implications does each have on the performance of these metrics over different forest structures, say over a tropical forest versus a timber plantation for example?

Putting GEDI Vegetation Structure Metrics into Action

Fig 1: Diagram quantifying the vegetation structure metrics from GEDI L2B used across a selection of around 50 application-focused publications. The leftmost nodes show the application and/or desired outputs, while the middle nodes link the exact structural product used across all the associated applications. The rightmost nodes are the processing techniques applied or not applied in each study using L2B, summarized across all applications.

Most of the studies reviewed used GEDI’s vegetation structure metrics for mapping vegetation structure, aiding canopy height and above ground biomass studies, and classifying vegetation and fuel types. Other studies on functional diversity, biodiversity, structural complexity, and archaeological mapping also experimented with L2B metrics. A few examples were found of mapping canopy cover, estimating carbon, tracking forest disturbances and crops, and supporting elevation studies. The total plant area index, total canopy cover, and foliage height diversity were the most frequently used. Pgap and the profile metrics, while valuable, require more involved processing because their array format is harder to handle in common analyses. The L4C structural complexity index is notably a newer product.

The most common processing choices in these studies were using both power and coverage beams, applying the quality and degrade flags, and using the default algorithm setting group (ASG). Notably, observations from any leaf period and from any time of day were frequently included. Most studies did not apply spatial buffering, slope or other terrain filters, vegetation or land cover filters, or any beam sensitivity threshold. Examining the figure can offer more insight into which structural metrics dominated the use or non-use of each processing technique: follow each colored link and observe how its width contributes to the size of the node. The figure’s quantifications are not meant to be read as all-encompassing rules or recommendations for how to process the structural metrics for the outlined applications. Rather, they present a detailed overview of the choices made by similar applied researchers across the literature, to serve as a starting point for your consideration.
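
The most common filtering choices described above can be sketched as a simple shot filter. This is a starting point under stated assumptions, not a recommended recipe: each shot is represented as a dictionary, the keys `l2b_quality_flag`, `degrade_flag`, and `sensitivity` are assumed to mirror L2B variable names, and the conventions that a quality flag of 1 marks a usable shot and a degrade flag of 0 marks a non-degraded one should be confirmed against the L2B Data Dictionary.

```python
def filter_shots(shots, min_sensitivity=None):
    """Keep shots that pass quality and degrade checks.

    min_sensitivity is optional because most reviewed studies applied
    no beam sensitivity threshold.
    """
    kept = []
    for shot in shots:
        if shot["l2b_quality_flag"] != 1:  # ASSUMED: 1 means usable
            continue
        if shot["degrade_flag"] != 0:  # ASSUMED: 0 means not degraded
            continue
        if min_sensitivity is not None and shot["sensitivity"] < min_sensitivity:
            continue
        kept.append(shot)
    return kept


# Hypothetical shots for illustration only.
shots = [
    {"l2b_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.97, "pai": 2.1},
    {"l2b_quality_flag": 0, "degrade_flag": 0, "sensitivity": 0.98, "pai": 0.4},
    {"l2b_quality_flag": 1, "degrade_flag": 3, "sensitivity": 0.96, "pai": 3.0},
    {"l2b_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.91, "pai": 1.2},
]
default_choice = filter_shots(shots)
stricter_choice = filter_shots(shots, min_sensitivity=0.95)
```

Running both variants on the same data is a quick way to see how many footprints a stricter sensitivity threshold would cost for a given study area.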