From Download to Discovery: Accessing and Analyzing GEDI L2B Vegetation and Structure Metrics
In this section:
- Review what the guided script resources developed in this training series have to offer users compared to other solutions.
- Overview of the notebooks: Module 1 has one waveform-based Colab notebook; Module 2 has two vegetation structure notebooks and one biomass products notebook.
- Descriptions of the Sewanee Domain, Tennessee; Paint Rock, Alabama; Conecuh National Forest, Alabama; and Double Springs, Alabama study areas, which give real-world context to the hands-on exercises.
- Tutorial 1: Exploring Forest Structure with GEDI L2B in the Southeast
- Tutorial 2: Comparing L2B PAI with High-Resolution Lidar in the Sewanee Domain, Tennessee
Why use these scripts compared to other solutions?
As previous sections have discussed, there is a lot to consider when selecting a GEDI metric for a specific time, location, and need. In Module 1, a set of hands-on notebooks guided users through accessing and analyzing GEDI waveform data. In Module 2, two notebooks on vegetation structure and airborne lidar correlation and one biomass products notebook guide analysis of L2B, L4A, and L4B. Module 3 details the OBIWAN API and application, while Module 4 provides R and GEE scripts for working with L2A and GEDI Simulator comparisons. The goal of the two tutorials in this section is to provide a template for exploring several processing, quality filtering, and validation techniques over a set of localized case studies.
The presented workflow:
- Facilitates accessing and applying GEDI data
- Demonstrates how to analyze potential challenges specific to the user’s need
- Discusses how to interpret the behavior of the data itself.
The goal of this method is to build the user’s confidence in the quality of the data they generate, and in their ability to evaluate its performance when applied to the user’s specific needs.
With these tutorials, the user can easily swap and replace data inputs from NASA EarthData and easily convert them from HDF5 to a more usable format. Time period, selected variables, and study area extents can be quickly customized. The tutorial collects all available data and can do so for multiple AOIs at once if desired, with some limitations on the maximum size of an AOI over specific time periods.
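As a rough illustration of how the time period, product, and study area extent can be treated as swappable inputs, the sketch below builds a granule search URL for each AOI. It is not the notebook's own request code: it assumes NASA's public CMR granule search endpoint as a stand-in for the Harmony request, and the helper names (`bbox_around`, `build_granule_query`), the 0.05-degree half-width, and the date range are all illustrative choices. No network call is made.

```python
from urllib.parse import urlencode

# Public CMR granule search endpoint (used here instead of the Harmony client).
CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"


def bbox_around(lat, lon, half_width_deg):
    """Return (west, south, east, north) for a square box centred on a point."""
    return (lon - half_width_deg, lat - half_width_deg,
            lon + half_width_deg, lat + half_width_deg)


def build_granule_query(bbox, start, end, short_name="GEDI02_B", version="002"):
    """Build a CMR granule search URL for one AOI and one time window."""
    west, south, east, north = bbox
    params = {
        "short_name": short_name,          # GEDI L2B product short name
        "version": version,
        "bounding_box": f"{west:.4f},{south:.4f},{east:.4f},{north:.4f}",
        "temporal": f"{start}T00:00:00Z,{end}T23:59:59Z",
        "page_size": 200,
    }
    return f"{CMR_GRANULE_SEARCH}?{urlencode(params)}"


# Hypothetical AOI: a small box around the Paint Rock plot coordinates.
# Add more entries to query several AOIs in one pass.
aois = {"paint_rock": bbox_around(34.772, -86.306, 0.05)}

queries = {
    name: build_granule_query(box, "2019-04-01", "2024-12-31")
    for name, box in aois.items()
}
for name, url in queries.items():
    print(name, url)
```

Keeping the AOIs in a dictionary means a new study area or time window is a one-line change, which mirrors the customization described above.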
This tutorial does not go into depth on many techniques found in literature that are highly specific to certain contexts and methods, such as:
- Algorithm Setting Group alternative results for each metric
- Integration with ancillary data
- Slope filtering
- Land cover or vegetation type masking
- Heterogeneous footprint edge effects
- Footprint buffering during sampling
- Cloud outlier removals
- Continuity filters
- Optical imagery-derived thresholds
- Statistical outlier removals or post-modeling variables of importance
- Extraneous results cleaning
The tutorial is meant to assist users in evaluating GEDI’s potential for their own use cases and areas of interest. Users are encouraged to build upon the provided script and undertake techniques and comparisons, such as those listed above, to determine the most appropriate dataset for their analyses. Future technical developments could improve data access and transformation pipelines, and optimize larger-scale cumulative explorations with gridded approaches.
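For readers who want to extend the script with one of the techniques listed above, the sketch below shows a simple statistical outlier removal using the interquartile range (IQR). This is one common convention, not a method prescribed by the tutorial; the function name `iqr_filter`, the multiplier `k=1.5`, and the sample heights are assumptions made for illustration.

```python
from statistics import quantiles


def iqr_filter(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)   # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Hypothetical canopy heights in metres, with one implausible return
# of the kind a cloud or other noise source might produce.
heights_m = [18.2, 21.5, 19.8, 22.1, 20.4, 23.0, 17.9, 95.0]

cleaned = iqr_filter(heights_m)
print(len(heights_m), "->", len(cleaned), "footprints kept")
```

A fence like this is best applied per stand or per land cover class, since a threshold computed over a mixed landscape can discard legitimately tall or short canopies.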
Why Google Colab? The Advantages and Limitations of this Tutorial
There are other existing solutions that provide a broad range of options for access and analysis, each with their own advantages and disadvantages:
| Category | Advantage or limitation | Our solution by comparison | NASA EarthData Access GUI | NASA DAAC Scripts | rGEDI | SlideRule Client |
|---|---|---|---|---|---|---|
| Data Access and Visualization | Advantage | Direct access and user defined spatial and temporal extents with incorporated calculated summaries post-download. | Direct access and great visualization of available data files and spatial extents over user defined spatial and temporal extents pre-download. | Direct access available and incorporates calculated summaries instead. | Direct access available and incorporates calculated summaries and modular functions. | The GUI itself directly accesses and visualizes the available data files over the user-defined spatial and temporal extents pre-download. |
| Data Access and Visualization | Limitations | No GUI available to visualize search results or available file extents. No data summaries pre-download. | Direct access but does not visualize the actual datasets within the files. | No GUI available to visualize search results or available file extents. The currently developed code is limited to beam specific visualizations, and requires users to further develop given functions for cumulative summaries and visualizations. | No GUI available to visualize search results or available file extents. The currently developed code includes a mixture of beam specific and cumulative visualizations, and requires users to further develop given functions for cumulative summaries and visualizations. | Detailed visualization options including interactive 3-dimensional spaces for preliminary exploration. |
| Access Preferences | Advantage | EarthData specific access pipelines, including the Cloud. The python and Google colab platform facilitates connection to multiple access methods: EarthData, Cloud services (AWS), R, Google Earth Engine, etc. | User friendly download or Cloud optimized and bulk download. | EarthData specific access pipelines, including the Cloud. The python and Google colab platform facilitates connection to multiple access methods: EarthData, Cloud services (AWS), R, Google Earth Engine, etc. | EarthData specific access pipelines, including the Cloud. The python and Google colab platform facilitates connection to multiple access methods: EarthData, Cloud services (AWS), R, Google Earth Engine, etc. | User friendly download into multiple formats. |
| Access Preferences | Limitations | Also only connected to the EarthData access pipeline using EarthData and Harmony API. | Currently full file download only, no further customization of file subsetting. If users don’t use the Cloud, download is more manual. | Currently connected to EarthData access pipelines only. | Currently connected to EarthData access pipelines only. | Some functionalities are under development. |
| Data product customization | Advantage | Hyper customization of spatial, temporal, and specific attribute selection for all available data over multiple AOIs at once. | Customize spatial and temporal subsetting. | Hyper customization of spatial, temporal, and specific attribute selection over a single AOI per beam. | Hyper customization of spatial, temporal, and specific attribute selection using modular functions. | Hyper customization of spatial, temporal, and specific attribute selection with user friendly selection processes and in-house processing. |
| Data product customization | Limitations | Size and computational limits may need to be mitigated. Not all relevant Harmony API functionalities are fully developed. Further customization requires users to further develop the given functions for cumulative summaries and visualizations. | No further customization of file subsetting. Additional functionalities are under development. | Beam specific customization in many functions, and requires users to further develop given functions for cumulative summaries and visualizations. | Further customization requires users to further develop given functions for cumulative summaries and visualizations. | Extreme size and computational limits are not easily mitigated given the GUI limitations and the advanced coding experience needed to adapt the backend, which is written in C. |
| Usability and Required Skills | Advantage | Data summaries, visualizations, and initial processing are developed and implemented for all available data over user defined spatial and temporal extents with data customizations. | User friendly communication, search, and visualization. | Exact summaries of data available over the subset are calculated. | Exact summaries of data available over the subset are calculated. | Very user friendly visualization, summary tables, and interactive figures. |
| Usability and Required Skills | Limitations | Intermediate coding experience required to facilitate exploration of products beyond the provided click-through examples. | Exact summaries of data available over the subset are not calculable within the platform. | Intermediate to advanced coding experience is required to adapt existing functionalities to multiple beam exploration and application of more complex processing techniques. | Intermediate to advanced coding experience is required to adapt existing functionalities to multiple extents and application of more complex processing techniques. | Previous knowledge of data available, acronyms, and presented processes required. |
| Flexible Workflow Plug-ins | Advantage | 1) Converts HDF5.h5 files to other formats and manipulates the data in user-friendly formats. 2) Facilitates the entire workflow process from access, to summary, to visualization, to statistical analysis and other lidar validation. Additionally, the python/colab platform facilitates other workflow plug-ins. | 1) Works with and downloads the original HDF5 files. | 1) Works with and downloads the original HDF5 files and converts to other data formats. | 1) Works with and downloads the original HDF5 files and converts to other data formats. | 1) Users do not need to worry about data format in the GUI. Download allows conversion to other data formats. 2) Serves for data availability, visualization, and customizing datasets prior to download within the in-house GUI. |
| Flexible Workflow Plug-ins | Limitations | 1) Requires temporary space for data conversions which may experience limitations with increased total dataset size. Requires coding experience to customize functions. 2) Further development needed to optimize workflow plug-ins. | 1) Only works with and downloads the original HDF5 files. 2) Serves for data availability and access data steps only. | 1) Most of the functionalities are implemented for the HDF5 files. Requires temporary space for data conversions which may experience limitations with increased total dataset size. 2) Further development needed to optimize workflow plug-ins. | 1) Many of the functionalities are implemented for the HDF5 files. Requires coding experience to customize functions. 2) Further development needed to optimize workflow plug-ins. | 1) Size limits require coding based or manual workarounds for application plug-ins. 2) Some functionalities are under development. |
Application Case Studies
The two L2B exercises and one L4A/L4B exercise in the next part of this training explore highly managed timberlands and conservation and restoration areas in the Deep South of the U.S. These sites demonstrate a range of forest characteristics and disturbance types that dictate the behavior of the height, structure, and biomass metrics derived from the returned waveform data. The final filtering strategy takes these forest characteristics into account when generating the output dataset, and the workflow is adaptable to any user-defined study period.
Conservation Management: Paint Rock Forest Research Center, Jackson County, Northeastern Alabama
The Paint Rock Forest Research Center leads this collaborative forest research effort, partnering with Alabama A&M University (Dr. Dawn Lemke), The Nature Conservancy, the Smithsonian Institution’s ForestGEO network, and multiple universities such as University of Alabama, Jacksonville State University, University of Georgia, and Yale University. The project specifically targets collection of training data within historically marginalized communities and the broader southeastern U.S. scientific community studying forest dynamics and climate resilience. Stakeholders include regional forestry managers, conservation organizations, and communities dependent on forest resources in Alabama and across the Southeast.
Paint Rock represents one of North America’s most diverse forest ecosystems and hosts a 60-hectare (150-acre) ForestGEO research plot with over 85,000 individually mapped, measured, and monitored trees across 73+ species. Research encompasses forest growth patterns, carbon storage, species-specific responses to environmental gradients, and genetic diversity. Research partners are interested in expanding this research to topics such as restoration of shortleaf pine ecosystems, vernal pool mapping, and ecosystem response to disturbances like tornadoes. The site serves as a model for combining conservation goals with scientific research and provides baseline data for climate change impact studies.
Source: Paint Rock Forest Research Center
Located within The Nature Conservancy’s Sharp Bingham Mountain Preserve (coordinates: 34.772°N, 86.306°W), Paint Rock sits in the Southern Cumberland region, one of Eastern North America’s most biologically diverse areas. The landscape features a deeply dissected karst topography with widely varying elevation, creating diverse microhabitats from dry ridges to moist valleys. The site follows a 5-year re-census protocol as part of a planned 50-year monitoring program, with the first 50 acres beginning re-census in 2024.
Current challenges include understanding how diverse temperate forests will respond to climate change, invasive species (emerald ash borer, beech leaf nematode), and extreme weather events. Critical knowledge gaps exist regarding which forest structures and species compositions best promote climate resilience, how genetic diversity and distribution within species affects survival, and how forest management practices like controlled burning and selective thinning can enhance ecosystem resilience. The Paint Rock ecosystem serves as a climate refuge, and understanding its dynamics is crucial for developing forest management strategies across the Southeast. GEDI’s capabilities to provide consistent, landscape-scale forest structure data would complement the intensive ground-based measurements, enabling better understanding of how topographic complexity and forest density patterns contribute to the site’s exceptional biodiversity and resilience over time.
What GEDI Could Offer Paint Rock:
GEDI data can address these challenges by providing consistent forest canopy height, vertical structure, and biomass estimates across the complex topographies, offering an alternative to labor-intensive manual sampling. GEDI’s temporal consistency is crucial for addressing Paint Rock’s core research questions about forest resilience—what structural characteristics have enabled this ecosystem’s historic resilience and what will sustain it through future climate shocks. GEDI data can help identify optimal forest density patterns for different species, monitor post-disturbance recovery (like after the 2025 tornado), and guide restoration efforts for shortleaf pine and other species by mapping canopy gaps and optimal planting locations. The technology can reveal relationships between varying karst topography and plant diversity patterns, potentially showing how cave systems influence forest structure and resilience. For conservation management, GEDI can assist in mapping ephemeral wetlands and vernal pools—legally protected habitats requiring buffer zones—more efficiently than current manual methods, while also informing prescribed burn planning through fuel load and canopy continuity mapping. The combination of GEDI with intensive ground-truth data can scale up findings from the 60-hectare plot to the broader 450,000-acre Paint Rock ecosystem, serving as an excellent validation site for GEDI algorithms in complex topographies while providing training opportunities for students learning geospatial analysis.
Southern Pine Beetle Infestation: Double Springs, West Central Alabama
The most destructive forest pest in Alabama and across the Southeastern United States is the southern pine beetle (Dendroctonus frontalis). Although native and ever-present in the region, the expansion of forestry plantations composed largely of highly susceptible loblolly pine can lead to intensified outbreaks, especially in dense and unhealthy old stands. Due to their large negative impact on forestry, the U.S. Forest Service (USFS) and the Alabama Forestry Commission (AFC) collaborate to monitor, prevent, and respond to southern pine beetle infestations. Infestations often begin with beetles attacking weakened trees and then spreading rapidly to neighboring trees, potentially spreading across wide swaths of forest in a matter of days or weeks. The first signs of infestation are often pine needles turning red and then brown, indicating the decline and eventual death of the tree. The presence of southern pine beetles can be confirmed by the presence of pitch tubes along the trunk and s-shaped galleries burrowed under the bark.
Visualize Southern Pine Beetle Infestation Over Time
Source: PlanetScope imagery GIF (June - Sept 2024) showing infestation around August 2024 in Double Springs, AL, USA. By Jacob Abramowitz.
The study area consists of pine forest in Double Springs, Alabama near the southern extent of Bankhead National Forest. Expanding brown spots can be seen across the AOI beginning around August 2024. This timing aligns closely with point data from the AFC Insect and Disease Map, and previous activity in the area has been observed by USFS Insect & Disease Detection Surveys. These surveys rely on human observation and aerial surveys. Spaceborne remote sensing offers an opportunity to both scale up this observation as well as use different sensors, such as GEDI.
What GEDI Could Offer Pine Beetle Infested Areas:
While optical data is often used to detect southern pine beetle infestations, GEDI may offer information on pre- and post-infestation forest conditions. For example, canopy height and plant area index can offer important insights into age and health of a pine stand. The vertical profile of PAI and PAVD may give insight on overstocked areas more susceptible to infestation. Derived metrics like biomass (along with canopy height) can also provide useful information to land managers on high-value timber stands to protect from nearby infestations. Post-infestation, we may see changes in vertical profile metrics of PAI and PAVD, as well as FHD, due to defoliation, although the changes might not be large enough to produce a clear signal.
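To make the vertical-profile idea concrete, the sketch below computes a foliage height diversity value as a Shannon index over a PAVD profile, FHD = -sum(p_i * ln(p_i)), where p_i is the share of total plant area in height bin i. This is a simplified illustration of the concept rather than the exact L2B algorithm, and the two profiles are invented to contrast a single dense canopy layer with a more evenly layered stand.

```python
import math


def foliage_height_diversity(pavd_profile):
    """Shannon index over the vertical distribution of plant area.

    pavd_profile: plant area volume density per height bin (any consistent unit).
    Returns 0.0 when all plant area sits in one bin or the profile is empty.
    """
    total = sum(pavd_profile)
    if total <= 0:
        return 0.0
    fhd = 0.0
    for density in pavd_profile:
        if density > 0:
            p = density / total      # share of plant area in this bin
            fhd -= p * math.log(p)
    return fhd


# Hypothetical profiles, lowest height bin first.
single_layer = [0.0, 0.0, 0.05, 0.60, 0.05]    # plant area concentrated in one canopy layer
multi_layer = [0.15, 0.12, 0.14, 0.16, 0.13]   # understory, midstory, and canopy present

print(round(foliage_height_diversity(single_layer), 3))
print(round(foliage_height_diversity(multi_layer), 3))
```

A drop in this value between pre- and post-infestation footprints would be consistent with canopy loss, although, as noted above, the change may be too small to separate from footprint-to-footprint variation.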
Prescribed Fires: Conecuh National Forest, Southern Alabama
Prescribed fire has long been used in the Southeast, first by Indigenous communities and later by foresters in the 20th century. Fire was recognized as essential for sustaining longleaf pine ecosystems, and today agencies such as U.S. Forest Service (USFS), The Nature Conservancy (TNC), the Alabama Forestry Commission, and the Alabama Forestry Foundation continue to employ or support prescribed fire. Conservation initiatives—including the USDA Longleaf Pine Initiative and The Longleaf Alliance—work with landowners to restore longleaf ecosystems.
Wildland fire encompasses both wildfires and prescribed burns. Prescribed burns are carefully planned, permitted, and supervised by certified managers to achieve multiple goals: reducing hazardous fuels, regenerating forests, maintaining ecosystems, supporting wildlife, and preventing severe wildfires. The prescribed burns require written prescriptions, weather assessments, fire breaks, and approved ignition methods such as drip torches or chemical ignition. Entities like TNC and USFS develop detailed plans and adjust burn frequency based on ecosystem needs—longleaf pine systems typically require fire every two to three years.
Visualize the Prescribed Burn Over Time
Source: PlanetScope imagery GIF (February - March 2023) showing the prescribed burn area. The treatment was a broadcast burn carried out with a drip torch and intended as a fuels reduction treatment. Approximately 137 acres were estimated to have been treated.
In Conecuh National Forest, seven prescribed burns have been conducted since 2012 under the Conecuh Prescribed Burning Program and the Longleaf Ecosystem Restoration II Project. The most recent burn in the AOI occurred on February 19, 2023. However, while Conecuh benefits from frequent burns, other forests in Alabama face staffing and resource shortages, limiting burn frequency. Many private landowners also lack the training or certification to conduct burns, relying instead on agencies or contractors. This creates backlogs in site visits, implementation, and post-burn monitoring.
Remote sensing tools, such as GEDI, can help address these challenges.
What GEDI Could Offer Fire-Managed Sites:
GEDI provides information on fuels, forest structure, and canopy conditions. Metrics related to elevation, biomass, plant area index, and understory structure can improve planning, indicating fuel loads, fire behavior, and treatment needs. Post-burn, GEDI can help track vegetation recovery, assess effectiveness, and compare prescribed fire impacts with those of wildfires. By expanding monitoring capacity beyond field visits, these data can support more effective and frequent fire management in longleaf ecosystems. Overall, GEDI could help to assess the need or frequency of burning, and to monitor post-burn impacts.
Restoration and Research: The Sewanee Domain, South Central Tennessee
The Sewanee Domain is a 13,000-acre forest area that contains and is operated by the University of the South, which collaborates with The Nature Conservancy (TNC) on FSC certification, as well as with the Forest Stewards Guild and Preferred by Nature. Key stakeholders include university students and faculty conducting research, regional forest managers seeking models for sustainable practices, and the broader southeastern U.S. forestry community. The Sewanee Domain impacts local communities through watershed protection (supplying water to the university and town) and recreational opportunities, and serves as a demonstration forest for private landowners interested in sustainable timber management practices.
Commercial and agency partnerships include coordination with Tennessee Wildlife Resources Agency and the Tennessee Department of Agriculture’s Division of Forestry for prescribed fire management and wildlife population control. The Sewanee Domain is one of the earliest forests to be involved in conservation planning in the U.S. and is a leader in forest management practices, particularly for similar Cumberland Plateau ecosystems extending through Kentucky, Tennessee, and Alabama.
The Sewanee Domain serves as a living laboratory for ecosystem management research, combining active forest management with comprehensive scientific monitoring. Research encompasses forest ecology, hydrology, geology, biodiversity assessment, climate impacts, and sustainable forestry practices. The Sewanee Domain is studied for its restoration efforts targeting shortleaf pine and mixed oak woodland ecosystems, prescribed fire effects on vegetation dynamics, deer population management impacts on forest regeneration, and long-term forest productivity under changing climate conditions.
The Split Creek Environmental Observatory provides instrumented watershed monitoring with dendrometer data tracking tree growth rates and daily/nightly trunk expansion/contraction patterns, offering insights into plant water use and climate impacts. Research is coordinated through the Office of Environmental Stewardship and Sustainability, which requires mapping and environmental impact assessments for each project. The Domain plays a critical role as a reference site for understanding Cumberland Plateau forest dynamics and testing management techniques that can be applied across similar forest ecosystems.
Source: Focal watersheds in Sewanee (more info on the Headwaters Initiative).
Key geographic features include the Split Creek Watershed, two major lakes (O’Donnell and Jackson), and the distinctive “rock house” cliff formations (culturally significant for Indigenous communities). The Sewanee Domain represents a critical biological transition zone where multiple forest types converge, including oak-hickory forests, mixed mesophytic forests, and remnant shortleaf pine-oak woodlands that historically dominated the Cumberland Plateau but have been reduced from their original extent across the southeastern United States.
Changing patterns of occurrence within this AOI reflect both historical land use impacts and contemporary restoration efforts. The domain has transitioned from early exploitation (1857-1897) through systematic management (1899-present), with recent intensification of restoration activities targeting diminished shortleaf pine communities. Current spatial patterns show a mosaic of management practices: strictly protected areas (HCV forests totaling ~1,370 acres), active restoration zones (7 compartments), prescribed fire rotation areas (16 compartments), and recreational corridors (65 miles of maintained trails), creating an ideal natural experiment for assessing management effects on forest structure across multiple spatial scales.
The primary challenge facing the Sewanee Domain is understanding how active forest restoration techniques affect three-dimensional forest structure and biodiversity in southeastern deciduous forests under climate change. Specific unknowns include: the effects of prescribed burns on canopy structure; success rates of shortleaf pine restoration efforts in terms of vertical forest structure development; and the effects of management interventions on species that require specific woodland structures, such as the northern bobwhite, prairie warbler, and Bachman’s sparrow.
Current monitoring relies heavily on ground-based measurements and traditional forest inventory techniques, creating gaps in understanding landscape-scale structural heterogeneity and vertical complexity changes over time. Stakeholder concerns (identified in the 2021 verification report) about restoration intensity and potential water resource impacts highlight the need for objective, spatially explicit monitoring tools. The Nature Conservancy partnership and Forest Stewards Guild collaboration demonstrate institutional commitment to addressing these challenges, but require remote sensing to validate effectiveness.
What GEDI Could Offer the Sewanee Domain:
The optimal observation period for GEDI analysis would be 2019-2025, coinciding with the current 10-year management plan. This timeframe captures ongoing prescribed fire treatments, shortleaf pine restoration plantings, and oak woodland enhancement projects.
This period is valuable because it represents a transition from historical management approaches to intensified ecosystem restoration efforts. This includes prescribed fires, which have been run by the student-led fire team since 2016. The timing also aligns with comprehensive baseline data availability from instruments at Split Creek Observatory (weather tower, dendrometers active since 2017) and follows the 2018 FSC certification process that established current management protocols. Multi-year GEDI observations during this period can capture both immediate post-fire vegetation responses and longer-term structural changes from restoration plantings and management interventions.
GEDI can address knowledge gaps by providing precise measurements of canopy height, vertical forest structure, and biomass distribution across Sewanee’s diverse management zones. Key applications include: (1) quantifying structural changes in prescribed fire treatment areas versus control sites, enabling assessment of fire effects on canopy complexity and understory development; (2) monitoring shortleaf pine plantation establishment and growth rates through repeat height and cover measurements; (3) mapping forest structural diversity across the Domain to identify optimal management strategies; and (4) validating ground-based dendrometer and forest inventory data with landscape-scale structural measurements.
GEDI can specifically support restoration monitoring by tracking canopy recovery patterns post-fire, quantifying edge effects around management units, and measuring habitat structure for target wildlife species requiring specific canopy configurations. GEDI’s ability to penetrate forest canopies will enable assessment of understory development critical for oak regeneration and provide metrics that address the restoration effectiveness concerns raised by stakeholders. Integration with the Split Creek Observatory’s ground-based measurements will create a comprehensive multi-scale monitoring system for validating and scaling up restoration outcomes across similar southeastern forest ecosystems.
Hands-On Resource Objectives
What this multi-regional data exploration demonstration offers you:
Easing access to GEDI data:
- Automated spatial, temporal, and variable selection for desired GEDI product levels as opposed to bulk or manual downloads with limited specifications.
- Local or temporary, on-the-fly Google Colab processing, storage, and data exploration before final formatting and download.
- A single platform (Google Colab) for handling multiple technical and scientific processes, such as access to data center platforms and APIs, and in-script mapping, statistics, and visualizations.
Demonstration of data exploration and processing techniques
- Walk through a suggested process for taking key data characteristics into account for your application.
- Explore data availability potentially impacted by spatial, temporal, data attribute, and location or use case contextual characteristics.
- Investigate several currently recommended processing techniques with statistics and visualizations.
Exemplifying real-world challenges over selected locations:
- Walk through a suggested process for scrutinizing your application and the viability of GEDI for your purposes.
- Investigate GEDI availability and characteristic variations across study areas.
- Compare study area characteristics and the relationships to GEDI sensor capabilities.
- Consider potential advantages or challenges to using GEDI over the study area under given contexts.
Detailed coding documentation
- Learn how the code, libraries, and APIs function.
- Understand where specifications and adjustments can be made for your application.
Tutorial 1: Exploring Forest Structure with GEDI L2B in the Southeast
Tutorial 1 Overview
- Set up your environment and directories
- Access, filter and download the raw GEDI HDF5 files:
- Authenticate your EarthData Access login
- Create the Harmony API data product, spatial, and temporal request parameters and download the files
- Subset the desired GEDI variables and convert the data to GeoDataFrame, Shapefile, and CSV formats
- Explore data quality filtering techniques
- Generate the final dataset
- Plot and explore the data
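The quality filtering and export steps in the overview above can be sketched as follows. The field names (`l2b_quality_flag`, `degrade_flag`, `sensitivity`, `pai`, `lat_lowestmode`, `lon_lowestmode`) follow the L2B product, but the records themselves, the short shot identifiers, and the 0.9 sensitivity threshold are assumptions for illustration; the notebook's own filters and thresholds may differ. The example uses only the standard library so it runs anywhere, whereas the notebook works with GeoDataFrames.

```python
import csv
import io

# Hypothetical footprint records, as they might look after subsetting an HDF5 file.
shots = [
    {"shot_number": "A1", "lat_lowestmode": 34.771, "lon_lowestmode": -86.305,
     "l2b_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.96, "pai": 3.1},
    {"shot_number": "A2", "lat_lowestmode": 34.772, "lon_lowestmode": -86.306,
     "l2b_quality_flag": 0, "degrade_flag": 0, "sensitivity": 0.97, "pai": 0.2},
    {"shot_number": "A3", "lat_lowestmode": 34.773, "lon_lowestmode": -86.307,
     "l2b_quality_flag": 1, "degrade_flag": 0, "sensitivity": 0.85, "pai": 2.4},
    {"shot_number": "A4", "lat_lowestmode": 34.774, "lon_lowestmode": -86.308,
     "l2b_quality_flag": 1, "degrade_flag": 30, "sensitivity": 0.98, "pai": 2.9},
]


def passes_quality(shot, min_sensitivity=0.9):
    """Keep shots flagged as good quality, not degraded, and sensitive enough."""
    return (
        shot["l2b_quality_flag"] == 1
        and shot["degrade_flag"] == 0
        and shot["sensitivity"] >= min_sensitivity
    )


def to_csv(rows, fieldnames):
    """Serialize footprint records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


kept = [s for s in shots if passes_quality(s)]
csv_text = to_csv(kept, fieldnames=list(shots[0].keys()))
print(f"{len(kept)} of {len(shots)} shots kept")
print(csv_text)
```

Reporting how many shots each condition removes, rather than only the final count, helps show which filter drives data loss over a given study area.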
Tutorial 1: README
Walk through the tutorial, which includes detailed setup instructions and tips for customizing and adapting the code for different time periods, new study areas, quality filtering, and export options. The Colab notebook may repeat information from the introduction README, but contains less detailed instructions, contextual documentation, and workflow options.
Now run through the entire process yourself:
Vegetation Structure GEDI Tutorial
- Download the notebook file and upload it to your Google Drive to open it from there, or
- Insert “tocolab” after “github” in the notebook’s URL to open it directly in Colab.
Tutorial 2: Comparing L2B PAI with High-Resolution Lidar in the Sewanee Domain, Tennessee
Tutorial 2 Overview:
Tutorial 2: README
This notebook conducts a comparison and validation analysis between Plant Area Index (PAI) data from NASA’s Global Ecosystem Dynamics Investigation (GEDI) mission and PAI data derived from high-resolution aerial lidar. The primary goal is to assess how well GEDI’s PAI measurements align with a ground-truth or higher-resolution reference dataset over the Sewanee Domain. The overall workflow involves these key steps:
- Environment Setup: The notebook prepares the working environment by mounting Google Drive for persistent storage and installing all required Python libraries for geospatial analysis, such as geopandas, rasterio, and rasterstats.
- Data Acquisition: It automatically downloads the necessary datasets from a Zenodo repository, which includes a filtered GEDI L2B data file in CSV format and a high-resolution PAI GeoTIFF file derived from lidar.
- Data Processing: The GEDI data, which consists of individual footprint locations (points), is loaded and processed. Circular plots with a 25-meter diameter are created around each GEDI footprint location to represent the area of each observation.
- Data Integration: Using a technique called zonal statistics, the notebook calculates the average high-resolution PAI value from the GeoTIFF file within each of the 25-meter GEDI plots. This allows for a direct comparison between the two different data sources.
- Analysis and Visualization: Finally, the notebook statistically compares the GEDI PAI values with the averaged lidar PAI values. It generates a scatter plot, performs a linear regression, and calculates key metrics like the R-squared value, correlation, and the equation of the trend line to quantify the relationship between the two datasets.