6  Understanding active fire detection uncertainty with Bayesian Neural Networks

Authors
Affiliations

Julia E. Harvie

Great Lakes Forestry Centre

Natural Resources Canada

Karlee E. Zammit

Northern Forestry Centre

Natural Resources Canada

Brittany T. Engle

Earth Observation for Ecosystem Management

Technical University of Munich

Jacqueline A. Oliver

Great Lakes Forestry Centre

Natural Resources Canada

Lynn M. Johnston

Great Lakes Forestry Centre

Natural Resources Canada

Alan S. Cantin

Great Lakes Forestry Centre

Natural Resources Canada

Morgan A. Crowley

Great Lakes Forestry Centre

Natural Resources Canada

Run in Colab View on GitHub

This tutorial accompanies the work titled “Understanding active fire detection uncertainty with Bayesian Neural Networks”. This notebook is intended to accompany the written document and is not meant to be a standalone notebook. Please reference the written text for further information and explanations of the steps and plots produced here.

This work was based on the Keras Bayesian Neural Networks tutorial. Note that this tutorial uses specific versions of packages that are compatible with Keras 2, as Keras 3 (at the time of development) did not support the TensorFlow Probability layers. The import statements note the correct versions. We also provide a yml file which forces Keras 2 if you would like to run locally. Additionally, the Keras tutorial is set up for regression, so this notebook alters the tutorial to be compatible with classification tasks.
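For local runs outside the provided yml file, one additional way to pin TensorFlow to Keras 2 is the `TF_USE_LEGACY_KERAS` environment variable recognized by recent TensorFlow releases when the `tf_keras` package is installed. This is a minimal sketch, not part of the notebook itself; the variable must be set before TensorFlow is first imported:

```python
import os

# Ask TensorFlow to use Keras 2 (the tf_keras package) instead of Keras 3.
# This line must run before the first `import tensorflow` in the session.
os.environ["TF_USE_LEGACY_KERAS"] = "1"

# import tensorflow as tf  # subsequent imports now resolve tf.keras to Keras 2
```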

6.1 Important Notes (Read Before Running)

6.1.1 Python Versioning

This code is only compatible with Python 3.10, which is no longer the default version loaded by Colab. We have included instructions on how to force the notebook to use Python 3.10, to ensure there are no errors related to version incompatibilities.

6.1.2 Code Order

The code in this tutorial is intended to be run start to finish, in order. A user must also read the initial sections to properly load the data before attempting to “Run All”. Functions are defined in the section where they are first called, and may be called again in later sections without being re-defined there. Therefore, if you wish to run only a portion of this code, make sure to run all the function-defining code blocks of previous sections to ensure all required functions have been defined.

6.1.3 Colab vs Jupyter

This tutorial was constructed so that every step can be run within the Google Colab environment to take advantage of the computing resources it provides. We highly recommend that users who are new to Python run in Colab, as it removes the need to have Python and the associated packages configured on their local machines; this is therefore the default option in the code. However, we recognize that more advanced users may prefer a local workflow, so we have included instructions for how to adapt this script to run locally if desired.

6.1.4 How to Run This Notebook on Colab - Default setting


  1. Upload the recommended files using the code block below
  2. Set colab_jupyter_flag to “c”
  3. Hit the “Run all” button located above the script
  4. A prompt will appear asking the user to select a Python version. Select version 3.10, and see the notebook section labeled ‘Colab Specific Installations’ for a more detailed explanation

Files to upload:

1. train-final-raw.csv: this csv file contains raw VIIRS active fire data with imbalanced classes. It is used in Case Study A, Model One to serve as a baseline dataset for this study.
2. train-final-balanced.csv: this is the class-balanced version of train-final-raw.csv. It is used in Case Study A, Model Two to show the impact of balancing classes on model performance, and in Case Study A, Model Three as the input for hyperparameter tuning.
3. train-final-balanced-case-study-2.csv: this dataset integrates data from both VIIRS and MODIS. It is used in Case Study B, Models One and Two to help test whether fire detection varies based on satellite source.

For more information regarding the datasets, please refer to section 2.2 Data in the corresponding article.
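Before training, it can also help to confirm what “imbalanced” means for your copy of the data. The sketch below is not part of the tutorial pipeline; it assumes only that the csv has the `Class` label column used throughout this notebook:

```python
import pandas as pd

def class_balance(df, label_col="Class"):
    """Return the fraction of rows in each class, to check for imbalance."""
    return df[label_col].value_counts(normalize=True).to_dict()

# Small stand-in frame for illustration; in practice, replace with
# pd.read_csv("train-final-raw.csv") after uploading the file.
demo = pd.DataFrame({"Class": [1, 1, 1, 0]})
print(class_balance(demo))  # → {1: 0.75, 0: 0.25}
```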

Show code
#Upload the files
from google.colab import files
import os

# Define fixed upload directory
upload_dir = "/content/"

# Create directory if it doesn't exist
os.makedirs(upload_dir, exist_ok=True)

# Prompt user to upload files
uploaded = files.upload()

# Save uploaded files to the fixed directory
for filename, data in uploaded.items():
    filepath = os.path.join(upload_dir, filename)
    with open(filepath, "wb") as f:
        f.write(data)
    print(f"Saved: {filepath}")

print(f"\nAll files saved to: {upload_dir}")

6.1.5 How to Run This Notebook on Local

This notebook can be run locally on your machine IF you have a GPU and are at an intermediate Python programming level; otherwise you will need to use Google Colab. We suggest all users first use Google Colab, as the environment can be tricky to set up on local machines and it is already set up here for you. The figures and numbers in the accompanying text will match the model files available for preloading with Colab. If you retrain the models (as is necessary within the Jupyter option), you will get slightly different statistics and figures compared to the text, due to some internal randomness in Keras.

1. Download to your local computer:
   1. the provided yml file
   2. the provided csv files that contain the data needed
2. Create your anaconda environment using the yml file
3. Import the anaconda environment into Jupyter using these commands in the anaconda prompt:
   1. conda install -c anaconda ipykernel
   2. python -m ipykernel install --user --name=firstEnv
4. Update the pathway to the directory containing all csv files below
5. Set colab_jupyter_flag to “j”

⭐ Decision point: Would you like to run this code in Colab or in a Jupyter notebook on your local machine?

Show code
# Set this flag to "c" for running in Colab or "j" for running in Jupyter
colab_jupyter_flag = 'c'

# Update data directory to where you have stored the csv files
# Example data_dir = r"C:\Users\kzammit\Documents\DL-chapter\test-20251014"
if colab_jupyter_flag == 'j':
  data_dir = r"UPDATE"

6.2 Frequently Asked Questions

This section contains more in-depth explanations of the code itself and how to work with it.

How do I upload the supporting files (input data, pre-trained models, etc.) to the Colab environment?

1. Once the notebook has loaded, on the left-hand side of your screen will be a list of icons, the bottom one being a folder icon labeled “Files”. Click there to open the Files menu.
2. Identify the directory labeled “content” and drag and drop any files (input data, saved model files, etc.) needed for the run here.
3. When you first open the menu, “content” may not be visible; click on the directory labeled “..” to reveal more directories.

Where do I find the file path to assign to the variable “data_dir”?

1. Navigate to the directory containing the input files using the file menu to the left.
2. Hover the cursor over the directory you need the path to point to and click the three dots that appear to the right of the directory name.
3. When the drop-down menu appears, click “Copy path”.
4. Navigate to the box where “data_dir” is set, under the section How to Run This Notebook on Local, and use the keyboard command Ctrl+V to paste within the quotes after the r. The keyboard shortcut MUST be used; the right-click drop-down will not have a paste option.

Can I use this code with different input datasets than those provided?

Yes! We encourage it. Please refer to section 2.2 of the accompanying chapter for a detailed description on how we generated the datasets that have been provided as the sample datasets.

How long will this code take to run?

In general, this code will take around 2 hours to run to completion in the Google Colab environment. Run time in local environments will vary based on the computing capabilities of the machine. Training Case Study A Model Three and Case Study B Model One are two of the longer steps, at between 30 and 50 minutes each. The SHAP figures can also take up to 30 minutes to render.

How do I use the “Help Boxes”?

Throughout the tutorial are markdown cells titled Help Box. Each Help Box gives directions for skipping the most time-intensive steps of the tutorial by loading a pretrained model. As long as the file paths associated with these help boxes are set up correctly and the “read_in_file” variable is set to True, the user can run the code from the beginning as normal; no additional actions are required to skip model training.

Ran out of GPU time?

You may receive an error message from Colab saying you have run out of your allotted GPU time. This notebook does not require a GPU connection to run. On the top right-hand side of the notebook there is an arrow next to the words RAM and Disk. Clicking the arrow opens a drop-down with the option “Change runtime type”. Opening that menu shows an option for Hardware accelerator; select the CPU option and hit Save. You should no longer receive GPU run time error messages.

Crashed?

Some users may encounter runtime timeouts or crashes near the end of the notebook when using Google Colab. If this occurs, we recommend using the Jupyter version of the notebook or utilizing the provided pretrained model files when working in Colab.

6.2.1 Running this Notebook without Training Models

Although the primary purpose of this code is to train a deep learning model from scratch, we recognize that a user may not wish to retrain their models every time they start a new session. Included in this code are directions for how to export a trained model so that it is saved after a run is exited, as well as how to modify the code to import these model files and bypass the code blocks that initiate model training. For convenience, we have included sample model files so that a user exploring this tutorial for the first time may bypass model training altogether. However, turning off model training is not the recommended way to interact with the tutorial. The outputs generated by Keras as it trains models are informative and offer valuable context to deepen your understanding of the models themselves.

⭐Decision point: Would you like to run this code from “scratch” or with a pretrained model?

Show code
# Loading pretrained model files
# The provided model files are exported from Colab and do not run locally so we have forced
# the flag to be False if running in Jupyter
# Once you have run from scratch locally once, feel free to switch this to "True" as the exported
# model files are compatible with Jupyter
if colab_jupyter_flag == 'j':
    read_in_file = False
else:
  # In Colab, set this to either True or False
    read_in_file = True
    # Define fixed upload directory
    upload_dir = "/content/"

    # Create directory if it doesn't exist
    os.makedirs(upload_dir, exist_ok=True)

    # Prompt user to upload files
    uploaded = files.upload()

    # Save uploaded files to the fixed directory
    for filename, data in uploaded.items():
        filepath = os.path.join(upload_dir, filename)
        with open(filepath, "wb") as f:
            f.write(data)
        print(f"Saved: {filepath}")

    print(f"\nAll models saved to: {upload_dir}")

6.3 Colab Specific Installations

Do not run these if running locally in a Jupyter notebook.

Show code
# Google colab will update its default version of Python without notice. Use this box to check the current version of Python.
# Included below are the steps to force the notebook back into Python 3.10, the last version confirmed stable for this code.
if colab_jupyter_flag == 'c':
    !python --version
Show code
# Click on the box where it says selection and choose 1 for Python 3.10
if colab_jupyter_flag == 'c':
    !update-alternatives --config python3
Show code
# Run this cell to confirm the notebook is now running Python 3.10
if colab_jupyter_flag == 'c':
    !python --version
Show code
# Must reinstall pip in the new python
# Note: It is expected to get warning messages from this cell (ex. WARNING: The scripts pip, pip3 and pip3.10 are installed in '/root/.local/bin' which is not on PATH.
# Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.)
if colab_jupyter_flag == 'c':
    !apt-get install python3-pip
    !python3 -m pip install --upgrade pip --user
Show code
# Confirm it worked. The tutorial requires a Python 3.9 or 3.10 environment; setup may fail if Python 3.11+ is being called.
if colab_jupyter_flag == 'c':
    !python --version
    !pip --version
Show code
if colab_jupyter_flag == 'c':
  ! pip install tensorflow_probability==0.24.0
  ! pip install tf_keras
  ! pip install tfp-nightly
  ! pip install tf-nightly
  ! pip install tf-keras-nightly
  ! pip install shap

6.3.1 Import Statements

From here onwards, everything is compatible in both Jupyter and Colab.

Show code
import pandas as pd
import numpy as np
import tensorflow as tf
import tf_keras
import tensorflow_probability as tfp
from keras.utils import plot_model
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from matplotlib.lines import Line2D
from matplotlib.colors import LinearSegmentedColormap
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, PrecisionRecallDisplay, precision_recall_curve
from scipy.stats import ttest_ind
import itertools
import importlib
import os
import shap
import plotly.graph_objects as go
import random

# Set random seed for reproducibility
tf.random.set_seed(42)
random.seed(42)
np.random.seed(42)

# Suppress tensorflow warnings for deprecated versions
import warnings
warnings.filterwarnings('ignore')
Show code
# Print the version of the nonstandard versions for use in this tutorial (note "dev" in the name)
print("TensorFlow version:", tf.__version__)
print("TensorFlow Probability version:", tfp.__version__)
print("TensorFlow keras version:", tf_keras.__version__)

6.4 Case Study A

Case Study A: Pre-screening VIIRS active fire data to highlight potential false positives using BNN.

This study uses a point prediction BNN model, which outputs a single probability score for each detection, indicating the likelihood it is a true positive. Applying a classification threshold to this score yields a binary indicator (true/false positive), primarily for use in pre-screening VIIRS data to identify and remove likely false positives.
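As a minimal illustration of this pre-screening idea (not the model itself), applying a classification threshold to a vector of per-detection probability scores might look like the following sketch; the function name is ours, for illustration only:

```python
import numpy as np

def screen_detections(probs, threshold=0.5):
    """Convert per-detection probability scores into binary flags.

    probs: array-like of model outputs in [0, 1], one per detection.
    Returns 1 (likely true positive) where prob >= threshold, else 0
    (likely false positive, a candidate for removal during pre-screening).
    """
    return (np.asarray(probs) >= threshold).astype(int)

print(screen_detections([0.9, 0.3, 0.5], threshold=0.5))  # → [1 0 1]
```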

Goal: The goal of the supervised Bayesian neural network developed in Case Study A is to predict whether a hotspot identified by a satellite-based active fire detection product corresponds to an actual fire on the ground. Note that the training and validation loss will be plotted individually after training each model, but all results (loss curves, confusion matrices, etc.) are presented at the end of the case study in the “Results” section.

6.4.1 Define epochs

Show code
# Define the number of epochs to train over for each run
epochs = 30

6.4.2 Model One

Model One closely follows the original Keras tutorial and does not include any hyperparameter tuning. You do not need to run the original Keras tutorial as a prerequisite to this notebook, as we repeat the necessary information here. It uses “raw” input data, defined here as data obtained directly from FIRMS, with no preprocessing beyond the basic steps outlined in Section 2.2. The purpose of this model is to serve as a baseline, demonstrating model performance without additional optimization or customization.

6.4.2.1 Importing and preparing the data

The notebook expects that the data has been downloaded from source and has undergone basic processing steps before beginning this tutorial. See Section 2.2 for further details. We used an 80/20 ratio to split the complete input dataset into separate training and testing datasets, respectively.

Show code
def create_tr_test_data(data_file):
    """
    Split data into training and validation datasets
    :param data_file: file path to csv file, ex. 'train-final-raw.csv' if you've uploaded locally
    :return: X train, X test, y train, y test, feature names (used in model creation)
    """

    # Define raw data file and read
    train_data = pd.read_csv(data_file)

    print(train_data.head())

    # Drop the target variable
    X = train_data.drop(columns=['Class'])

    # Set the target variable
    y = train_data['Class']

    # Create the training and testing datasets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    feature_names = X_train.columns

    X_train = X_train.to_numpy()
    X_test = X_test.to_numpy()
    y_train = y_train.to_numpy()
    y_test = y_test.to_numpy()

    return X_train, X_test, y_train, y_test, feature_names
Show code
# Note: You must add your data as described in the beginning sections:
# "How to Run This Notebook on Local" and "How to Run This Notebook on Colab"
if colab_jupyter_flag == 'j':
  cs1_m1_data = os.path.join(data_dir, 'train-final-raw.csv')
else:
  cs1_m1_data = '/content/train-final-raw.csv'
X_train_c1m1, X_test_c1m1, y_train_c1m1, y_test_c1m1, feature_names_c1m1 = create_tr_test_data(cs1_m1_data)
print("Preview of provided data to confirm successful upload")
train_size_c1m1 = X_train_c1m1.shape[0]

6.4.2.2 Create and train model

All of the model training criteria defined in Section 3.3 of the manuscript are set in this section of the code.

Show code
def prior(kernel_size, bias_size, dtype=None):
    """
    Define the prior distribution (not tunable)
    :param kernel_size: defined by model
    :param bias_size: defined by model
    :param dtype: defined by model
    :return: prior model
    """

    n = kernel_size + bias_size
    prior_model = tf_keras.Sequential(
      [
          tfp.layers.DistributionLambda(
              lambda t: tfp.distributions.MultivariateNormalDiag(
                  loc=tf.zeros(n), scale_diag=tf.ones(n)
              )
          )
      ]
    )
    return prior_model
Show code
def posterior(kernel_size, bias_size, dtype=None):
    """
    Define the posterior distribution (tunable)
    :param kernel_size: defined by model
    :param bias_size:
    :param dtype:
    :return: posterior model
    """

    n = kernel_size + bias_size
    posterior_model = tf_keras.Sequential(
      [
          tfp.layers.VariableLayer(
              tfp.layers.MultivariateNormalTriL.params_size(n), dtype=dtype
          ),
          tfp.layers.MultivariateNormalTriL(n),
      ]
    )
    return posterior_model
Show code
def create_model_inputs(feature_names):
    """
    Create inputs of tensorflow float type 32 for model input
    :return: inputs
    """
    inputs = {}
    for feature_name in feature_names:
      inputs[feature_name] = tf_keras.Input(name=feature_name, shape=(1,), dtype=tf.float32)
    return inputs
Show code
def create_bnn_model(train_size, prior_func, feature_names, activation_fun = "relu", unit_dim = 8):
    """
    Create bayesian neural network model
    :param train_size: number of samples in training dataset
    :param prior_func: name of prior function (posterior function never changes in this example)
    :param activation_fun: activation function for hidden layers
    :param unit_dim: number of units in hidden layers
    :return: model
    """

    inputs = create_model_inputs(feature_names)

    # Create one keras tensor for all inputs
    features = tf_keras.layers.concatenate(list(inputs.values()))
    features = tf_keras.layers.BatchNormalization()(features)
    hidden_units = [unit_dim,unit_dim]

    for units in hidden_units:
      features = tfp.layers.DenseVariational(
          units=units,
          make_prior_fn=prior_func,
          make_posterior_fn=posterior,
          kl_weight=1 / train_size,
          activation=activation_fun,
      )(features)

    # Dense layer with 2 units (one per class), carried over from the original
    # tutorial; it is not connected to the output below, so it is unused here
    distribution_params = tf_keras.layers.Dense(units=2)(features)

    outputs = tf_keras.layers.Dense(1, activation="sigmoid")(features)

    model = tf_keras.Model(inputs=inputs, outputs=outputs)
    return model
Show code
def run_experiment(model, loss, metrics, epochs, feature_names, X_train, y_train):
    """
    Run the experiment
    :param model: defined model to train
    :param loss: loss function to use during training
    :param metrics: metrics to output during training
    :param epochs: number of epochs
    :param feature_names: feature names used in model creation
    :param X_train: training data
    :param y_train: training data labels
    :return: model
    """

    # Compile the model
    model.compile(optimizer='adam', loss=loss, metrics=metrics)

    X_train_format = {str(feature_names[i]): X_train[:, i] for i in range(X_train.shape[1])}

    # Train the model
    history = model.fit(
        x=X_train_format,
        y=y_train,
        batch_size=32,
        epochs=epochs,
        validation_split = 0.2,
        verbose=1
    )

    train_loss = history.history['loss']
    val_loss = history.history['val_loss']

    return model, train_loss, val_loss

6.4.2.3 Help Box - Skip model training

By setting the variable “read_in_file” to True at the beginning of the code, the user activates sections of the code that allow the time-intensive model training steps to be skipped. It is assumed that if a user is activating these sections of the code, they have uploaded the appropriately named files to the same directory that the input data is stored in. File names are set by default to call the sample files that have been provided with this tutorial.

Show code
if read_in_file:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model1_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m1.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m1.weights.h5"
    training_output_file = "casestudy1_model1_df.csv"

  df_c1m1 = pd.read_csv(training_output_file)
  reconstructed_model = create_bnn_model(train_size_c1m1, prior, feature_names_c1m1,
                                            activation_fun = 'relu', unit_dim = 8)
  reconstructed_model.load_weights(model_file_name)
  trained_model_c1m1 = reconstructed_model
  train_loss_c1m1 = df_c1m1['train_loss']
  val_loss_c1m1 = df_c1m1['val_loss']
Show code
# Compile an untrained model
model_c1m1 = create_bnn_model(train_size_c1m1, prior, feature_names_c1m1, activation_fun = 'relu', unit_dim = 8)
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False")
else:
  # Train model
  # input order: model, loss, metrics, epochs, X_train, X_val, y_train, y_val
  trained_model_c1m1, train_loss_c1m1, val_loss_c1m1 = run_experiment(model_c1m1, 'binary_crossentropy',
    [['accuracy', tf_keras.metrics.Precision(), tf_keras.metrics.Recall()]], epochs,
                                            feature_names_c1m1, X_train_c1m1, y_train_c1m1)

6.4.2.4 Help Box - Saving model files

By default, once a model has been trained, the notebook will export the model file and outputs from its training runs to files saved within the working directory. If the notebook is being run on Colab, these files WILL NOT be saved once the Colab run is exited; if a user wishes to retain these files, download them to a local directory. If the user does not care about these outputs, they will automatically be deleted when the notebook is closed. If the user is running locally on Jupyter, the files will have to be deleted by hand.

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. Therefore no model files will be exported, as they would just be replicates of the loaded files")
else:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model1_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m1.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m1.weights.h5"
    training_output_file = "casestudy1_model1_df.csv"

  model_c1m1.save_weights(model_file_name)
  df_c1m1 = pd.DataFrame({'train_loss': train_loss_c1m1,
                      'val_loss': val_loss_c1m1})
  df_c1m1.to_csv(training_output_file)
Show code
### Should be a redundant cell; leaving in for now just in case
# Save gate for experiment 1 outputs
if colab_jupyter_flag == 'j':
  dir_path = os.path.join(data_dir, 'output')
  os.makedirs(dir_path, exist_ok=True)
  if read_in_file:
      df_c1m1 = pd.read_csv(os.path.join(dir_path, "casestudy1_model1_df.csv"))
      reconstructed_model = create_bnn_model(train_size_c1m1, prior, feature_names_c1m1,
                                            activation_fun = 'relu', unit_dim = 8)
      reconstructed_model.load_weights(os.path.join(dir_path, "model_c1m1.weights.h5"))
      trained_model_c1m1 = reconstructed_model
      train_loss_c1m1 = df_c1m1['train_loss']
      val_loss_c1m1 = df_c1m1['val_loss']
  else:
      model_c1m1.save_weights(os.path.join(dir_path, "model_c1m1.weights.h5"))
      df_c1m1 = pd.DataFrame({'train_loss': train_loss_c1m1,
                      'val_loss': val_loss_c1m1})
      df_c1m1.to_csv(os.path.join(dir_path, "casestudy1_model1_df.csv"))

6.4.2.5 Compute predictions

Show code
def compute_predictions(model, feature_names, X_test, y_test):
    """Generate model predictions on the test data (y_test is unused; kept for a consistent call signature)."""
    predictions = {str(feature_names[i]): X_test[:, i] for i in range(X_test.shape[1])}
    predictions = model(predictions)
    return predictions
Show code
# Generate predictions using the newly trained model to be used in the Results section of the code to compare model outputs
predictions_c1m1 = compute_predictions(trained_model_c1m1, feature_names_c1m1, X_test_c1m1, y_test_c1m1)

6.4.2.6 Visualize training and validation loss

Visualize the training and validation loss for Case Study A Model One.

Show code
def plot_training_loss(feature_names, model, X_val, y_val, train_loss, val_loss, predictions, title):

    """
    Plot the training and validation loss curves
    :param: feature_names: feature names used in model creation
    :param: model: trained model
    :param: X_val: validation data
    :param: y_val: validation data labels
    :param: train_loss: training loss over epochs
    :param: val_loss: validation loss over epochs
    :param: predictions: model predictions (unused here; kept for a consistent call signature)
    :param: title: title of plot
    """
    plt.rcParams.update({ # set visualization values
    "font.size":            10,
    "axes.titlesize":       10,
    "axes.labelsize":       10,
    "xtick.labelsize":      10,
    "ytick.labelsize":      10,
    "legend.fontsize":      10,
    "figure.titlesize":     10})
    color_palette = ["#55C667FF","#20A387FF","#33638DFF","#481567FF"]
    custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)

    fig, (ax1) = plt.subplots(1, 1, figsize=(10,5))

    ax1.plot(train_loss, label='Training Loss', marker='o', color="#33638DFF")
    ax1.plot(val_loss, label='Validation Loss', marker='o', color="#481567FF")
    ax1.set_xlabel('Epochs')
    ax1.set_ylabel('Loss')
    #ax1.set_ylim(0, 1)
    ax1.grid()
    ax1.legend()

    fig.suptitle(title)
    plt.show()
Show code
plot_training_loss(feature_names_c1m1, trained_model_c1m1, X_test_c1m1, y_test_c1m1, train_loss_c1m1, val_loss_c1m1, predictions_c1m1,'Case Study A Model One')

6.4.3 Model Two

For the second model, we expanded on the tutorial code by including more advanced data preprocessing functions. When applying machine learning methods, it is crucial to understand the characteristics of the input data, including any inherent limitations and biases. In Model One, no effort was made to balance the classes, which can significantly impair the network’s ability to learn the key features needed to distinguish between true and false positives (Lee et al., 2016). In Model Two, we address this issue by balancing the dataset before training the model. To achieve this, we apply random undersampling to the original dataset, ensuring balance not only between classes (true and false positives) but also between satellite sources (NOAA-20 and Suomi-NPP). This helps prevent the model from learning satellite-specific noise patterns or resolution/angle biases instead of features associated with actual fire detections. Once data preprocessing was complete, model compiling and training were implemented in the same way as the previous model.
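A minimal sketch of random undersampling with pandas is shown below. It is illustrative only, not the exact procedure used to build train-final-balanced.csv; the `Class` column matches this notebook's data, while the grouping-column list is a stand-in for whichever columns encode class and satellite source in your copy of the data:

```python
import pandas as pd

def undersample(df, group_cols, random_state=42):
    """Randomly undersample so every combination of group_cols has equally many rows.

    group_cols could be ["Class"] for class balance alone, or include a
    satellite-source column to balance across satellites as well.
    """
    n_min = df.groupby(group_cols).size().min()  # size of the smallest group
    return pd.concat(
        g.sample(n=n_min, random_state=random_state)  # keep n_min rows per group
        for _, g in df.groupby(group_cols)
    ).reset_index(drop=True)

# Toy example: 6 false positives (0) vs 2 true positives (1)
demo = pd.DataFrame({"Class": [0] * 6 + [1] * 2, "x": range(8)})
balanced = undersample(demo, ["Class"])
print(balanced["Class"].value_counts().to_dict())  # both classes reduced to 2 rows each
```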

6.4.3.1 Create dataset

Show code
# Create X and y train and test data arrays
if colab_jupyter_flag == 'j':
  cs1_m2_data = os.path.join(data_dir, 'train-final-balanced.csv')
else:
  cs1_m2_data = '/content/train-final-balanced.csv'

X_train_c1m2, X_test_c1m2, y_train_c1m2, y_test_c1m2, feature_names_c1m2 = create_tr_test_data(cs1_m2_data)
print("Preview of provided data to confirm successful upload")
train_size_c1m2 = X_train_c1m2.shape[0]

6.4.3.2 Create and train model

6.4.3.3 Help Box - Skip model training

By setting the variable “read_in_file” to True at the beginning of the code, the user activates sections of the code that allow the time-intensive model training steps to be skipped. It is assumed that if a user is activating these sections of the code, they have uploaded the appropriately named files to the same directory that the input data is stored in. File names are set by default to call the sample files that have been provided with this tutorial.

Show code
if read_in_file:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model2_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m2.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m2.weights.h5"
    training_output_file = "casestudy1_model2_df.csv"

  df_c1m2 = pd.read_csv(training_output_file )
  reconstructed_model = create_bnn_model(train_size_c1m2, prior, feature_names_c1m2,
                                            activation_fun = 'relu', unit_dim = 8)
  reconstructed_model.load_weights(model_file_name)
  trained_model_c1m2 = reconstructed_model
  train_loss_c1m2 = df_c1m2['train_loss']
  val_loss_c1m2 = df_c1m2['val_loss']
Show code
model_c1m2= create_bnn_model(train_size_c1m2, prior, feature_names_c1m2, activation_fun = 'relu', unit_dim = 8)
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to false")
else:
  # Train model
  trained_model_c1m2, train_loss_c1m2, val_loss_c1m2 = run_experiment(model_c1m2, 'binary_crossentropy',
    [['accuracy', tf_keras.metrics.Precision(), tf_keras.metrics.Recall()]], epochs,
                                            feature_names_c1m2, X_train_c1m2, y_train_c1m2)

6.4.3.4 Help Box - Saving model files

By default, once a model has been trained, the notebook will export the model file and outputs from its training runs to files saved within the working directory. If the notebook is being run on Colab, these files WILL NOT be saved once the Colab run is exited; if a user wishes to retain these files, download them to a local directory. If the user does not care about these outputs, they will automatically be deleted when the notebook is closed. If the user is running locally on Jupyter, the files will have to be deleted by hand.

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. Therefore no model files will be exported, as they would just be duplicates of the loaded files.")
else:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally in Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model2_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m2.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m2.weights.h5"
    training_output_file = "casestudy1_model2_df.csv"

  model_c1m2.save_weights(model_file_name)
  df_c1m2 = pd.DataFrame({'train_loss': train_loss_c1m2,
                      'val_loss': val_loss_c1m2})
  df_c1m2.to_csv(training_output_file )

6.4.3.5 Compute predictions

Show code
# Generate predictions using the newly trained model to be used in the Results section of the code to compare model outputs
predictions_c1m2 = compute_predictions(trained_model_c1m2, feature_names_c1m2, X_test_c1m2, y_test_c1m2)

6.4.3.6 Visualize training and validation loss

Visualize the training and validation loss for Case Study A Model Two.

Show code
plot_training_loss(feature_names_c1m2, trained_model_c1m2, X_test_c1m2, y_test_c1m2, train_loss_c1m2, val_loss_c1m2, predictions_c1m2, 'Case Study A Model Two')

6.4.4 Model Three

For the third model, we utilized the balanced data that was used in Model Two, and expanded the model-building code beyond what was provided by the tutorial to include hyperparameter tuning in the workflow.

6.4.5 Define epochs

The number of epochs used for training is increased.

Show code
epochs = 100

6.4.5.1 Create and train model

Show code
def prior_tune(kernel_size, bias_size, dtype=None):
    """
    Define tunable prior distribution.
    :param kernel_size: defined by model
    :param bias_size: defined by model
    :param dtype: defined by model
    :return: prior model
    """

    n = kernel_size + bias_size

    prior_model = tf_keras.Sequential(
      [
          tfp.layers.DistributionLambda(
            apply_parameter(fun = active_prior_distribution, var_list = [n])
          )
      ]
    )
    return prior_model


def apply_parameter(fun, var_list):
    out = fun(var_list)
    return out


def distribution_param1(var_list):
    # Standard normal prior: each weight has mean 0 and standard deviation 1
    out = lambda t: tfp.distributions.MultivariateNormalDiag(
                     loc=tf.zeros(var_list[0]), scale_diag=tf.ones(var_list[0]))
    return out


def distribution_param2(var_list):
    # Wider prior: mean 0 but standard deviation 20, allowing weights to vary widely
    out = lambda t: tfp.distributions.MultivariateNormalDiag(
                     loc=tf.zeros(var_list[0]), scale_diag=tf.ones(var_list[0])*20)
    return out


def ParameterGrid(input_dict):
    """
    Build every combination of the given hyperparameter values (grid search).
    :param input_dict: dict mapping parameter names to lists of candidate values
    :return: list of dicts, one per combination
    """
    # Extract the lists corresponding to each key
    values = list(input_dict.values())

    # Use itertools.product to get all combinations of values
    combinations = itertools.product(*values)

    # Create a list of dictionaries from the combinations
    result = []
    for combo in combinations:
        result.append(dict(zip(input_dict.keys(), combo)))

    return result
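As a quick standalone illustration of what ParameterGrid produces, here is a minimal re-implementation applied to a toy dictionary; the parameter names and values are just examples:

```python
import itertools

def parameter_grid(input_dict):
    # Cartesian product of the candidate value lists, one dict per combination
    keys, values = zip(*input_dict.items())
    return [dict(zip(keys, combo)) for combo in itertools.product(*values)]

grid = parameter_grid({"activation": ["relu", "sigmoid"], "hidden_units": [4, 8]})
print(len(grid))   # 4 combinations (2 activations x 2 unit sizes)
print(grid[0])     # {'activation': 'relu', 'hidden_units': 4}
```

With the full grid defined later in this section (2 priors, 5 activations, 3 unit sizes), the same logic yields 30 training runs.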

6.4.5.2 Help Box - Prior distribution explanation

The prior distribution helps control how much uncertainty and flexibility the model has in its weights. distribution_param1 defines a standard normal prior in which each weight has mean 0 and standard deviation 1, indicating that the weights should stay near zero. distribution_param2 does the same but with a standard deviation of 20, indicating that the weights could vary widely.
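To see the practical difference without TensorFlow, the two priors can be imitated with plain NumPy draws; this is only an illustration of the spread of the two distributions, not the model code itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two priors: N(0, 1) vs N(0, 20) on each weight
tight = rng.normal(loc=0.0, scale=1.0, size=100_000)
loose = rng.normal(loc=0.0, scale=20.0, size=100_000)

# The sample standard deviations recover the two prior scales
print(f"tight prior std: {tight.std():.2f}")
print(f"loose prior std: {loose.std():.2f}")
```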

Show code
prior_distribution_dic = {'default': distribution_param1,
                          'stretch': distribution_param2}

BNN_activation_function_dic = {'default': 'relu',
                               'sigmoid': 'sigmoid',
                               'gelu':'gelu',
                               'ELU': 'elu',
                               'Leaky relu':'leaky_relu'}

BNN_hiddenunits_dim_dic = {'smaller': 4,
                           'default': 8,
                           'larger': 16}

# packed into one object, needed values (not keys for the parameteric grid)
param_grid = {
    "dist":list(prior_distribution_dic.values()),
    "activation":list(BNN_activation_function_dic.values()),
    # "class_id":list(class_id_threshold_dic.values()),
    "hidden_units":list(BNN_hiddenunits_dim_dic.values())
}

6.4.5.3 Help Box - Skip model training and tuning

This help box allows the user to skip both the training and the parameter tuning steps. In order to read in a saved model file, the user must pre-define the structural hyperparameters of the model being loaded. The hyperparameters selected for the given sample data set may differ from those that would be selected for models trained on different input data. A comment in the code indicates the line that must be updated to match the hyperparameters selected for the new dataset in order for the model files to load properly.

By setting the variable "read_in_file" to True at the beginning of the code, the user activates the sections of code that allow the time-intensive model training steps to be skipped. It is assumed that a user activating these sections has uploaded the appropriately named files to the same directory where the input data is stored. File names are set by default to the sample files provided with this tutorial.

Show code
if read_in_file:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally in Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model3_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m3.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m3.weights.h5"
    training_output_file = "casestudy1_model3_df.csv"

  df_c1m3 = pd.read_csv(training_output_file)
  # The following hyperparameters were selected by tuning on the sample data.
  # If running with your own dataset, update these hyperparameters to match the tuning results for that dataset
  reconstructed_model = create_bnn_model(train_size_c1m2, prior, feature_names_c1m2, activation_fun = 'sigmoid', unit_dim = 4)
  reconstructed_model.load_weights(model_file_name)
  trained_model_c1m3 = reconstructed_model
  train_loss_c1m3 = df_c1m3['train_loss']
  val_loss_c1m3 = df_c1m3['val_loss']

Estimated run time on CPU: 45 minutes

Show code
models_c1m3 = []
train_losses_c1m3 = []
val_losses_c1m3 = []
params_combo = []
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  grid = list(ParameterGrid(param_grid))
  total_runs = len(grid)
  for i, params in enumerate(grid, start =1): # creates all combinations of dictionary that it is given (grid search), then can iterate through the combinations
      runs_left = total_runs - i
      print(f"[{i}/{total_runs} runs] Running with parameters: {params}")


      active_prior_distribution = params["dist"]

      model = create_bnn_model(train_size_c1m2, prior_tune, feature_names_c1m2,
                           activation_fun = params['activation'],
                           unit_dim = params['hidden_units'])

      model, train_loss, val_loss = run_experiment(model, 'binary_crossentropy', [['accuracy', tf_keras.metrics.Precision(), tf_keras.metrics.Recall()]], epochs,
                                              feature_names_c1m3, X_train_c1m3, y_train_c1m3)

      models_c1m3.append(model)
      train_losses_c1m3.append(train_loss)
      val_losses_c1m3.append(val_loss)
      params_combo.append(params)
      print(f'Finished run {i}/{total_runs}. Runs remaining: {runs_left}')

6.4.5.4 Evaluate Hyperparameters

The different models generated during hyperparameter tuning will be qualitatively evaluated using a Parallel Coordinates Plot (PCP), a scatter plot comparing training loss to training validation, and a SHapley Additive exPlanations (SHAP) kernel explainer.

Show code
# This code block outputs a parallel coordinates plot for Model Three.

if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  df = pd.DataFrame(params_combo)

  # Calculate mean train and validation losses
  df["train_loss"] = [np.mean(loss) for loss in train_losses_c1m3]  # Averaging over epochs
  df["val_loss"] = [np.mean(loss) for loss in val_losses_c1m3]

  # Drop rows with missing values
  df.dropna(inplace=True)

  # Initialize an empty dictionary to store encoding maps for each column
  encoding_maps = {}

  # Loop through each categorical column that needs to be encoded
  for col in ["dist", "activation", "hidden_units"]:
    # Create a mapping from each unique value in the column to a unique integer index
    encoding_maps[col] = {val: idx for idx, val in enumerate(df[col].unique())}

    # Replace the original string values in the DataFrame with their corresponding integer encodings
    df[col] = df[col].map(encoding_maps[col])

  fig = go.Figure(data=
      go.Parcoords(
          line = dict(color = df['val_loss'],
                   colorscale = [[0,'purple'],[0.5,'lightseagreen'],[1,'gold']],),
          dimensions = list([
              dict(range = [0,int(len(param_grid["dist"])-1)],
                label = 'Dist', values = df['dist'], tickvals = [0,1], ticktext = ['Default', 'Stretch']),
              dict(range = [0,int(len(param_grid["activation"])-1)],
                label = 'Activation', values = df['activation'], tickvals = [0,1,2,3,4], ticktext = param_grid["activation"]),
              dict(range = [0,int(len(param_grid["hidden_units"])-1)],
                label = 'Hidden Units', values = df['hidden_units'], tickvals = [0,1,2], ticktext = param_grid["hidden_units"]),
              dict(range = [0,1.6],
                label = 'Train Loss', values = df['train_loss']),
              dict(range = [0,1.6],
                label = 'Val Loss', values = df['val_loss'])
          ])
      )
  )

  fig.update_layout(
      # Font increased from 21
      font=dict(family="Arial, sans-serif", size=25),
      plot_bgcolor = 'white',
      paper_bgcolor = 'white')

  fig.show()
Show code
# This code block outputs the different training and validation losses for each trained model on a scatterplot.

if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  # Prepare figure & axes
  fig, ax = plt.subplots(figsize=(10, 6))
  # Drop rows with missing values
  df.dropna(inplace=True)

  # select colors
  color_palette = ["#481567FF", "#33638DFF", "#20A387FF", "#55C667FF", "#DCE319FF"]

  # Plot
  sns.scatterplot(ax=ax, data=df,x="train_loss",y="val_loss",hue="activation", style="dist", size="hidden_units", sizes=(50, 300), palette= color_palette, alpha=0.8, legend=False)

  #Diagonal line
  min_loss = min(df["train_loss"].min(), df["val_loss"].min())
  max_loss = max(df["train_loss"].max(), df["val_loss"].max())
  ax.plot([min_loss, max_loss], [min_loss, max_loss], color="gray", linestyle="--", linewidth=1.5, label="validation = train")


  activation_keys    = list(BNN_activation_function_dic.keys())
  dist_keys          = list(prior_distribution_dic.keys())
  hiddenunits_keys   = list(BNN_hiddenunits_dim_dic.keys())

  # color pallete
  palette = sns.color_palette(color_palette, len(activation_keys))

  # markers for the dist_keys
  markers = ["o","X","P"][:len(dist_keys)]

  # sizes for hidden_units
  unit_vals = [BNN_hiddenunits_dim_dic[k] for k in hiddenunits_keys]
  minu, maxu = min(unit_vals), max(unit_vals)
  def map_size(u):
      return 50 + (u - minu) / (maxu - minu) * (300-50)

  size_proxies = [
      Line2D(
        [0],[0],
        color="gray",
        marker="o",
        linestyle="None",
        markersize=np.sqrt(map_size(unit_vals[i])),
        label=f"{hiddenunits_keys[i]} ({unit_vals[i]} units)"
      )
      for i in range(len(hiddenunits_keys))
  ]

  hue_proxies = [
      Patch(color=palette[i], label=activation_keys[i])
      for i in range(len(activation_keys))
  ]

  style_proxies = [
      Line2D([0],[0], color="black",  marker=markers[i], linestyle="None", markersize=10, label=dist_keys[i])
      for i in range(len(dist_keys))
  ]

  # legend of all the parameters
  ax.legend(
      handles=hue_proxies + style_proxies + size_proxies + [Line2D([0], [0], color="gray", linestyle="--", label="validation = train")],
      title="Activation / Dist / Hidden Units",
      loc = "upper left",
      bbox_to_anchor=(1.02, 1),
      borderaxespad=0
  )
  ax.set_xlabel("Training Loss", fontsize =  18)
  ax.set_ylabel("Validation Loss", fontsize =  18)
  ax.set_title("Training vs. Validation Loss\n(color=activation, shape=dist, size=hidden_units)")

  fig.subplots_adjust(right=0.75)
  plt.tight_layout()
  plt.show()
Show code
# This code block outputs a SHapley Additive exPlanations (SHAP) plot
if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  #SHAP
  # Index of the run with the lowest mean validation loss
  best_index = df["val_loss"].idxmin()



  #BNN wrapper
  def bnn_predict_fn(model, X, feature_names, n_samples=100):
      # Ensure X is a numpy array
      X = np.array(X)
      examples = {str(feature_names[i]): X[:, i] for i in range(X.shape[1])}

      predicted = []
      for _ in range(n_samples):
        predicted.append(model(examples).numpy())
      return np.concatenate(predicted, axis=1)

  #plot color
  plt.rcParams.update({
      "font.size":        10,
      "axes.titlesize":   10,
      "axes.labelsize":   10,
      "xtick.labelsize":  10,
      "ytick.labelsize":  10,
      "legend.fontsize":  10,
      "figure.titlesize": 10,
  })
  color_palette = ["#DCE319FF", "#55C667FF","#20A387FF","#33638DFF","#481567FF"]
  custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)
  # Use a subset of X_train as background data
  background = X_train_c1m2[:50]

  # Select and review the "best" model
  best_model_to_explain = models_c1m3[best_index]

  # Pass the model itself and the feature names to the wrapper
  shap_explainer = shap.KernelExplainer(lambda x: bnn_predict_fn(best_model_to_explain, x, feature_names_c1m2), background)

  # Compute SHAP values for test data
  X_test_subset = X_test_c1m2[101:151]
  shap_values = shap_explainer.shap_values(X_test_subset, nsamples=500)
  shap_values_averaged = np.mean(shap_values, axis=2)

  # Visualize the results
  shap.summary_plot(shap_values_averaged, X_test_subset, feature_names=feature_names_c1m2, cmap=custom_cmap)

6.4.5.5 Model selection

The hyperparameter combination that produced the model with the lowest validation loss was selected for evaluation against the models generated from the other runs.
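The selection logic amounts to a pandas `idxmin` over the mean validation losses; a toy example with made-up loss values:

```python
import pandas as pd

# Hypothetical tuning results: one row per hyperparameter combination
df_toy = pd.DataFrame({
    "activation": ["relu", "sigmoid", "gelu"],
    "val_loss":   [0.41, 0.35, 0.48],
})

# idxmin returns the row index of the smallest validation loss
best_index = df_toy["val_loss"].idxmin()
print(best_index, df_toy.loc[best_index, "activation"])  # prints: 1 sigmoid
```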

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  trained_model_c1m3 = models_c1m3[df["val_loss"].idxmin()]
  train_loss_c1m3 = train_losses_c1m3[df["val_loss"].idxmin()]
  val_loss_c1m3 = val_losses_c1m3[df["val_loss"].idxmin()]

6.4.5.6 Help Box - Saving model files

By default, once a model has been trained the notebook exports the model file and the outputs from its training runs to files saved within the working directory. If the notebook is being run on Colab these files WILL NOT persist after the Colab session is exited; if you wish to retain them, download them to a local directory. If you do not care about these outputs, they will be deleted automatically when the notebook is closed. If you are running locally in Jupyter, the files will have to be deleted by hand.

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. Therefore no model files will be exported, as they would just be duplicates of the loaded files.")
else:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally in Jupyter
    training_output_file = os.path.join(data_dir, "casestudy1_model3_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c1m3.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c1m3.weights.h5"
    training_output_file = "casestudy1_model3_df.csv"

  trained_model_c1m3.save_weights(model_file_name)
  df_c1m3 = pd.DataFrame({'train_loss': train_loss_c1m3,
                      'val_loss': val_loss_c1m3})
  df_c1m3.to_csv(training_output_file)

6.4.5.7 Compute predictions

Show code
# Generate predictions using the newly trained model to be used in the Results section of the code to compare model outputs
predictions_c1m3 = compute_predictions(trained_model_c1m3, feature_names_c1m2, X_test_c1m2, y_test_c1m2)

6.4.5.8 Visualize training and validation loss

Visualize the training and validation loss for Case Study A Model Three.

Show code
plot_training_loss(feature_names_c1m2, trained_model_c1m3, X_test_c1m2, y_test_c1m2, train_loss_c1m3, val_loss_c1m3, predictions_c1m3, 'Case Study A Model Three')

6.4.6 Results (All Models)

Class label threshold values were optimized for each model. This was done by generating class labels using a range of threshold values from 0.25 to 0.75. Threshold value versus performance metrics was plotted on a multi-line plot, and the threshold value that maximized the most metrics for a model was selected. An accuracy assessment was implemented for every model, producing accuracy, precision, recall, and F1 scores for each set of labeled predictions. Using these optimized threshold values, confusion matrices were generated and final accuracy, precision, and recall scores were reported. A precision-recall curve was also generated for each model, so the choice of epoch count for model training could be qualitatively evaluated.
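The threshold sweep can be sketched in a few lines; the scores below are made-up prediction probabilities, but the metric arithmetic matches what the accuracy assessment computes at each threshold:

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.4, 0.6, 0.3, 0.2, 0.55, 0.8, 0.1])

def metrics_at(threshold):
    # Label a detection as fire when its score reaches the threshold
    y_pred = (scores >= threshold).astype(int)
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall    = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Raising the threshold trades recall for precision
for t in (0.25, 0.5, 0.75):
    print(t, metrics_at(t))
```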

Show code
def analyze_threshold_value(df_all_stats):
  # Plot all metrics in one figure
  fig, ax = plt.subplots(figsize=(10, 5))
  palette = {
    'accuracy': '#33638DFF',
    'precision': '#481567FF',
    'recall':  '#20A387FF',
    'f1':  '#DCE319FF'
  }
  sns.set_style("whitegrid")
  sns.set_context("paper", font_scale=1.2)

  sns.lineplot(data=pd.melt(df_all_stats, id_vars=['threshold','model'], value_vars=['accuracy','precision', 'recall', 'f1']),
              x='threshold', y='value', hue='variable', style='model', palette = palette)



  # Titles & labels
  plt.title("Precision, Recall, F1 across Thresholds for All Models")
  plt.xlabel("Threshold", fontsize = 18)
  plt.ylabel("Score", fontsize = 18)
  ax.grid(True)

  # Legend
  ax.legend(title="Metric / Model",loc='upper left', bbox_to_anchor=(1.02, 1),borderaxespad=0)

  fig.tight_layout(rect=[0, 0, 0.8, 1])

  plt.grid(True)
  plt.show()
Show code
# @title
def compute_results_grid(feature_names, model, predictions, y_val, title):
  """
  Compute classification statistics (accuracy, precision, recall, F1) for the trained model
  :param feature_names: feature names used in model creation
  :param model: trained model
  :param predictions: model prediction scores on the test data
  :param y_val: test data labels
  :param title: model name used to label the output rows
  :return: dataframe of statistics at each threshold
  """

  # NOTE: This is necessary to avoid issues with RAM, for full results please remove these subsetting rows
  #predictions = predictions[0:1000]
  #y_val = y_val[0:1000]

  # create precision/recall curve plots
  thresholds = [0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75]

  df_classifications = pd.DataFrame()
  df_stats = pd.DataFrame(columns=['threshold', 'accuracy','precision', 'recall', 'f1'])

  for thresh in thresholds:
      # predicted, TP, TN, FP, FN = compute_metrics(y_val, predictions, thresh)
      prediction_label = np.where(predictions >= thresh, 1, 0)
      df_group = pd.DataFrame()

      df_group['label'] = y_val[:]
      # df_group['predicted'] = predicted[:]
      df_group['score'] = predictions[:]
      df_group['threshold'] = thresh

      accuracy = accuracy_score(y_val, prediction_label)
      #print(accuracy)
      precision = precision_score(y_val, prediction_label)
      #print(precision)
      recall = recall_score(y_val, prediction_label)
      #print(recall)
      try:
        f1 = (2 * float(precision) * float(recall)) / (float(precision) + float(recall))
      except ZeroDivisionError:
        f1 = 0

      df_stats.loc[len(df_stats)] = [thresh, accuracy, precision, recall, f1]
  df_stats['model'] = title
  return df_stats
Show code
# Combine all stats into one DataFrame to plot all of them
stats_c1m1 = compute_results_grid(feature_names_c1m1, trained_model_c1m1, predictions_c1m1, y_test_c1m1, 'CaseAModel1')
stats_c1m2 = compute_results_grid(feature_names_c1m2, trained_model_c1m2, predictions_c1m2, y_test_c1m2, 'CaseAModel2')
stats_c1m3 = compute_results_grid(feature_names_c1m2, trained_model_c1m3, predictions_c1m3, y_test_c1m2, 'CaseAModel3')
df_all_stats = pd.concat([stats_c1m1, stats_c1m2, stats_c1m3])
analyze_threshold_value(df_all_stats)
Show code
def plot_training_losses_all(runs):
    """
    Plot the training and validation loss curves for all three models in one figure.
    :param runs: list of dicts, each containing 'train_loss', 'val_loss', and 'title'
    """
    color_palette = ["#55C667FF","#20A387FF","#33638DFF","#481567FF"]
    _ = LinearSegmentedColormap.from_list("custom_gradient", color_palette)

    fig, axes = plt.subplots(1, 3, figsize=(18, 5))
    for ax, run in zip(axes, runs):
        train_loss = run["train_loss"]
        val_loss   = run["val_loss"]
        title      = run["title"]

        ax.plot(train_loss, label='Training Loss', marker='o', linewidth=1.5, color="#33638DFF")
        ax.plot(val_loss,   label='Validation Loss', marker='o', linewidth=1.5, color="#481567FF")
        ax.set_xlabel('Epochs', fontsize = 18)
        ax.set_ylabel('Loss', fontsize = 18)
        ax.set_title(title, fontsize=18)
        ax.grid(True, alpha=0.3)
        ax.legend(loc='best')

    fig.tight_layout()
    plt.show()
Show code
runs = [
    {"train_loss": train_loss_c1m1, "val_loss": val_loss_c1m1, "title": "Case Study A Model One"},
    {"train_loss": train_loss_c1m2, "val_loss": val_loss_c1m2, "title": "Case Study A Model Two"},
    {"train_loss": train_loss_c1m3, "val_loss": val_loss_c1m3, "title": "Case Study A Model Three"},
]

plot_training_losses_all(runs)
Show code
def plot_confusion_matrix(models_data):
  """
  Plot the confusion matrix for all three models.
  :param models_data: list of dicts, each containing 'y_true', 'predictions', 'threshold', and 'title'
  """
  color_palette = ["#55C667FF","#20A387FF","#33638DFF","#481567FF"]
  custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)

  fig, axes = plt.subplots(1, 3, figsize=(15, 5))

  for ax, m in zip(axes, models_data):
      y_true = np.asarray(m["y_true"]).astype(int).reshape(-1)
      preds  = m["predictions"]
      if hasattr(preds, "numpy"):  # TF tensor -> numpy
          preds = preds.numpy()
      preds = np.asarray(preds).reshape(-1)

      thr = float(m["threshold"])
      y_pred = (preds >= thr).astype(int)

      cm = confusion_matrix(y_true, y_pred)
      cm_norm = cm / np.sum(cm) if np.sum(cm) > 0 else cm

      sns.heatmap(cm_norm, annot=True, fmt='.2%',  annot_kws={"size": 18}, cmap=custom_cmap, cbar=False, ax=ax)
      ax.set_title(
          f"{m['title']}\nBNN "
          f"Accuracy:{accuracy_score(y_true, y_pred):.3f}, "
          f"Precision:{precision_score(y_true, y_pred, zero_division=0):.3f}, "
          f"Recall:{recall_score(y_true, y_pred, zero_division=0):.3f}",
          fontsize = 12
      )
      ax.set_xlabel('Predicted Label', fontsize = 20)
      ax.set_ylabel('True Label', fontsize = 20)
      ax.set_xticks([0.5, 1.5]); ax.set_xticklabels(['0', '1'])
      ax.set_yticks([0.5, 1.5]); ax.set_yticklabels(['0', '1'])
      ax.tick_params(axis='both', which='major', labelsize=20)
  fig.tight_layout()
  plt.show()

6.4.6.1 Define thresholds

Model One maintained high accuracy, precision, recall, and F1 values across all of the thresholds, so the default value of 0.5 was used. Model Two experiences an increase in precision and accuracy at 0.45, while Model Three experiences an increase at about 0.5.

Show code
thresh_c1m1 = 0.5
thresh_c1m2 = 0.5
thresh_c1m3 = 0.45

models_data = [{"y_true": y_test_c1m1, "predictions": predictions_c1m1, "threshold": thresh_c1m1, "title": "Case Study A Model One"},
    {"y_true": y_test_c1m2, "predictions": predictions_c1m2, "threshold": thresh_c1m2, "title": "Case Study A Model Two"},
    {"y_true": y_test_c1m2, "predictions": predictions_c1m3, "threshold": thresh_c1m3, "title": "Case Study A Model Three"},]

plot_confusion_matrix(models_data)

6.5 Case Study B

Case Study B: Assigning multi-source EO active fire confidence using a probabilistic BNN.

This study uses a probabilistic BNN architecture that outputs a full probability distribution for each detection, not just a single point estimate. The resulting output includes a quantified measure of uncertainty (the standard deviation), which allows for the calculation of a confidence interval around the predicted probability. Furthermore, this case study integrates data from multiple Earth Observation sources (MODIS and VIIRS) to assess the impact of sensor differences on classification uncertainty.
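For a single hotspot, the mean and standard deviation of the predicted distribution translate into a confidence interval as follows; the numbers are illustrative, not model outputs:

```python
# Hypothetical probabilistic output for one detection
mean_prob = 0.82   # predicted probability of a true fire
std_dev   = 0.05   # standard deviation of the output distribution

# Approximate 95% confidence interval, clipped to the valid probability range
z = 1.96
lower = max(0.0, mean_prob - z * std_dev)
upper = min(1.0, mean_prob + z * std_dev)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```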

Goal: The goal of Case Study B is to evaluate the null hypothesis that the uncertainty in the classification of an observed hotspot does not differ significantly with respect to the observation source. Case Study B analyzes fire hotspots detected by the VIIRS 375 m active fire product from the Suomi NPP and NOAA-20 satellites, alongside hotspots from the MODIS active fire product onboard the Aqua and Terra satellites. The data preparation methods were consistent with those of Case Study A, with the primary difference being the inclusion of standard-quality MODIS hotspots in addition to those from VIIRS. Note that instead of plotting results individually after training each model (loss curves, confusion matrices, etc.), we present all results at the end of the case study in the “Results” section.

6.5.1 Model One

Model One uses the same architecture and hyperparameter tuning steps as Case Study A Model Three, but is trained on the updated dataset containing both VIIRS and MODIS hotspots.

6.5.1.1 Create dataset

Show code
# Create X and y train and test data arrays
if colab_jupyter_flag == 'j':
  cs2_m1_data = os.path.join(data_dir, 'train-final-balanced-case-study-2.csv')
else:
  cs2_m1_data = 'train-final-balanced-case-study-2.csv'

X_train_c2m1, X_test_c2m1, y_train_c2m1, y_test_c2m1, feature_names_c2m1 = create_tr_test_data(cs2_m1_data)
train_size_c2m1 = X_train_c2m1.shape[0]

6.5.1.2 Create and train model

6.5.1.3 Help Box - Skip model training and tuning

Additional note if using your own input data: this help box allows the user to skip both the training and the parameter tuning steps. In order to read in a saved model file, the user must pre-define the structural hyperparameters of the model being loaded. The hyperparameters selected for the given sample data set may differ from those that would be selected for models trained on different input data. A comment in the code indicates the line that must be updated to match the hyperparameters selected for the new dataset in order for the model files to load properly.

By setting the variable "read_in_file" to True at the beginning of the code, the user activates the sections of code that allow the time-intensive model training steps to be skipped. It is assumed that a user activating these sections has uploaded the appropriately named files to the same directory where the input data is stored. File names are set by default to the sample files provided with this tutorial.

Show code
if read_in_file:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally in Jupyter
    training_output_file = os.path.join(data_dir, "casestudy2_model1_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c2m1.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c2m1.weights.h5"
    training_output_file = "casestudy2_model1_df.csv"

  df_c2m1 = pd.read_csv(training_output_file)
  # The following hyperparameters were selected by tuning on the sample data.
  # If running with your own dataset, update these hyperparameters to match the tuning results for that dataset
  reconstructed_model = create_bnn_model(train_size_c2m1, prior, feature_names_c2m1, activation_fun = 'sigmoid', unit_dim = 4)
  reconstructed_model.load_weights(model_file_name)
  trained_model_c2m1 = reconstructed_model
  train_loss_c2m1 = df_c2m1['train_loss']
  val_loss_c2m1 = df_c2m1['val_loss']
Show code
if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  models_c2m1 = []
  train_losses_c2m1 = []
  val_losses_c2m1 = []
  params_combo_c2m1 = []
  print(param_grid.keys())
  # epochs is already set to 100 above, so it is not redefined here
  # epochs = 100
  grid = list(ParameterGrid(param_grid))
  total_runs = len(grid)
  for i, params in enumerate(grid, start =1): # creates all combinations of dictionary that it is given (grid search), then can iterate through the combinations
    runs_left = total_runs - i
    print(f"[{i}/{total_runs} runs] Running with parameters: {params}")


    active_prior_distribution = params["dist"]

    model = create_bnn_model(train_size_c2m1, prior_tune, feature_names_c2m1,
                           activation_fun = params['activation'],
                           unit_dim = params['hidden_units'])

    model, train_loss, val_loss = run_experiment(model, 'binary_crossentropy', [['accuracy', tf_keras.metrics.Precision(), tf_keras.metrics.Recall()]], epochs,
                                              feature_names_c2m1, X_train_c2m1, y_train_c2m1)

    models_c2m1.append(model)
    train_losses_c2m1.append(train_loss)
    val_losses_c2m1.append(val_loss)
    params_combo_c2m1.append(params)
    print(f'Finished run {i}/{total_runs}. Runs remaining: {runs_left}')
Show code
if read_in_file:
  print("User has selected to read in saved models instead of training and tuning a novel one. If this is incorrect please go back to the beginning of the code and set the variable 'read_in_file' to False.")
else:
  df = pd.DataFrame(params_combo_c2m1)

  # Calculate mean train and validation losses
  df["train_loss"] = [np.mean(loss) for loss in train_losses_c2m1]  # Averaging over epochs
  df["val_loss"] = [np.mean(loss) for loss in val_losses_c2m1]

  # Drop rows with missing values
  df.dropna(inplace=True)
  trained_model_c2m1 = models_c2m1[df["val_loss"].idxmin()]
  train_loss_c2m1 = train_losses_c2m1[df["val_loss"].idxmin()]
  val_loss_c2m1 = val_losses_c2m1[df["val_loss"].idxmin()]
  tuned_params = params_combo_c2m1[df["val_loss"].idxmin()]
  print(tuned_params)

6.5.1.4 Help Box - Saving model files

By default, once a model has been trained, the notebook will export the model file and the outputs from its training runs to files saved within the working directory. If the notebook is being run on Colab, these files WILL NOT be saved once the Colab session is exited; if you wish to retain them, download them to a local directory. If you do not care about these outputs, they will be deleted automatically when the notebook is closed. If you are running locally in Jupyter, the files will have to be deleted by hand.

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. Therefore no model files will be exported, as they would just be replicates of the loaded files.")
else:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy2_model1_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c2m1.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c2m1.weights.h5"
    training_output_file = "casestudy2_model1_df.csv"

  trained_model_c2m1.save_weights(model_file_name)
  df_c2m1 = pd.DataFrame({'train_loss': train_loss_c2m1,
                      'val_loss': val_loss_c2m1})
  df_c2m1.to_csv(training_output_file)

6.5.1.5 Compute predictions

Show code
predictions_c2m1 = compute_predictions(trained_model_c2m1, feature_names_c2m1, X_test_c2m1, y_test_c2m1)

6.5.1.6 Visualize training and validation loss

Visualize the training and validation loss for Case Study B Model One.

Show code
plot_training_loss(feature_names_c2m1, trained_model_c2m1, X_test_c2m1, y_test_c2m1, train_loss_c2m1, val_loss_c2m1, predictions_c2m1,'Case Study B Model One')

6.5.2 Model Two

Model Two introduces a new architecture, a probabilistic Bayesian neural network (BNN), which outputs predictive distributions rather than single point estimates. From these distributions, we extract the mean as the predicted value and the standard deviation as a measure of model uncertainty. These predictions were grouped based on the source of the original observation (MODIS or VIIRS), and a Student’s t-test was performed to compare the mean standard deviation between groups. A statistically significant result would suggest that the model’s uncertainty differs depending on the data source. To further explore this, we generated histograms of the standard deviations for each group to visualize their distributions.
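As a minimal illustration of this idea (using synthetic NumPy draws standing in for repeated forward passes of the probabilistic BNN; the pixel probabilities here are hypothetical), the predicted value and uncertainty are just the mean and standard deviation across sampled predictions:

```python
import numpy as np

# Hypothetical Monte Carlo draws of the predicted fire probability for three
# test pixels (rows = samples, columns = pixels); in the notebook these come
# from sampling the model's predictive distribution.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[0.9, 0.5, 0.1], scale=[0.02, 0.15, 0.05], size=(100, 3))
samples = np.clip(samples, 0.0, 1.0)

pred_mean = samples.mean(axis=0)  # point prediction per pixel
pred_stdv = samples.std(axis=0)   # uncertainty per pixel

# The mid-probability pixel has the widest spread, i.e. the highest uncertainty
print(pred_mean.round(2), pred_stdv.round(2))
```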

6.5.2.1 Create and train model

Show code
def create_probablistic_bnn_model(train_size, feature_names):
    """
    Create Bayesian neural network model
    :param train_size: number of samples in training dataset
    :param feature_names: list of input feature names
    :return: compiled Keras model
    """

    inputs = create_model_inputs(feature_names)

    features = tf_keras.layers.concatenate(list(inputs.values()))
    features = tf_keras.layers.BatchNormalization()(features)
    hidden_units = [8, 8]

    for units in hidden_units:
        features = tfp.layers.DenseVariational(
            units=units,
            make_prior_fn=prior,
            make_posterior_fn=posterior,
            kl_weight=1 / train_size,
            activation="relu",
        )(features)

    logits = tf_keras.layers.Dense(units=1)(features)
    outputs = tfp.layers.IndependentBernoulli(event_shape=1)(logits)

    model = tf_keras.Model(inputs=inputs, outputs=outputs)

    return model

6.5.2.2 Help Box - Negative Log-Likelihood Explanation

The output of our model is a probability distribution, not a point estimate. To measure how likely the observed ("true") data are under the model's predicted distribution, we use the negative log-likelihood as our loss function.

Show code
def negative_loglikelihood(targets, estimated_distribution):
    return -estimated_distribution.log_prob(targets)
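For intuition, the same quantity for a single Bernoulli prediction is simply the negative log of the probability the model assigned to the observed class. A pure-Python sketch (independent of TensorFlow Probability; the probabilities are illustrative):

```python
import math

def bernoulli_nll(target, p):
    """Negative log-likelihood of a binary target under Bernoulli(p)."""
    return -math.log(p if target == 1 else 1.0 - p)

# A confident, correct prediction is penalized far less than a confident,
# wrong one.
print(round(bernoulli_nll(1, 0.9), 3))  # 0.105
print(round(bernoulli_nll(1, 0.1), 3))  # 2.303
```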

6.5.2.3 Help Box - Skip model training

By setting the variable "read_in_file" to true at the beginning of the code, the user activates the sections of code that allow the time-intensive model training steps to be skipped. It is assumed that a user activating these sections has uploaded the appropriately named files to the same directory where the input data are stored. The file names default to the sample files provided with this tutorial.

Show code
if read_in_file:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy2_model2_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c2m2.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c2m2.weights.h5"
    training_output_file = "casestudy2_model2_df.csv"

  df_c2m2 = pd.read_csv(training_output_file)
  reconstructed_model = create_probablistic_bnn_model(train_size_c2m1, feature_names_c2m1)
  reconstructed_model.load_weights(model_file_name)
  trained_model_c2m2 = reconstructed_model
  train_loss_c2m2 = df_c2m2['train_loss']
  val_loss_c2m2 = df_c2m2['val_loss']
Show code
model_c2m2 = create_probablistic_bnn_model(train_size_c2m1, feature_names_c2m1)
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. If this is incorrect, please go back to the beginning of the code and set the variable 'read_in_file' to false.")
else:
  trained_model_c2m2, train_loss_c2m2, val_loss_c2m2 = run_experiment(model=model_c2m2,
                                                  loss=negative_loglikelihood,
                                                  metrics=[['accuracy', tf_keras.metrics.Precision(), tf_keras.metrics.Recall()]],
                                                  epochs=epochs, feature_names=feature_names_c2m1,
                                                  X_train=X_train_c2m1, y_train=y_train_c2m1)

6.5.2.4 Help Box - Saving model files

By default, once a model has been trained, the notebook will export the model file and the outputs from its training runs to files saved within the working directory. If the notebook is being run on Colab, these files WILL NOT be saved once the Colab session is exited; if you wish to retain them, download them to a local directory. If you do not care about these outputs, they will be deleted automatically when the notebook is closed. If you are running locally in Jupyter, the files will have to be deleted by hand.

Show code
if read_in_file:
  print("User has selected to read in saved models instead of training a novel one. Therefore no model files will be exported, as they would just be replicates of the loaded files.")
else:
  if colab_jupyter_flag == 'j':
    # Path to files if running locally using Jupyter
    training_output_file = os.path.join(data_dir, "casestudy2_model2_df.csv")
    model_file_name  = os.path.join(data_dir, "model_c2m2.weights.h5")
  else:
    # Path to files if running in colab
    model_file_name = "model_c2m2.weights.h5"
    training_output_file = "casestudy2_model2_df.csv"

  trained_model_c2m2.save_weights(model_file_name)
  df_c2m2 = pd.DataFrame({'train_loss': train_loss_c2m2,
                      'val_loss': val_loss_c2m2})
  df_c2m2.to_csv(training_output_file)

6.5.2.5 Compute predictions

Show code
def compute_probablistic_predictions(model, feature_names, X_test, y_test):
    """
    Compute the mean and standard deviation of the model's predictive distribution
    :param model: trained probabilistic model
    :param feature_names: list of input feature names
    :param X_test: test data
    :param y_test: test data labels (unused; kept for a consistent signature)
    :return: prediction means (list) and prediction standard deviations (array)
    """
    examples = {str(feature_names[i]): X_test[:, i] for i in range(X_test.shape[1])}
    prediction_distribution = model(examples)
    prediction_mean = prediction_distribution.mean().numpy().tolist()
    prediction_stdv = prediction_distribution.stddev().numpy()
    return prediction_mean, prediction_stdv
Show code
pred_mean_c2m2, pred_stdv_c2m2 = compute_probablistic_predictions(trained_model_c2m2, feature_names_c2m1, X_test_c2m1, y_test_c2m1)
predictions_c2m2 = compute_predictions(trained_model_c2m2, feature_names_c2m1, X_test_c2m1, y_test_c2m1)

6.5.2.6 Visualize training and validation loss

Visualize the training and validation loss for Case Study B Model Two.

Show code
plot_training_loss(feature_names_c2m1, trained_model_c2m2, X_test_c2m1, y_test_c2m1, train_loss_c2m2, val_loss_c2m2, predictions_c2m2,'Case Study B Model Two')

6.5.3 Results (All Models)

Show code
# Note this function only uses the first 1000 examples due to RAM limitations of Google Colab
stats_c2m1 = compute_results_grid(feature_names_c2m1, trained_model_c2m1, predictions_c2m1, y_test_c2m1, 'Case2Model1')
stats_c2m2  = compute_results_grid(feature_names_c2m1, trained_model_c2m2, predictions_c2m2, y_test_c2m1, 'Case2Model2')

# Combine all stats into one DataFrame
df_case2_stats = pd.concat([stats_c1m3, stats_c2m1, stats_c2m2])

analyze_threshold_value(df_case2_stats)

Define Thresholds

A threshold of 0.5 was selected for both Model 1 and Model 2, as both models show a noticeable change at a threshold of 0.5.
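The threshold converts each predicted probability into a hard class label, mirroring the `(predictions >= threshold).astype(int)` step used in the plotting functions. A small sketch with hypothetical probabilities:

```python
import numpy as np

# Hypothetical predicted fire probabilities for five pixels
probs = np.array([0.12, 0.48, 0.50, 0.73, 0.95])

# Values at or above the threshold become the positive (fire) class
threshold = 0.5
classes = (probs >= threshold).astype(int)
print(classes)  # [0 0 1 1 1]
```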

Show code
def plot_results(feature_names, model, X_val, y_val, threshold, train_loss, val_loss, predictions, title):
    """
    Plot the loss curve and confusion matrix
    :param feature_names: feature names used in model creation
    :param model: trained model
    :param X_val: validation data
    :param y_val: validation data labels
    :param threshold: threshold for classification
    :param train_loss: training loss over epochs
    :param val_loss: validation loss over epochs
    :param predictions: model predictions on the validation data
    :param title: title of plot
    """
    plt.rcParams.update({ # set visualization values
    "font.size":            15,
    "axes.titlesize":       12,
    "axes.labelsize":       15,
    "xtick.labelsize":      12,
    "ytick.labelsize":      12,
    "legend.fontsize":      10,
    "figure.titlesize":     15})
    color_palette = ["#55C667FF","#20A387FF","#33638DFF","#481567FF"]
    custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)

    # plot the results for a defined threshold
    predicted_classes = (predictions.numpy() >= threshold).astype(int)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,5))

    ax1.plot(train_loss, label='Training Loss', marker='o', color="#33638DFF")
    ax1.plot(val_loss, label='Validation Loss', marker='o', color="#481567FF")
    ax1.set_title('Training and Validation Loss Over Epochs')
    ax1.set_xlabel('Epochs', fontsize = 15)
    ax1.set_ylabel('Loss', fontsize = 15)
    #ax1.set_ylim(0, 1)
    ax1.grid()
    ax1.legend()

    cm = confusion_matrix(y_val, predicted_classes)
    sns.heatmap(cm / np.sum(cm), annot=True, annot_kws={"size": 16}, fmt='.2%', cmap=custom_cmap, cbar=False, ax=ax2)
    ax2.set_title('BNN \nAccuracy:{0:.3f}, Precision:{1:.3f}, Recall:{2:.3f} '.format(accuracy_score(y_val, predicted_classes), precision_score(y_val, predicted_classes), recall_score(y_val, predicted_classes)),
                  fontsize = 12)
    ax2.set_xlabel('Predicted Label')
    ax2.set_ylabel('True Label')

    fig.suptitle(title)
    plt.show()
Show code
def plot_probablistic_results(feature_names, model, X_test, y_test, threshold, train_loss, val_loss, predictions, title):
  """
  Plot the loss curve and confusion matrix
  :param feature_names: feature names used in model creation
  :param model: trained model
  :param X_test: test data
  :param y_test: test data labels
  :param threshold: threshold for classification
  :param train_loss: training loss over epochs
  :param val_loss: validation loss over epochs
  :param predictions: model predictions on the test data
  :param title: title of plot
  """
  plt.rcParams.update({ # set visualization values
    "font.size":            15,
    "axes.titlesize":       12,
    "axes.labelsize":       15,
    "xtick.labelsize":      12,
    "ytick.labelsize":      12,
    "legend.fontsize":      10,
    "figure.titlesize":     15})

  # plot the results for a defined threshold
  color_palette = ["#55C667FF","#20A387FF","#33638DFF","#481567FF"]
  custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)

  predicted_classes = (np.asarray(predictions) >= threshold).astype(int)

  fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10,5))

  ax1.plot(train_loss, label='Training Loss', marker='o', color="#33638DFF")
  ax1.plot(val_loss, label='Validation Loss', marker='o', color="#481567FF")
  ax1.set_title('Training and Validation Loss Over Epochs')
  ax1.set_xlabel('Epochs', fontsize = 15)
  ax1.set_ylabel('Loss', fontsize = 15)
  ax1.grid()
  ax1.legend(fontsize=10)

  cm = confusion_matrix(y_test, predicted_classes)
  sns.heatmap(cm / np.sum(cm), annot=True, annot_kws={"size": 16}, fmt='.2%', cmap=custom_cmap, cbar=False, ax=ax2)
  ax2.set_title('BNN \nAccuracy:{0:.3f}, Precision:{1:.3f}, Recall:{2:.3f} '.format(accuracy_score(y_test, predicted_classes), precision_score(y_test, predicted_classes), recall_score(y_test, predicted_classes)),
                fontsize = 12)
  ax2.set_xlabel('Predicted Label')
  ax2.set_ylabel('True Label')

  fig.suptitle(title)
  plt.show()
Show code
thresh_c2m1 = 0.50
thresh_c2m2 = 0.50
plot_results(feature_names_c2m1, trained_model_c2m1, X_test_c2m1, y_test_c2m1, thresh_c2m1, train_loss_c2m1, val_loss_c2m1, predictions_c2m1,'Case Two Model One')
plot_probablistic_results(feature_names_c2m1, trained_model_c2m2, X_test_c2m1, y_test_c2m1, thresh_c2m2, train_loss_c2m2, val_loss_c2m2, pred_mean_c2m2,'Case Two Model Two')
Show code
def predicted_class_confidence_intervals(prediction_mean, prediction_stdv, y_test):
    """
    Build a per-observation table of predictions with 95% confidence intervals
    :param prediction_mean: predicted means (list of single-element lists)
    :param prediction_stdv: predicted standard deviations (array)
    :param y_test: true labels
    :return: DataFrame with mean, 95% CI bounds, interval size, and std dev
    """
    # The 95% CI is computed as mean ± (1.96 * stdv)
    upper = (prediction_mean + (1.96 * prediction_stdv))
    lower = (prediction_mean - (1.96 * prediction_stdv))
    diff = (upper - lower).round(4).tolist()
    prediction_stdv = prediction_stdv.tolist()
    upper = upper.round(4).tolist()
    lower = lower.round(4).tolist()

    prediction_df = pd.DataFrame({'actual': y_test,
                                'prediction_mean':  [element for innerList in prediction_mean for element in innerList],
                                'upper_bound_95_CI': [element for innerList in upper for element in innerList],
                                'lower_bound_95_CI': [element for innerList in lower for element in innerList],
                                'interval_size': [element for innerList in diff for element in innerList],
                                'std_dev': [element for innerList in prediction_stdv for element in innerList]}
                                  )
    return prediction_df
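The interval arithmetic above can be checked by hand for a single observation (the mean and standard deviation here are hypothetical): under a normal approximation, the 95% CI is mean ± 1.96 × stdv.

```python
# Hypothetical predicted mean and standard deviation for one observation
prediction_mean = 0.80
prediction_stdv = 0.05

# 95% CI under a normal approximation: mean +/- 1.96 * stdv
upper = prediction_mean + 1.96 * prediction_stdv
lower = prediction_mean - 1.96 * prediction_stdv
interval_size = round(upper - lower, 4)

print(round(lower, 3), round(upper, 3))  # 0.702 0.898
```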
Show code
pred_df_c2m2 = predicted_class_confidence_intervals(pred_mean_c2m2, pred_stdv_c2m2, y_test_c2m1)
print(pred_df_c2m2.head())
if colab_jupyter_flag == 'j':
  pred_df_c2m2.to_csv(os.path.join(dir_path, "pred_df_c2m2.csv"))
Show code
def compare_source_confidence_intervals(feature_names, X_test, prediction_df, comparison_feature):
    # Extract source labels from input data
    feature_names = feature_names.to_numpy()
    id = np.where(feature_names=='source')[0]
    source_label= X_test[:, id].tolist()
    flatList = [element for innerList in source_label for element in innerList]
    source_df = pd.DataFrame({'source': flatList})
    # Merge it to the confidence interval df
    merged_df = pd.concat([source_df, prediction_df], axis=1)
    # Perform an independent t test to determine if there is a statistically significant difference regarding confidence interval size based on source
    source_1 = merged_df[merged_df['source']== 1.0]
    source_0 = merged_df[merged_df['source']== 0.0]
    t_stat, p_value = ttest_ind(source_1[comparison_feature], source_0[comparison_feature])
    return t_stat, p_value, source_1, source_0
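The t statistic returned by `ttest_ind` (with its default pooled-variance Student's t, `equal_var=True`) can be sketched in pure Python as follows; the two groups of standard deviations here are hypothetical stand-ins for the MODIS and VIIRS groups. A large |t| corresponds to a small p-value, i.e. a clear difference in mean uncertainty between sources.

```python
import math
import statistics

def students_t(group_a, group_b):
    """Equal-variance two-sample t statistic (pooled variance)."""
    na, nb = len(group_a), len(group_b)
    ma, mb = statistics.mean(group_a), statistics.mean(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical per-observation std devs from two EO sources: a clear mean
# difference yields a large |t|
source_1 = [0.10, 0.12, 0.11, 0.13, 0.12]
source_0 = [0.05, 0.06, 0.05, 0.07, 0.06]
print(round(students_t(source_1, source_0), 2))
```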

6.5.4 Hypothesis review

Show code
t_stat, p_value, source_1, source_0 = compare_source_confidence_intervals(feature_names_c2m1, X_test_c2m1, pred_df_c2m2, 'std_dev')
print(f"p value: {p_value:.8f}")

Since the p-value is less than 0.05, we reject the null hypothesis and conclude that the EO source does impact the standard deviation of the prediction confidence.

Show code
def feature_histograms(dist1, dist2, title):
  """Overlay histograms of a feature for the two EO sources."""
  plt.hist(dist1, bins=20, color='#33638DFF', edgecolor='k', alpha=0.8, label ='source 1')
  plt.hist(dist2, bins=20, color='#481567FF', edgecolor='k', alpha=0.8, label = 'source 0')
  plt.legend(loc='upper left')
  plt.title(title)
  plt.show()
Show code
feature_histograms(source_1['std_dev'], source_0['std_dev'], 'Std Dev')
feature_histograms(source_1['prediction_mean'], source_0['prediction_mean'], 'Prediction')
print(source_1.head())
if colab_jupyter_flag == 'j':
  source_1.to_csv(os.path.join(dir_path, "source_1.csv"))
  source_0.to_csv(os.path.join(dir_path, "source_0.csv"))
Show code
def probablistic_feature_importance(trained_model, X_train, X_test, feature_names):
    def bnn_predict_fn(model, X, feature_names, n_samples=100):
        # Ensure X is a numpy array
        X = np.array(X)
        examples = {str(feature_names[i]): X[:, i] for i in range(X.shape[1])}

        predicted = []
        for _ in range(n_samples):
            predicted.append(model(examples).mean().numpy())
        return np.concatenate(predicted, axis=1)

    #plot color
    color_palette = ["#DCE319FF", "#55C667FF","#20A387FF","#33638DFF","#481567FF"]
    custom_cmap = LinearSegmentedColormap.from_list("custom_gradient", color_palette)
    # Use a subset of X_train as background data
    background = X_train[:50]

    # Pass the model itself and the feature names to the wrapper
    shap_explainer = shap.KernelExplainer(lambda x: bnn_predict_fn(trained_model, x, feature_names), background)

    # Compute SHAP values for test data
    X_test_subset = X_test[101:151]
    shap_values = shap_explainer.shap_values(X_test_subset, nsamples=500)

    # Visualize the results
    shap_values_averaged = np.mean(shap_values, axis=2)
    shap.summary_plot(shap_values_averaged, X_test_subset, feature_names=feature_names, cmap=custom_cmap)
Show code
probablistic_feature_importance(trained_model_c2m2, X_train_c2m1, X_test_c2m1, feature_names_c2m1)