2 Crop Mapping
2.1 Rice mapping in Bhutan with U-Net using high resolution satellite imagery
2.2 Note: This notebook is meant to be run in Colab. You can still run it locally; make sure you have the Google Drive API installed and adjust the paths in the relevant places.
This notebook is also available in this GitHub repo: https://github.com/SERVIR/servir-aces. Navigate to the notebook folder.
2.3 Setup environment
2.3.1 Download datasets
For this chapter, we have already prepared and exported the training datasets. They are stored in Google Cloud Storage, and we will use gsutil to copy them into our workspace. The dataset has training, testing, and validation subdirectories. Let’s start by downloading these datasets into our workspace.
If you’re looking to produce your own datasets, you can follow this notebook which was used to produce these training, testing, and validation datasets provided in this notebook.
2.3.2 Setup config file variables
Now the repo is downloaded. We will create an environment file that points to our training data and customizes the parameters for the model. To do this, we make a copy of the .env.example file provided.
Under the hood, all the configuration provided via the environment file is parsed into a config object and can be accessed programmatically.
Note that the current version does not expose all the model intricacies through the environment file, but future versions may include them depending on the need.
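To make the idea of parsing an environment file into typed values concrete, here is a minimal sketch using only the standard library. It is not the aces implementation: the function name `parse_env_text` and the assumed `KEY = value` layout are illustrative, and values that are not valid Python literals are simply kept as strings.

```python
import ast


def parse_env_text(text):
    """Parse KEY = value lines into a dict, converting values to Python types.

    Hypothetical helper for illustration only; it does not handle inline
    comments after a value.
    """
    config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        try:
            # Handles quoted strings, numbers, booleans, and tuples.
            config[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Fall back to the raw string for unquoted values.
            config[key] = value
    return config


example = """
# Example settings
BASEDIR = "/content/"
USE_S1 = False
PATCH_SHAPE = (256, 256)
BATCH_SIZE = 32
MODEL_TYPE = dnn
"""

parsed = parse_env_text(example)
print(parsed)
```

The point of the sketch is that every value in the file starts out as text, so a conversion step like this is needed before a setting such as PATCH_SHAPE can be used as a tuple.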
Okay, now we have the config.env file, we will use this to provide our environments and parameters.
Note there are several parameters that can be changed. Let’s start by changing the BASEDIR and OUTPUT_DIR as below.
BASEDIR = "/content/"
OUTPUT_DIR = "/content/drive/MyDrive/Colab Notebooks/DL_Book/Chapter_1/output"
We will start by training a U-Net model using the dl-book/chapter-1/unet_256x256_planet_wo_indices dataset inside the dataset folder for this exercise. Let’s go ahead and change our DATADIR in the config.env file as below.
DATADIR = "datasets/unet_256x256_planet_wo_indices"
These datasets contain RGBN bands from the PlanetScope mosaic. Since we are trying to map rice fields, we use information from both the growing season and the pre-growing season. Thus, we have 8 optical bands, namely red_before, green_before, blue_before, nir_before, red_during, green_during, blue_during, and nir_during. In addition, you can use the USE_ELEVATION and USE_S1 config values to include the topographic and radar information. These datasets also contain topographic and radar features, but we will not use them in this exercise, so we set both values to False. These datasets are tiled to 256x256 pixels, so let’s also set PATCH_SHAPE accordingly.
# For model training, USE_ELEVATION extends FEATURES with "elevation" & "slope"
# USE_S1 extends FEATURES with "vv_asc_before", "vh_asc_before", "vv_asc_during", "vh_asc_during",
# "vv_desc_before", "vh_desc_before", "vv_desc_during", "vh_desc_during"
# In case these are not useful and you have other bands in your training data, you can set
# USE_ELEVATION and USE_S1 to False and update FEATURES to include the needed bands
USE_ELEVATION = False
USE_S1 = False
PATCH_SHAPE = (256, 256)
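The comments above describe how the two flags extend the feature list. A minimal sketch of that logic follows, using the band names given in this chapter. The function name `build_features` and the order in which the groups are appended are assumptions made for illustration, not the library's actual code.

```python
# Band names as listed in this chapter.
OPTICAL_FEATURES = [
    "red_before", "green_before", "blue_before", "nir_before",
    "red_during", "green_during", "blue_during", "nir_during",
]
ELEVATION_FEATURES = ["elevation", "slope"]
S1_FEATURES = [
    "vv_asc_before", "vh_asc_before", "vv_asc_during", "vh_asc_during",
    "vv_desc_before", "vh_desc_before", "vv_desc_during", "vh_desc_during",
]


def build_features(use_elevation=False, use_s1=False):
    """Return the feature list for the given flags (hypothetical helper)."""
    features = list(OPTICAL_FEATURES)
    if use_elevation:
        features += ELEVATION_FEATURES
    if use_s1:
        features += S1_FEATURES
    return features


print(len(build_features()))
print(len(build_features(use_elevation=True, use_s1=True)))
```

With both flags set to False, as in this exercise, the model input has 8 channels; enabling both would give 18.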
Next, we need to specify the size of the training, testing, and validation datasets. For this exercise, we know the sizes beforehand, but aces also provides a handful of functions that we can use to calculate them. See this notebook to learn more about how to do it. We will also change the BATCH_SIZE to 32; if you have more memory available, you can increase the BATCH_SIZE. You can train for longer by changing the EPOCHS parameter; we set it to 30 here.
# Sizes of the training and evaluation datasets.
TRAIN_SIZE = 8531
TEST_SIZE = 1222
VAL_SIZE = 2404
BATCH_SIZE = 32
EPOCHS = 30
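To see how the dataset sizes and BATCH_SIZE interact, the short calculation below works out how many batches are needed to pass over each split once. It assumes the final partial batch is kept (ceiling division); whether aces derives its step counts this way is an assumption, and `steps_per_epoch` is a name introduced here for illustration.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be partial)."""
    return math.ceil(n_samples / batch_size)


sizes = {"train": 8531, "test": 1222, "val": 2404}
batch_size = 32

steps = {name: steps_per_epoch(size, batch_size) for name, size in sizes.items()}
print(steps)
```

Doubling BATCH_SIZE roughly halves these counts, at the cost of more memory per step.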
2.3.3 Update the config file programmatically
We can also make a dictionary so we can change these config settings programmatically.
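One way a settings dictionary could be written back into the environment file is sketched below with the standard library. This is an assumption about how such an update might work, not the aces implementation: `update_env_file` is a hypothetical helper, it assumes one `KEY = value` pair per line, and it writes values exactly as given without adding quotes.

```python
from pathlib import Path
import tempfile


def update_env_file(env_path, settings):
    """Overwrite matching KEY = value lines and append keys that are missing."""
    path = Path(env_path)
    remaining = dict(settings)
    new_lines = []
    for line in path.read_text().splitlines():
        key = line.split("=", 1)[0].strip()
        is_assignment = "=" in line and not line.lstrip().startswith("#")
        if is_assignment and key in remaining:
            # Replace the existing value with the new one.
            new_lines.append(f"{key} = {remaining.pop(key)}")
        else:
            # Keep comments and untouched settings as they are.
            new_lines.append(line)
    # Any key not found in the file is appended at the end.
    new_lines.extend(f"{key} = {value}" for key, value in remaining.items())
    path.write_text("\n".join(new_lines) + "\n")


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / "config.env"
    env_file.write_text("# demo file\nBATCH_SIZE = 64\nEPOCHS = 10\n")
    update_env_file(env_file, {"BATCH_SIZE": "32", "MODEL_DIR_NAME": "unet_v1"})
    updated_text = env_file.read_text()

print(updated_text)
```

Keeping the values as strings in the dictionary, as the form fields in this section do, means the file stays plain text and type conversion is left to the config loader.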
BASEDIR = "/content/" # @param {type:"string"}
OUTPUT_DIR = "/content/drive/MyDrive/Colab Notebooks/DL_Book/Chapter_1/output" # @param {type:"string"}
DATADIR = "datasets/unet_256x256_planet_wo_indices" # @param {type:"string"}
# PATCH_SHAPE, USE_ELEVATION, USE_S1, TRAIN_SIZE, TEST_SIZE, VAL_SIZE
# BATCH_SIZE, EPOCHS are converted to their appropriate type.
USE_ELEVATION = "False" # @param {type:"string"}
USE_S1 = "False" # @param {type:"string"}
PATCH_SHAPE = "(256, 256)" # @param {type:"string"}
TRAIN_SIZE = "8531" # @param {type:"string"}
TEST_SIZE = "1222" # @param {type:"string"}
VAL_SIZE = "2404" # @param {type:"string"}
BATCH_SIZE = "32" # @param {type:"string"}
EPOCHS = "30" # @param {type:"string"}
MODEL_DIR_NAME = "unet_v1" # @param {type:"string"}
unet_config_settings = {
"BASEDIR" : BASEDIR,
"OUTPUT_DIR": OUTPUT_DIR,
"DATADIR": DATADIR,
"USE_ELEVATION": USE_ELEVATION,
"USE_S1": USE_S1,
"PATCH_SHAPE": PATCH_SHAPE,
"TRAIN_SIZE": TRAIN_SIZE,
"TEST_SIZE": TEST_SIZE,
"VAL_SIZE": VAL_SIZE,
"BATCH_SIZE": BATCH_SIZE,
"EPOCHS": EPOCHS,
"MODEL_DIR_NAME": MODEL_DIR_NAME,
}
2.4 U-Net Model
2.4.1 Load config file variables
Let’s load our config file through the Config class.
Most of the settings in config.env are now available via the config instance. Let’s check a few of them here.
2.4.2 Load ModelTrainer class
Next, let’s make an instance of the ModelTrainer object. The ModelTrainer class provides various tools for training, building, compiling, and running specified deep learning models.
2.4.3 Train and Save U-Net model
The ModelTrainer class provides various functionality. We will use its train_model function, which trains the model using the provided configuration settings.
This method performs the following steps:
- Configures memory growth for TensorFlow.
- Creates TensorFlow datasets for training, testing, and validation.
- Builds and compiles the model.
- Prepares the output directory for saving models and results.
- Starts the training process.
- Evaluates and prints validation metrics.
- Saves training parameters, plots, and models.
2.4.4 Save the config file
from pathlib import Path
import shutil
config_file = Path(config_file)
# Keep only the file name and place it inside the model directory
drive_config_file = unet_config.MODEL_DIR / config_file.name
# Create the target directory if it doesn't exist
drive_config_file.parent.mkdir(parents=True, exist_ok=True)
# Copy the file
shutil.copy(config_file, drive_config_file)
print(f"File copied from {config_file} to {drive_config_file}")
2.4.5 Load the logs files via TensorBoard
TensorBoard provides a way to view and interact with the logs while the model is being trained. Learn more here. Here we only show how to load our training logs into TensorBoard.
2.4.6 Load the Saved U-Net Model
Load the saved model
2.4.7 Inference using Saved U-Net Model
Now we can use the saved model to export a prediction for an image. For prediction, you first need to prepare your image data. We have already exported the image needed here, and we will use it for now. See this notebook to understand how we did it.
In addition, this notebook shows how you can then use the image to predict from the saved model.
In any case, you now have the prediction in Earth Engine as an image.
2.5 DNN Model
2.5.1 Setup any changes in the config file for DNN Model
A few config variables need to be changed to run a DNN model. The first is the data itself, so let’s change DATADIR. We also need to change our output directory using MODEL_DIR_NAME, which is the subdirectory inside OUTPUT_DIR for this model run. Finally, we need to specify that we want to run the DNN model, using the MODEL_TYPE parameter. Currently, it supports unet, dnn, and cnn (case sensitive), with unet as the default. Make other changes as appropriate.
DATADIR = "datasets/dnn_planet_wo_indices"
MODEL_DIR_NAME = "dnn_v1"
MODEL_TYPE = "dnn"
2.5.2 Update the config file programmatically
DATADIR = "datasets/dnn_planet_wo_indices" # @param {type:"string"}
# PATCH_SHAPE, USE_ELEVATION, USE_S1, TRAIN_SIZE, TEST_SIZE, VAL_SIZE
# BATCH_SIZE, EPOCHS are converted to their appropriate type.
MODEL_DIR_NAME = "dnn_v1" # @param {type:"string"}
MODEL_TYPE = "dnn" # @param {type:"string"}
BATCH_SIZE = "32" # @param {type:"string"}
EPOCHS = "30" # @param {type:"string"}
2.5.3 Load config file variables for DNN Model
Most of the settings in config.env are now available via the config instance. Let’s check a few of them here.
2.5.4 Load ModelTrainer class
Next, let’s make an instance of the ModelTrainer object. The ModelTrainer class provides various tools for training, building, compiling, and running specified deep learning models.
2.5.5 Train and Save DNN model
2.5.6 Save the config file
# Keep only the file name and place it inside the model directory
drive_config_file = dnn_config.MODEL_DIR / Path(config_file).name
# Create the target directory if it doesn't exist
drive_config_file.parent.mkdir(parents=True, exist_ok=True)
# Copy the file
shutil.copy(config_file, drive_config_file)
print(f"File copied from {config_file} to {drive_config_file}")
2.5.7 Load the logs files via TensorBoard
2.5.8 Load the Saved DNN Model
2.5.9 Inference using Saved DNN Model
Now we can use the saved model to export a prediction for an image. For prediction, you first need to prepare your image data. We have already exported the image needed here, and we will use it for now. See this notebook to understand how we did it.
In addition, this notebook shows how you can then use the image to predict from the saved model.
In any case, you now have the prediction in Earth Engine as an image.
2.6 Independent Validation
For independent validation, we will use reference data that we have prepared. These samples were collected using Collect Earth Online by SCO and NASA DEVELOP interns. We will be using GEE here. Before we do that, let’s make changes in our config file.
We will make sure our GCS_PROJECT is set up correctly.
GCS_PROJECT = "servir-ee"
2.6.1 Update the config file
2.6.2 Load config file variable
2.6.3 Import earthengine and geemap for visualization
2.6.4 Class Information and Masking
# CLASS
# 0 - cropland etc.
# 1 - rice
# 2 - forest
# 3 - Built up
# 4 - Others (includes water body)
l1 = ee.FeatureCollection("projects/servir-sco-assets/assets/Bhutan/BT_Admin_1")
paro = l1.filter(ee.Filter.eq("ADM1_EN", "Paro"))
# mask the rice growing zone
# in Paro, rice grows up to 2600 m asl (this threshold should be double-checked)
dem = ee.Image("MERIT/DEM/v1_0_3") # ee.Image('USGS/SRTMGL1_003')
dem = dem.clip(paro)
rice_zone = dem.gte(0).And(dem.lte(2600))
2.6.5 Model: U-Net
2.6.5.1 Load and visualize the prediction output
UNET_RGBN = ee.Image("projects/servir-ee/assets/dl-book/chapter-1/prediction/prediction_unet_v1")
UNET_RGBN = UNET_RGBN.updateMask(rice_zone)
Map.centerObject(UNET_RGBN, 11)
Map.addLayer(UNET_RGBN.clip(paro), {"bands": ["prediction"], "min":0, "max":4, "palette": ["FFFF00", "FFC0CB", "267300", "E60000", "005CE6"]}, "UNET_RGBN")
Map
2.6.5.2 Calculate classification metrics
Remapping to rice and non-rice output
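To show what the remapping and the metrics in this subsection compute, here is a small pure-Python analogue of the Earth Engine calls. It is a sketch on made-up sample values, not the notebook's Earth Engine code: the helper names are hypothetical, class 1 is rice as in the legend above, and the error matrix follows the Earth Engine convention of actual values on rows and predicted values on columns.

```python
RICE_CLASS = 1  # class code for rice in this chapter's legend


def remap_to_binary(classes):
    """Collapse the five land cover classes into rice (1) and non-rice (0)."""
    return [1 if c == RICE_CLASS else 0 for c in classes]


def error_matrix(actual, predicted, n_classes=2):
    """Rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def binary_metrics(matrix):
    """Accuracy, kappa, and rice-class recall, precision, and F1."""
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    recall = matrix[1][1] / row_totals[1]      # producer's accuracy for rice
    precision = matrix[1][1] / col_totals[1]   # consumer's accuracy for rice
    return {
        "accuracy": observed,
        "kappa": (observed - expected) / (1 - expected),
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Made-up sample points: predicted land cover class and reference rice label.
predicted_classes = [1, 1, 0, 2, 1, 4, 3, 0]
actual_rice = [1, 1, 0, 0, 0, 0, 0, 1]

remapped = remap_to_binary(predicted_classes)
matrix = error_matrix(actual_rice, remapped)
metrics = binary_metrics(matrix)
print(matrix)
print(metrics)
```

In the Earth Engine code, producersAccuracy() is indexed with [1, 0] and consumersAccuracy() with [0, 1] because the former is returned as a column vector and the latter as a row vector; both pick out the rice class.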
error_matrix_unet = prediction_unet.errorMatrix(actual="rice", predicted="remapped")
test_acc_unet = error_matrix_unet.accuracy()
test_kappa_unet = error_matrix_unet.kappa()
test_recall_producer_acc_unet = error_matrix_unet.producersAccuracy().get([1, 0])
test_precision_consumer_acc_unet = error_matrix_unet.consumersAccuracy().get([0, 1])
f1_unet = error_matrix_unet.fscore().get([1])
print("error_matrix_unet", error_matrix_unet.getInfo())
print("test_acc_unet", test_acc_unet.getInfo())
print("test_kappa_unet", test_kappa_unet.getInfo())
print("test_recall_producer_acc_unet", test_recall_producer_acc_unet.getInfo())
print("test_precision_consumer_acc_unet", test_precision_consumer_acc_unet.getInfo())
print("f1_unet", f1_unet.getInfo())
2.6.5.3 Calculate Probability Distribution
prob_output_unet = UNET_RGBN.select(["prediction", "others_etc", "cropland_etc", "urban", "forest", "rice"]) \
.rename(["prediction_class", "others_prob", "cropland_prob", "urban_prob", "forest_prob", "rice_prob"]) \
.sampleRegions(collection=ceo_final_data, scale=10, geometries=True)
# print("prob_output_unet", prob_output_unet.getInfo())
2.6.6 Model: DNN
2.6.6.1 Load and visualize the prediction output
DNN_RGBN = ee.Image("projects/servir-ee/assets/dl-book/chapter-1/prediction/prediction_dnn_v1")
DNN_RGBN = DNN_RGBN.updateMask(rice_zone)
Map.centerObject(DNN_RGBN)
Map.addLayer(DNN_RGBN.clip(paro), {"bands": ["prediction"], "min":0, "max":4, "palette": ["FFFF00", "FFC0CB", "267300", "E60000", "005CE6"]}, "DNN_RGBN")
Map
2.6.6.2 Calculate classification metrics
error_matrix_dnn = prediction_dnn.errorMatrix(actual="rice", predicted="remapped")
test_acc_dnn = error_matrix_dnn.accuracy()
test_kappa_dnn = error_matrix_dnn.kappa()
test_recall_producer_acc_dnn = error_matrix_dnn.producersAccuracy().get([1, 0])
test_precision_consumer_acc_dnn = error_matrix_dnn.consumersAccuracy().get([0, 1])
f1_dnn = error_matrix_dnn.fscore().get([1])
print("error_matrix_dnn", error_matrix_dnn.getInfo())
print("test_acc_dnn", test_acc_dnn.getInfo())
print("test_kappa_dnn", test_kappa_dnn.getInfo())
print("test_recall_producer_acc_dnn", test_recall_producer_acc_dnn.getInfo())
print("test_precision_consumer_acc_dnn", test_precision_consumer_acc_dnn.getInfo())
print("f1_dnn", f1_dnn.getInfo())
2.6.6.3 Calculate Probability Distribution
prob_output_dnn = DNN_RGBN.select(["prediction", "others_etc", "cropland_etc", "urban", "forest", "rice"]) \
.rename(["prediction_class", "others_prob", "cropland_prob", "urban_prob", "forest_prob", "rice_prob"]) \
.sampleRegions(collection=ceo_final_data, scale=10, geometries=True)
# print("prob_output_dnn", prob_output_dnn.getInfo())
2.7 Figures and Plots
2.7.1 Training and Validation Plot
import matplotlib.pyplot as plt

# Create subplots in a 2x4 grid: training metrics on the top row, validation metrics on the bottom row
fig, axs = plt.subplots(2, 4, figsize=(4*7, 6*2))
colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
metrics = ["loss", "precision", "recall", "categorical_accuracy"]
metrics_name = ["Loss", "Precision", "Recall", "Categorical Accuracy"]
epochs = range(1, config.EPOCHS + 1)
title_fontsize = 22
label_fontsize = 22
legend_fontsize = 15
tick_fontsize = 18
lw = 1.5
for i in range(2):
    for y in range(len(metrics)):
        if i == 1:
            # Bottom row: validation metrics
            axs[i][y].plot(epochs, unet_model_metrics[f"val_{metrics[y]}"], color=colors[0], marker="o", lw=lw, label=f"U-Net validate {metrics[y]}")
            axs[i][y].plot(epochs, dnn_model_metrics[f"val_{metrics[y]}"], color=colors[1], lw=lw, marker="o", label=f"DNN validate {metrics[y]}")
            axs[i][y].set_title(f"Validate {metrics_name[y]}", fontsize=title_fontsize)
            axs[i][y].set_xlabel("epochs", fontsize=label_fontsize)
            axs[i][y].set_ylabel(f"{metrics[y]}", fontsize=label_fontsize)
            axs[i][y].grid(linestyle="dotted", alpha=0.7)
            axs[i][y].legend(fontsize=legend_fontsize)
            axs[i][y].tick_params(axis="both", which="major", labelsize=tick_fontsize)
        else:
            # Top row: training metrics
            axs[i][y].plot(epochs, unet_model_metrics[metrics[y]], color=colors[0], lw=lw, marker="o", label=f"U-Net train {metrics[y]}")
            axs[i][y].plot(epochs, dnn_model_metrics[metrics[y]], color=colors[1], lw=lw, marker="o", label=f"DNN train {metrics[y]}")
            axs[i][y].set_title(f"Train {metrics_name[y]}", fontsize=title_fontsize)
            axs[i][y].set_xlabel("epochs", fontsize=label_fontsize)
            axs[i][y].set_ylabel(f"{metrics[y]}", fontsize=label_fontsize)
            axs[i][y].grid(linestyle="dotted", alpha=0.7)
            axs[i][y].legend(fontsize=legend_fontsize)
            axs[i][y].tick_params(axis="both", which="major", labelsize=tick_fontsize)
# Adjust layout and show the plot
plt.tight_layout()
# plt.savefig("metrics_plot_model_comparison.png", dpi=500, bbox_inches="tight")
plt.show()
# Create subplots in a 1x4 grid: one panel per metric, with training and validation curves together
fig, axs = plt.subplots(1, 4, figsize=(4*7, 6*1))
colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
metrics = ["loss", "precision", "recall", "categorical_accuracy"]
metrics_name = ["Loss", "Precision", "Recall", "Categorical Accuracy"]
epochs = range(1, config.EPOCHS + 1)
title_fontsize = 22
label_fontsize = 22
legend_fontsize = 15
tick_fontsize = 18
lw = 1.5
for y in range(len(metrics)):
    axs[y].plot(epochs, unet_model_metrics[f"val_{metrics[y]}"], color=colors[0], marker="o", lw=lw, label=f"U-Net validate {metrics[y]}")
    axs[y].plot(epochs, dnn_model_metrics[f"val_{metrics[y]}"], color=colors[1], lw=lw, marker="o", label=f"DNN validate {metrics[y]}")
    axs[y].plot(epochs, unet_model_metrics[metrics[y]], color=colors[2], lw=lw, marker="o", label=f"U-Net train {metrics[y]}")
    axs[y].plot(epochs, dnn_model_metrics[metrics[y]], color=colors[3], lw=lw, marker="o", label=f"DNN train {metrics[y]}")
    axs[y].set_title(f"{metrics_name[y]}", fontsize=title_fontsize)
    axs[y].set_xlabel("epochs", fontsize=label_fontsize)
    axs[y].set_ylabel(f"{metrics[y]}", fontsize=label_fontsize)
    axs[y].grid(linestyle="dotted", alpha=0.7)
    axs[y].legend(fontsize=legend_fontsize)
    axs[y].tick_params(axis="both", which="major", labelsize=tick_fontsize)
# Adjust layout and show the plot
plt.tight_layout()
# plt.savefig("metrics_plot_model_comparison.png", dpi=500, bbox_inches="tight")
plt.show()
2.7.2 Probability Distribution Plot
all_data = {}
unet_data = []
dnn_data = []
unet_rice_data = []
dnn_rice_data = []
unet_other_data = []
dnn_other_data = []
# prob_output_unet and prob_output_dnn are indexed as dictionaries here, so they must
# already hold the client-side result of getInfo()
for i, feature in enumerate(prob_output_unet["features"]):
    unet_rice_prob = round(feature["properties"]["rice_prob"], 5)
    # Sum the non-rice class probabilities first, then round once to 5 decimals
    unet_other_prob = round(feature["properties"]["cropland_prob"] + feature["properties"]["forest_prob"] + feature["properties"]["others_prob"] + feature["properties"]["urban_prob"], 5)
    unet_data.append([unet_rice_prob, unet_other_prob])
    unet_rice_data.append(unet_rice_prob)
    unet_other_data.append(unet_other_prob)
    dnn_feature = prob_output_dnn["features"][i]
    dnn_rice_prob = round(dnn_feature["properties"]["rice_prob"], 5)
    dnn_other_prob = 1. - round(dnn_feature["properties"]["rice_prob"], 5)
    # dnn_other_prob = round(dnn_feature["properties"]["cropland_prob"] + dnn_feature["properties"]["forest_prob"] + dnn_feature["properties"]["others_prob"]+ dnn_feature["properties"]["urban_prob"], 5)
    dnn_data.append([dnn_rice_prob, dnn_other_prob])
    dnn_rice_data.append(dnn_rice_prob)
    dnn_other_data.append(dnn_other_prob)
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(8, 5))
title_fontsize = 22
label_fontsize = 10
tick_fontsize = 10
# notched box plot of the rice probabilities
bplot1 = ax1.boxplot([unet_rice_data, dnn_rice_data],
                     notch=True,  # notch shape
                     vert=True,  # vertical box alignment
                     patch_artist=True,  # fill with color
                     labels=["U-Net", "DNN"],  # will be used to label x-ticks
                     sym="k+")  # draw outliers as black plus markers
ax1.set_title("Rice Probability")
# notched box plot of the non-rice probabilities
bplot2 = ax2.boxplot([unet_other_data, dnn_other_data],
                     notch=True,  # notch shape
                     vert=True,  # vertical box alignment
                     patch_artist=True,  # fill with color
                     labels=["U-Net", "DNN"],  # will be used to label x-ticks
                     sym="k+")  # draw outliers as black plus markers
ax2.set_title("Other Probability")