Transfer Learning for Tree Classification in Ghana

Distinguishing natural and agricultural tree systems is critical for monitoring ecosystem services, commodity-driven deforestation, and restoration progress. From a remote sensing standpoint, this task is particularly challenging because:

Certain trees exhibit high spectral similarity
Highly heterogeneous, smallholder agricultural landscapes require a small minimum mapping unit to detect differences
Persistent cloud cover and atmospheric haze reduce optical image quality in many regions

The distinction is critical for restoration monitoring applications, as gains in tree cover cannot be meaningfully assessed without understanding whether they result from a successful restoration intervention or agricultural expansion.

Research Summary

This project applies a transfer learning approach to classify tree-based land use systems from satellite imagery. We leverage spatial embeddings extracted from a high-performing convolutional neural network originally trained for tree cover mapping (Brandt et al., 2023) and repurpose them for land use classification.

We train a CatBoost classifier using a combination of Sentinel-1 and Sentinel-2 imagery, gray-level co-occurrence matrix (GLCM) texture features, and extracted spatial embeddings to classify four land use classes: natural, agroforestry, monoculture, and other (background). Through comparative modeling and feature selection, we demonstrate consistent performance gains from incorporating both transfer-learned features and texture information.
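As a rough illustration of how such a feature set might be assembled, the sketch below stacks four per-pixel rasters into a single feature matrix. The function name `stack_features`, the band counts, the tile size, and the integer label order are all assumptions made for this example; they are not taken from the Technical Note or the repository code.

```python
import numpy as np

# Hypothetical label order; the actual encoding used in the project is not stated here.
CLASS_NAMES = ["natural", "agroforestry", "monoculture", "other"]


def stack_features(s1, s2, glcm, embeddings):
    """Flatten co-registered (H, W, C) rasters into one (H*W, total_C) matrix."""
    layers = [s1, s2, glcm, embeddings]
    height, width = s1.shape[:2]
    for layer in layers:
        if layer.shape[:2] != (height, width):
            raise ValueError("all layers must share the same height and width")
    # Concatenate along the channel axis, then turn each pixel into one row.
    stacked = np.concatenate(layers, axis=-1)
    return stacked.reshape(height * width, stacked.shape[-1])


# Random stand-in data; channel counts are illustrative assumptions only.
rng = np.random.default_rng(0)
h, w = 14, 14
s1 = rng.random((h, w, 2))     # Sentinel-1 backscatter channels
s2 = rng.random((h, w, 10))    # Sentinel-2 reflectance bands
glcm = rng.random((h, w, 4))   # GLCM texture statistics
emb = rng.random((h, w, 64))   # spatial embeddings from the tree cover network

X = stack_features(s1, s2, glcm, emb)
y = rng.integers(0, len(CLASS_NAMES), size=h * w)
print(X.shape, y.shape)
```

A matrix like `X` with labels `y` is the kind of input a gradient-boosted classifier accepts; with CatBoost installed, `CatBoostClassifier().fit(X, y)` would be one way to train on it.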

In collaboration with Ghana’s Environmental Protection Agency, the method is demonstrated across 26 priority districts, resulting in a 10-meter resolution land use map for 2020. Overall, the findings suggest that spatial embeddings learned for tree detection retain meaningful information about land use structure, offering a scalable pathway for broader monitoring of natural and agricultural tree systems.

Download the paper: Technical Note
View the data: Ghana EPA Monitoring Portal (toggle on WRI Land Use)

Suggested citation:
Ertel, J., J. Brandt, R. Rognstad, and E. Glen (2025). Transfer learning to detect natural, monoculture, and agroforestry tree-based systems in Ghana using remote sensing. Technical Note. Washington, DC: World Resources Institute. doi:10.46830/writn.24.00030

Machine Learning Pipeline

The predictive features for the classification task are Sentinel-1 and Sentinel-2 imagery, spatial embeddings, and texture features. The texture features were derived from Sentinel-2 imagery using GLCM analysis. The figure below shows the machine learning pipeline, including pre- and post-processing steps.
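To make the GLCM step more concrete, here is a minimal sketch of a gray-level co-occurrence matrix and two common texture statistics computed from it (contrast and homogeneity). It is a generic textbook formulation, not the project's implementation: the function names, the single horizontal offset, and the 4-level toy image are assumptions chosen for readability.

```python
import numpy as np


def glcm(image, levels, dy=0, dx=1):
    """Symmetric, normalized co-occurrence matrix for one non-negative offset."""
    image = np.asarray(image)
    rows, cols = image.shape
    # Reference pixels and their neighbours at offset (dy, dx).
    ref = image[:rows - dy, :cols - dx]
    nbr = image[dy:, dx:]
    counts = np.zeros((levels, levels), dtype=float)
    # Count every (reference level, neighbour level) pair.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)
    # Count each pair in both directions so the matrix is symmetric.
    counts = counts + counts.T
    return counts / counts.sum()


def texture_features(p):
    """Contrast and homogeneity of a normalized GLCM."""
    i, j = np.indices(p.shape)
    diff2 = (i - j) ** 2
    return {
        "contrast": float((p * diff2).sum()),
        "homogeneity": float((p / (1.0 + diff2)).sum()),
    }


# Toy image already quantized to 4 gray levels (0..3).
toy = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
])
P = glcm(toy, levels=4)
features = texture_features(P)
print(features)
```

In practice, statistics like these are usually computed in a moving window over a quantized band so that each pixel receives its own texture values. Libraries such as scikit-image provide `graycomatrix` and `graycoprops` for the same purpose.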

DVC Setup & Directory Structure

The src directory is designed to integrate with Data Version Control (DVC) to ensure reproducibility and efficient management of the project's machine learning pipeline.

This directory contains modular scripts and functions organized to support a DVC workflow. Each script performs a specific task within the pipeline, such as data preparation, model training, or evaluation. The pipeline stages are connected through DVC.

Why DVC?

Clear separation of pipeline stages.
Improved tracking for dependencies, outputs, and metadata.
Compatible with YAML-based pipeline configurations.

Directory Structure

/src
├── stage_load_data.py          # Script for ingesting raw data
├── stage_prep_features.py      # Script for data cleaning, transformation, and feature engineering
├── stage_train_model.py        # Script to train machine learning models
├── stage_select_and_tune.py    # Script to perform hyperparameter tuning and feature selection
├── stage_evaluate_model.py     # Script to evaluate model performance
└── transfer_learning.py        # Script for running inference

Using DVC Pipelines

The pipeline is defined in the dvc.yaml file, with dependencies, parameters, and outputs explicitly stated for each stage.

Parameters Management

Pipeline parameters are defined in params.yaml. This file centralizes hyperparameters and configuration options for each stage of the pipeline. Update parameters as needed, and rerun the pipeline using dvc repro to propagate changes.
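The idea behind rerunning only what changed can be sketched in a few lines. The example below is a simplified illustration of the concept, not DVC's actual implementation: it fingerprints a stage from the MD5 hashes of its dependency files plus its parameter values, then compares that fingerprint with a recorded one (DVC keeps its own record in dvc.lock). The helper names, the file name `features.csv`, and the parameters `learning_rate` and `depth` are hypothetical.

```python
import copy
import hashlib
import tempfile
from pathlib import Path


def file_md5(path):
    """Content hash of a single dependency file."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def stage_fingerprint(deps, params):
    """Snapshot of a stage's dependency hashes and parameter values."""
    return {
        "deps": {str(p): file_md5(p) for p in deps},
        # Deep copy so later edits to params do not alter the snapshot.
        "params": copy.deepcopy(params),
    }


def needs_rerun(deps, params, recorded):
    """A stage is stale when its current fingerprint differs from the record."""
    return stage_fingerprint(deps, params) != recorded


with tempfile.TemporaryDirectory() as tmp:
    features = Path(tmp) / "features.csv"
    features.write_text("ndvi,vv,label\n0.7,-9.1,1\n")
    params = {"train": {"learning_rate": 0.1, "depth": 6}}

    recorded = stage_fingerprint([features], params["train"])
    unchanged = needs_rerun([features], params["train"], recorded)

    # Changing a hyperparameter makes the stage stale.
    params["train"]["depth"] = 8
    after_param_change = needs_rerun([features], params["train"], recorded)

    # Restoring the parameter but editing the data also makes it stale.
    params["train"]["depth"] = 6
    features.write_text("ndvi,vv,label\n0.7,-9.1,2\n")
    after_data_change = needs_rerun([features], params["train"], recorded)

print(unchanged, after_param_change, after_data_change)
```

This is why editing a value in params.yaml and running dvc repro re-executes the stages that depend on that value along with anything downstream of them, while unaffected stages are skipped.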

