
Public Urban Green Spaces (PUGS) Detection Workflow

This project provides a reproducible workflow for PUGS detection that integrates Sentinel-2 imagery and OpenStreetMap (OSM) data with a deep learning method, using open-source technology.

The study area in this project is Dresden, Germany.

Overview of the workflow

This workflow covers all steps from data acquisition to model evaluation and result analysis. The final output is a binary PUGS mask for the study area.

Flowchart of the workflow

The model architecture used for detecting PUGS is a U-Net with a ResNet-50 backbone. The model uses pre-trained weights from the TorchGeo library. See the Model documentation for more details about the model setup.

Project Organization

├── LICENSE             <- Open-source license of the project
├── README.md           <- The top-level README for users
├── config.yaml         <- All parameters used in the workflow
├── data                <- All input datasets used in the workflow, including both raw data 
│   │                      and processed data generated during data preparation steps
│   ├── processed       <- The intermediate data from raw data processing.
│   └── raw             <- The original, immutable data dump.
│
├── models              <- Trained and serialized (saved) models, model evaluations, and
│   │                      model hyperparameters log
│   ├── checkpoints     <- Best model files and training logs for each experiment
│   └── test_result     <- Model performance metrics from the best model
│
├── notebooks           <- Jupyter notebooks are named using the following convention:
│                          `<step_number>_<short_description>.ipynb`, e.g. `1_data_acquisition_osm`.
│
├── results             <- Results from model prediction
│   ├── clipped_prediction     <- Model prediction results clipped to Dresden administrative boundary 
│   ├── fn_maps                <- False negative (FN) area maps from model prediction
│   ├── fp_maps                <- False positive (FP) area maps from model prediction 
│   └── whole_area_prediction  <- Model prediction results
│
├── reports             <- Generated analysis (e.g. HTML, PDF, LaTeX)
│   ├── figures         <- Generated graphics and figures to be used in reporting
│   └── notebook html   <- Notebooks in HTML format
│   
├── pyproject.toml      <- Project metadata
├── uv.lock             <- Lockfile that contains information about project's dependencies
│
└── pugs_detection      <- Main Python package containing modules and utilities for PUGS detection workflow

Datasets

| Main input data | Source | Description | Data license |
| --- | --- | --- | --- |
| Sentinel-2 imagery | Copernicus Data Space Ecosystem | High-resolution, multi-spectral satellite imagery with 13 bands; the spatial resolution is 10 m, 20 m, or 60 m depending on the wavelength | Regulated under EU law (Commission Delegated Regulation (EU) No 1159/2013), which is based on a principle of full, open and free access. More details: Sentinel-2 Data Policy |
| OSM data | OSM | Areas/polygons related to green spaces, points of interest (POI) such as benches, and the footpath network | Open Data Commons Open Database License (ODbL) |
| Ground truth data | European Union's Copernicus Land Monitoring Service (CLMS) | Land cover and land use data for Functional Urban Areas (FUA); downloaded for the Dresden area only | Regulated under EU law (Commission Delegated Regulation (EU) No 1159/2013), which is based on a principle of full, open and free access. More details: CLMS Data Policy |
| Ground truth data | Dresden Open Data Portal | Datasets related to PUGS | dl-de/by-2-0 |
| Ground truth data | Author | Manually digitized boundary of Park Zwirnmühle | CC-BY-4.0 |

More details about the input data can be found in the Data folder documentation.

Hardware and Software Specifications

Windows Subsystem for Linux (WSL) is used to develop and run the workflow.

Host Operating System (OS): Windows 11 (64-bit OS, x64-based processor)
Workflow Environment: WSL2 (Ubuntu 22.04.3 LTS)
CPU: 13th Gen Intel(R) Core(TM) i7-13700H 2.40 GHz
GPU: NVIDIA RTX A500
RAM: 32 GB
Disk storage: 1 TB

uv (Python package and project manager) is used in this project. The version of uv is 0.7.3.

Estimated project size

The project size is approximately 5.7 GB.

  • data: approximately 1.64 GB
  • models: approximately 3.27 GB
  • code and others: approximately 0.79 GB

Getting Started

  1. Clone this repository or download ZIP file of this repository. To clone the repository, use the following command:

    git clone https://github.com/japanj/pugs-detection.git
    
  2. Download the data and model checkpoints for all experiments from https://doi.org/10.5281/zenodo.15596620

    Make sure that the data and models are placed following the folder structure shown in Project Organization.

  3. Make sure that you have Python and pip installed.

  4. Install uv package by running

    pip install uv
    

    Note: pipx install uv is also recommended, as it automatically installs uv in an isolated environment. You can visit the uv documentation for more details about uv installation.

    To install the exact version of uv used in this project, run pip install uv==0.7.3 or pipx install uv==0.7.3.

  5. Navigate to the root directory of the cloned repository. You should be in the directory that contains pyproject.toml.

  6. Set up the virtual environment by running

    uv sync
    

    The virtual environment (the .venv folder) will be created automatically in the project folder.

  7. You can verify the installed packages by running

    uv pip list
    

Steps to run the workflow

All notebooks are available in the notebooks folder. The number at the beginning of each notebook's name indicates the execution order. Each notebook has its own requirements and dependencies, which are described in the notebook itself.
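Because each notebook's leading number defines the execution order, the full sequence can be listed programmatically. A small sketch (the file names other than `1_data_acquisition_osm` are illustrative, not the repository's actual notebooks):

```python
def execution_order(filenames):
    """Sort notebook names by the numeric step prefix
    (<step_number>_<short_description>.ipynb)."""
    return sorted(filenames, key=lambda f: int(f.split("_", 1)[0]))

# Hypothetical notebook names following the repository's convention.
notebooks = [
    "2_preprocessing.ipynb",
    "10_evaluation.ipynb",
    "1_data_acquisition_osm.ipynb",
]
print(execution_order(notebooks))
# ['1_data_acquisition_osm.ipynb', '2_preprocessing.ipynb', '10_evaluation.ipynb']
```

Sorting on the parsed integer (rather than the raw string) keeps a hypothetical step 10 after step 2.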

Results

The results of the different model training experiments are shown in the table below. For details of the experiment setup, please visit the Model Experiment Setup document.

| Version folder | Input | Loss function | IoU (Jaccard index) | Precision | Recall | F1 score | Accuracy |
| --- | --- | --- | --- | --- | --- | --- | --- |
| version_0 | Sentinel-2 image | Jaccard loss | 0.7712 | 0.9056 | 0.8386 | 0.8708 | 0.9617 |
| version_1 | Sentinel-2 image + PUGS binary mask derived from OSM | Jaccard loss | 0.7777 | 0.9168 | 0.8368 | 0.8750 | 0.9632 |
| version_2 | Sentinel-2 image + SDT raster | Jaccard loss | 0.7547 | 0.8752 | 0.8458 | 0.8602 | 0.9577 |
| version_3 | Sentinel-2 image | BCE | 0.7557 | 0.8842 | 0.8386 | 0.8608 | 0.9583 |
| version_4 | Sentinel-2 image + PUGS binary mask derived from OSM | BCE | 0.7579 | 0.9082 | 0.8209 | 0.8623 | 0.9597 |
| version_5 | Sentinel-2 image + SDT raster | BCE | 0.7517 | 0.8853 | 0.8328 | 0.8582 | 0.9577 |
| version_6 | Sentinel-2 image | Focal loss | 0.7178 | 0.8655 | 0.8080 | 0.8358 | 0.9512 |
| version_7 | Sentinel-2 image + PUGS binary mask derived from OSM | Focal loss | 0.7430 | 0.8892 | 0.8189 | 0.8526 | 0.9564 |
| version_8 | Sentinel-2 image + SDT raster | Focal loss | 0.7236 | 0.8793 | 0.8034 | 0.8396 | 0.9528 |

BCE = Binary Cross Entropy

Based on these results, the best-performing model uses Sentinel-2 imagery together with a PUGS binary mask derived from OSM as input and is trained with the Jaccard loss function.
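For reference, the evaluation metrics reported above can all be derived from pixel-wise confusion counts between a predicted and a ground-truth binary mask. A minimal pure-Python sketch with toy masks (not project data):

```python
def binary_metrics(pred, truth):
    """Pixel-wise IoU, precision, recall, F1, and accuracy for flat binary masks."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))  # true positives
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))  # false positives
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))  # false negatives
    tn = sum(p == 0 and t == 0 for p, t in zip(pred, truth))  # true negatives
    return {
        "iou": tp / (tp + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / len(pred),
    }

# Toy example: 8 pixels (tp=3, fp=1, fn=1, tn=3).
pred  = [1, 1, 0, 0, 1, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1, 1, 0]
m = binary_metrics(pred, truth)
print({k: round(v, 3) for k, v in m.items()})
# {'iou': 0.6, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75, 'accuracy': 0.75}
```

The same FP and FN counts, mapped back to pixel locations, are what the `fp_maps` and `fn_maps` result folders visualize spatially.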

Example of the results

These are examples of model prediction outputs from the best-performing model among all experiments.

result_viz_1 result_viz_2

For the full details of model prediction output, please visit results folder.

Further analysis from the model prediction

For the result analysis, two models were selected: the best-performing baseline model (using only Sentinel-2 imagery) and the best-performing model that also uses additional data from OSM. These are the two selected models:

| Version folder | Input datasets | Loss function |
| --- | --- | --- |
| version_0 | Sentinel-2 image | Jaccard loss |
| version_1 | Sentinel-2 image + PUGS binary mask derived from OSM | Jaccard loss |

🔍 Key Findings

  • Model performance improves as green space size increases.
  • The model that uses both Sentinel-2 imagery and PUGS binary mask from OSM as input outperforms the one using only Sentinel-2 imagery.
  • Regional parks are the easiest to detect; both models achieve approximately 99% recall.
  • Small PUGS (e.g., pocket and neighbourhood parks) are more difficult to detect and have lower recall values.
  • Regional parks dominate the total green space area but make up only a small fraction of the total number of PUGS.
  • In contrast, small PUGS have the highest counts but contribute little to the total area.
  • The largest performance gap between the two models, i.e. the gain from the additional OSM data, is seen for small PUGS, especially pocket parks.
  • Conclusion: Incorporating additional data from OSM helps improve the detection of small PUGS.

| Type of PUGS | Size (ha) |
| --- | --- |
| Pocket park | <= 0.4 |
| Neighbourhood park | (0.4, 3] |
| Community park | (3, 10] |
| Urban park | (10, 80] |
| Regional park | > 80 |

'(' indicates an exclusive bound and ']' an inclusive bound. For example, (0.4, 3] means 0.4 < x <= 3.

Note: PUGS size categories adapted from (Byrne & Sipe, 2010; Choi et al., 2020; Şenik & Uzun, 2022).
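The size thresholds above translate directly into a classification rule. A small illustrative helper (the function name is my own, not part of the repository):

```python
def pugs_category(size_ha):
    """Classify a PUGS by area in hectares, following the size table
    ('(' exclusive, ']' inclusive)."""
    if size_ha <= 0.4:
        return "Pocket park"
    elif size_ha <= 3:
        return "Neighbourhood park"
    elif size_ha <= 10:
        return "Community park"
    elif size_ha <= 80:
        return "Urban park"
    else:
        return "Regional park"

print(pugs_category(0.4))  # Pocket park (0.4 is inclusive)
print(pugs_category(2.5))  # Neighbourhood park
print(pugs_category(120))  # Regional park
```

Chained `elif`s with `<=` bounds reproduce the exclusive/inclusive convention of the table: each branch only fires once the previous upper bound has been exceeded.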

result_analysis

To validate the outputs, please visit the Reproducibility document.

License

The license information is divided into three sections:

  1. This repository is released under MIT License.
  2. The data used in this project are under various licenses. Please visit the Data license section for more details about the license of each dataset.
  3. The model weights, prediction output, and all figures are licensed under CC-BY-4.0.

Contact

For questions or issues, please open an issue on GitHub or contact m.p.likitpanjamanon@student.utwente.nl

References
