HydroSat Systems

Training repo for the ITU Ingenuity Cup: AI and Space Computing Challenge preliminary round submission by HydroSat Systems.

This repository is focused on Track 2: Space IntelligenceS Promoting Water Quality. The preliminary-round task is binary pixel-level water extraction from remote-sensing imagery. The competition metrics are:

mIoU
Kappa

Team

Team name: HydroSat Systems
Team leader: Arv Bali
Team members:
- Arv Bali
- Mrinank Sivakumar

Final Winning Result

Model family: SegFormer-B5
Initialization: ADE20K pretrained backbone
Winning ensemble members (kept in the shared Google Drive bundle, not committed to Git):
- work_dirs/segformer_b5_train__prelim_water_fresh_ade_seed3407/best_mIoU_iter_20000.pth
- work_dirs/segformer_b5_train__prelim_water_fresh_ade_seed6143/best_mIoU_iter_16000.pth
Winning inference recipe:
- TTA scales: 0.75, 1.0, 1.25
- TTA flips: none, horizontal, vertical, both
- threshold: 0.45
- min connected-component size: 0
- fill small holes up to area 512
Best local validation result:
- mIoU = 0.889628192968432
- Kappa = 0.8801566605387882

Locked references:

artifacts/final_selection/segformer_current_champion.json
artifacts/tuning/20260328_wave2_tta12_ms075_100_125/best.json

Final submission artifact:

submission/segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512/

That folder contains the checked-in competition-named ZIP plus the final water/ mask folder. The submission ZIP is flat and contains only the 216 PNG mask files required by Zero2X.

What Is Versioned

This repo intentionally keeps the minimum Git-tracked assets needed to retrain and document the winning result:

all winning-workflow code, including the MMseg configs, in hydrosat/
winning tuning and selection metadata in artifacts/final_selection/ and artifacts/tuning/
final mask folder and final ZIP in submission/

This repo intentionally does not keep large or derived artifacts that are better shared outside Git:

raw competition dataset in dataset-preliminary round/
pretrained and winning checkpoints under checkpoints/ and work_dirs/
local environments such as openmmlab_env/
normalized dataset copy under artifacts/datasets/
probability dumps under artifacts/predictions/
verbose logs
stale checkpoints
older draft submissions

The files shared outside Git are stored in this Google Drive folder. Alongside the shared Google Drive bundle with the saved checkpoints, a plain Git clone supports both a full retraining from the raw dataset and a faster inference-only recreation of the winning submission.

Repository Layout

hydrosat/
  cli/                      # Python entrypoints
  configs/                  # mmseg dataset + model configs for the winning SegFormer run
  core/                     # shared helpers, metrics, mask ops
  tools/                    # environment bootstrap and validation helpers
requirements/               # pinned environment notes for OpenMMLab on Windows
dataset-preliminary round/  # optional local / shared-Drive full raw dataset, not versioned
checkpoints/                # optional local / shared-Drive checkpoint bundle, not versioned
work_dirs/                  # optional local / shared-Drive winning checkpoints, not versioned
artifacts/
submission/
tests/

Windows Install

This project targets native Windows with NVIDIA CUDA and OpenMMLab.

Prerequisites

Python 3.10
Git
NVIDIA GPU with CUDA available to nvidia-smi
CUDA Toolkit 13.0
Visual Studio Build Tools 2022 with the C++ workload

Bootstrap The Environment

Run the bootstrap helper from the repo root:

powershell -ExecutionPolicy Bypass -File .\hydrosat\tools\bootstrap_openmmlab_env.ps1

This script:

creates openmmlab_env/
installs the CUDA-enabled PyTorch stack
builds the Windows MMCV ops setup we use
installs mmengine, mmsegmentation, and mmdetection
installs this repo in editable mode as hydrosat

hydrosat.cli.train_model will automatically download the ADE20K SegFormer initialization checkpoint if checkpoints/segformer_mit_b5_ade20k.pth is missing.

Validate the environment:

.\openmmlab_env\Scripts\python.exe .\hydrosat\tools\check_openmmlab_env.py --require-cuda

The bootstrap installs the repo in editable mode with the small development extras needed for the local regression tests.

Run the lightweight regression tests:

.\openmmlab_env\Scripts\python.exe -m pytest

Quick Reproduction With Shared Drive Checkpoints

This path is not available from the Git clone alone.

Before running the commands below, copy the shared Google Drive checkpoint bundle into the repo so these files exist locally:

dataset-preliminary round/Test/Images
dataset-preliminary round/Train/Images
dataset-preliminary round/Train/Masks
dataset-preliminary round/Val/Images
dataset-preliminary round/Val/Masks
work_dirs/segformer_b5_train__prelim_water_fresh_ade_seed3407/best_mIoU_iter_20000.pth
work_dirs/segformer_b5_train__prelim_water_fresh_ade_seed6143/best_mIoU_iter_16000.pth

1. Prepare the normalized binary dataset

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.prepare_preliminary_round_dataset `
  --raw-root ".\\dataset-preliminary round" `
  --output-root .\artifacts\datasets\preliminary_round_water

2. Export test probabilities for each winning checkpoint

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed3407\best_mIoU_iter_20000.pth `
  --split test `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --tta `
  --tta-scales 0.75 1.0 1.25 `
  --tta-flips none horizontal vertical both `
  --save-probs `
  --output-dir .\artifacts\predictions\segformer_fresh_seed3407_test_tta12_ms075_100_125

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed6143\best_mIoU_iter_16000.pth `
  --split test `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --tta `
  --tta-scales 0.75 1.0 1.25 `
  --tta-flips none horizontal vertical both `
  --save-probs `
  --output-dir .\artifacts\predictions\segformer_fresh_seed6143_test_tta12_ms075_100_125

3. Export the final binary masks from the ensemble

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.export_ensemble_water_masks `
  --probs-dir .\artifacts\predictions\segformer_fresh_seed3407_test_tta12_ms075_100_125\probs `
  --probs-dir .\artifacts\predictions\segformer_fresh_seed6143_test_tta12_ms075_100_125\probs `
  --threshold 0.45 `
  --fill-hole-area 512 `
  --output-root .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512

4. Package the final flat Zero2X ZIP

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.package_submission `
  --pred-dir .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512\water `
  --team-name "<team-name>" `
  --leader-name "<team-leader>" `
  --email "<leader-email>" `
  --phone "<leader-phone>" `
  --output-dir .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512

Retrain The Winning Model From Scratch

This path is not available from the Git clone alone.

Before running the commands below, copy the shared Google Drive checkpoint bundle into the repo so these files exist locally:

dataset-preliminary round/Test/Images
dataset-preliminary round/Train/Images
dataset-preliminary round/Train/Masks
dataset-preliminary round/Val/Images
dataset-preliminary round/Val/Masks

1. Prepare the normalized binary dataset

Use the same dataset prep command from the previous section.

2. Train the two winning seeds

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.train_model `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --run-name prelim_water_fresh_ade_seed3407 `
  --seed 3407 `
  --max-iters 20000 `
  --val-interval 2000 `
  --device cuda

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.train_model `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --run-name prelim_water_fresh_ade_seed6143 `
  --seed 6143 `
  --max-iters 20000 `
  --val-interval 2000 `
  --device cuda

3. Recreate the winning validation comparison

Run validation TTA exports for the two winning checkpoints:

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed3407\best_mIoU_iter_20000.pth `
  --split val `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --tta `
  --tta-scales 0.75 1.0 1.25 `
  --tta-flips none horizontal vertical both `
  --save-probs `
  --output-dir .\artifacts\predictions\segformer_fresh_seed3407_val_tta12_ms075_100_125

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
  --model segformer `
  --config .\hydrosat\configs\segformer_water_binary_train.py `
  --checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed6143\best_mIoU_iter_16000.pth `
  --split val `
  --data-root .\artifacts\datasets\preliminary_round_water `
  --tta `
  --tta-scales 0.75 1.0 1.25 `
  --tta-flips none horizontal vertical both `
  --save-probs `
  --output-dir .\artifacts\predictions\segformer_fresh_seed6143_val_tta12_ms075_100_125

Tune the winning ensemble recipe:

.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.tune_segformer_ensemble `
  --probs-dir .\artifacts\predictions\segformer_fresh_seed3407_val_tta12_ms075_100_125\probs `
  --probs-dir .\artifacts\predictions\segformer_fresh_seed6143_val_tta12_ms075_100_125\probs `
  --mask-dir .\artifacts\datasets\preliminary_round_water\val\masks `
  --output-dir .\artifacts\tuning\20260328_wave2_tta12_ms075_100_125 `
  --threshold-start 0.45 `
  --threshold-stop 0.55 `
  --threshold-step 0.005 `
  --min-component-sizes 0 512 1024 1536 `
  --fill-hole-areas 0 128 256 512

Dataset Layout

The raw dataset is expected to keep this layout:

dataset-preliminary round/
  Train/
    Images/
    Masks/
  Val/
    Images/
    Masks/
  Test/
    Images/

The normalized working layout generated by hydrosat.cli.prepare_preliminary_round_dataset is:

artifacts/datasets/preliminary_round_water/
  train/
    images/
    masks/
  val/
    images/
    masks/
  test/
    images/
    masks/
  splits/
    train.txt
    val.txt
    test.txt

Notes

This repo is intentionally light because the raw dataset and large checkpoints are shared outside Git through Google Drive.
The README does not publish the leader phone number or email; those details belong only in the competition ZIP filename.
The authoritative winning references are artifacts/final_selection/segformer_current_champion.json and artifacts/tuning/20260328_wave2_tta12_ms075_100_125/best.json.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
artifacts		artifacts
hydrosat		hydrosat
requirements		requirements
submission/segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512		submission/segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HydroSat Systems

Team

Final Winning Result

What Is Versioned

Repository Layout

Windows Install

Prerequisites

Bootstrap The Environment

Quick Reproduction With Shared Drive Checkpoints

1. Prepare the normalized binary dataset

2. Export test probabilities for each winning checkpoint

3. Export the final binary masks from the ensemble

4. Package the final flat Zero2X ZIP

Retrain The Winning Model From Scratch

1. Prepare the normalized binary dataset

2. Train the two winning seeds

3. Recreate the winning validation comparison

Dataset Layout

Notes

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HydroSat Systems

Team

Final Winning Result

What Is Versioned

Repository Layout

Windows Install

Prerequisites

Bootstrap The Environment

Quick Reproduction With Shared Drive Checkpoints

1. Prepare the normalized binary dataset

2. Export test probabilities for each winning checkpoint

3. Export the final binary masks from the ensemble

4. Package the final flat Zero2X ZIP

Retrain The Winning Model From Scratch

1. Prepare the normalized binary dataset

2. Train the two winning seeds

3. Recreate the winning validation comparison

Dataset Layout

Notes

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages