Training repo for the ITU Ingenuity Cup: AI and Space Computing Challenge preliminary round submission by HydroSat Systems.
This repository is focused on Track 2: Space IntelligenceS Promoting Water Quality. The preliminary-round task is binary pixel-level water extraction from remote-sensing imagery. The competition metrics are:
mIoUKappa
- Team name: HydroSat Systems
- Team leader: Arv Bali
- Team members:
- Model family: SegFormer-B5
- Initialization: ADE20K pretrained backbone
- Winning ensemble members (kept in the shared Google Drive bundle, not committed to Git):
work_dirs/segformer_b5_train__prelim_water_fresh_ade_seed3407/best_mIoU_iter_20000.pthwork_dirs/segformer_b5_train__prelim_water_fresh_ade_seed6143/best_mIoU_iter_16000.pth
- Winning inference recipe:
- TTA scales:
0.75,1.0,1.25 - TTA flips:
none,horizontal,vertical,both - threshold:
0.45 - min connected-component size:
0 - fill small holes up to area
512
- TTA scales:
- Best local validation result:
mIoU = 0.889628192968432Kappa = 0.8801566605387882
Locked references:
artifacts/final_selection/segformer_current_champion.jsonartifacts/tuning/20260328_wave2_tta12_ms075_100_125/best.json
Final submission artifact:
submission/segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512/
That folder contains the checked-in competition-named ZIP plus the final water/ mask folder. The submission ZIP is flat and contains only the 216 PNG mask files required by Zero2X.
This repo intentionally keeps the minimum Git-tracked assets needed to retrain and document the winning result:
- all winning-workflow code, including the MMseg configs, in
hydrosat/ - winning tuning and selection metadata in
artifacts/final_selection/andartifacts/tuning/ - final mask folder and final ZIP in
submission/
This repo intentionally does not keep large or derived artifacts that are better shared outside Git:
- raw competition dataset in
dataset-preliminary round/ - pretrained and winning checkpoints under
checkpoints/andwork_dirs/ - local environments such as
openmmlab_env/ - normalized dataset copy under
artifacts/datasets/ - probability dumps under
artifacts/predictions/ - verbose logs
- stale checkpoints
- older draft submissions
The files shared outside Git are stored in this Google Drive folder. Alongside the shared Google Drive bundle with the saved checkpoints, a plain Git clone supports both a full retraining from the raw dataset and a faster inference-only recreation of the winning submission.
hydrosat/
cli/ # Python entrypoints
configs/ # mmseg dataset + model configs for the winning SegFormer run
core/ # shared helpers, metrics, mask ops
tools/ # environment bootstrap and validation helpers
requirements/ # pinned environment notes for OpenMMLab on Windows
dataset-preliminary round/ # optional local / shared-Drive full raw dataset, not versioned
checkpoints/ # optional local / shared-Drive checkpoint bundle, not versioned
work_dirs/ # optional local / shared-Drive winning checkpoints, not versioned
artifacts/
submission/
tests/
This project targets native Windows with NVIDIA CUDA and OpenMMLab.
- Python
3.10 - Git
- NVIDIA GPU with CUDA available to
nvidia-smi - CUDA Toolkit
13.0 - Visual Studio Build Tools 2022 with the C++ workload
Run the bootstrap helper from the repo root:
powershell -ExecutionPolicy Bypass -File .\hydrosat\tools\bootstrap_openmmlab_env.ps1This script:
- creates
openmmlab_env/ - installs the CUDA-enabled PyTorch stack
- builds the Windows MMCV ops setup we use
- installs
mmengine,mmsegmentation, andmmdetection - installs this repo in editable mode as
hydrosat
hydrosat.cli.train_model will automatically download the ADE20K SegFormer initialization checkpoint if checkpoints/segformer_mit_b5_ade20k.pth is missing.
Validate the environment:
.\openmmlab_env\Scripts\python.exe .\hydrosat\tools\check_openmmlab_env.py --require-cudaThe bootstrap installs the repo in editable mode with the small development extras needed for the local regression tests.
Run the lightweight regression tests:
.\openmmlab_env\Scripts\python.exe -m pytestThis path is not available from the Git clone alone.
Before running the commands below, copy the shared Google Drive checkpoint bundle into the repo so these files exist locally:
dataset-preliminary round/Test/Imagesdataset-preliminary round/Train/Imagesdataset-preliminary round/Train/Masksdataset-preliminary round/Val/Imagesdataset-preliminary round/Val/Maskswork_dirs/segformer_b5_train__prelim_water_fresh_ade_seed3407/best_mIoU_iter_20000.pthwork_dirs/segformer_b5_train__prelim_water_fresh_ade_seed6143/best_mIoU_iter_16000.pth
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.prepare_preliminary_round_dataset `
--raw-root ".\\dataset-preliminary round" `
--output-root .\artifacts\datasets\preliminary_round_water.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed3407\best_mIoU_iter_20000.pth `
--split test `
--data-root .\artifacts\datasets\preliminary_round_water `
--tta `
--tta-scales 0.75 1.0 1.25 `
--tta-flips none horizontal vertical both `
--save-probs `
--output-dir .\artifacts\predictions\segformer_fresh_seed3407_test_tta12_ms075_100_125
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed6143\best_mIoU_iter_16000.pth `
--split test `
--data-root .\artifacts\datasets\preliminary_round_water `
--tta `
--tta-scales 0.75 1.0 1.25 `
--tta-flips none horizontal vertical both `
--save-probs `
--output-dir .\artifacts\predictions\segformer_fresh_seed6143_test_tta12_ms075_100_125.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.export_ensemble_water_masks `
--probs-dir .\artifacts\predictions\segformer_fresh_seed3407_test_tta12_ms075_100_125\probs `
--probs-dir .\artifacts\predictions\segformer_fresh_seed6143_test_tta12_ms075_100_125\probs `
--threshold 0.45 `
--fill-hole-area 512 `
--output-root .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.package_submission `
--pred-dir .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512\water `
--team-name "<team-name>" `
--leader-name "<team-leader>" `
--email "<leader-email>" `
--phone "<leader-phone>" `
--output-dir .\submission\segformer_ensemble_3407_6143_tta12_ms075_100_125_thr045_hole512This path is not available from the Git clone alone.
Before running the commands below, copy the shared Google Drive checkpoint bundle into the repo so these files exist locally:
dataset-preliminary round/Test/Imagesdataset-preliminary round/Train/Imagesdataset-preliminary round/Train/Masksdataset-preliminary round/Val/Imagesdataset-preliminary round/Val/Masks
Use the same dataset prep command from the previous section.
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.train_model `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--data-root .\artifacts\datasets\preliminary_round_water `
--run-name prelim_water_fresh_ade_seed3407 `
--seed 3407 `
--max-iters 20000 `
--val-interval 2000 `
--device cuda
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.train_model `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--data-root .\artifacts\datasets\preliminary_round_water `
--run-name prelim_water_fresh_ade_seed6143 `
--seed 6143 `
--max-iters 20000 `
--val-interval 2000 `
--device cudaRun validation TTA exports for the two winning checkpoints:
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed3407\best_mIoU_iter_20000.pth `
--split val `
--data-root .\artifacts\datasets\preliminary_round_water `
--tta `
--tta-scales 0.75 1.0 1.25 `
--tta-flips none horizontal vertical both `
--save-probs `
--output-dir .\artifacts\predictions\segformer_fresh_seed3407_val_tta12_ms075_100_125
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.predict_probs `
--model segformer `
--config .\hydrosat\configs\segformer_water_binary_train.py `
--checkpoint .\work_dirs\segformer_b5_train__prelim_water_fresh_ade_seed6143\best_mIoU_iter_16000.pth `
--split val `
--data-root .\artifacts\datasets\preliminary_round_water `
--tta `
--tta-scales 0.75 1.0 1.25 `
--tta-flips none horizontal vertical both `
--save-probs `
--output-dir .\artifacts\predictions\segformer_fresh_seed6143_val_tta12_ms075_100_125Tune the winning ensemble recipe:
.\openmmlab_env\Scripts\python.exe -m hydrosat.cli.tune_segformer_ensemble `
--probs-dir .\artifacts\predictions\segformer_fresh_seed3407_val_tta12_ms075_100_125\probs `
--probs-dir .\artifacts\predictions\segformer_fresh_seed6143_val_tta12_ms075_100_125\probs `
--mask-dir .\artifacts\datasets\preliminary_round_water\val\masks `
--output-dir .\artifacts\tuning\20260328_wave2_tta12_ms075_100_125 `
--threshold-start 0.45 `
--threshold-stop 0.55 `
--threshold-step 0.005 `
--min-component-sizes 0 512 1024 1536 `
--fill-hole-areas 0 128 256 512The raw dataset is expected to keep this layout:
dataset-preliminary round/
Train/
Images/
Masks/
Val/
Images/
Masks/
Test/
Images/
The normalized working layout generated by hydrosat.cli.prepare_preliminary_round_dataset is:
artifacts/datasets/preliminary_round_water/
train/
images/
masks/
val/
images/
masks/
test/
images/
masks/
splits/
train.txt
val.txt
test.txt
- This repo is intentionally light because the raw dataset and large checkpoints are shared outside Git through Google Drive.
- The README does not publish the leader phone number or email; those details belong only in the competition ZIP filename.
- The authoritative winning references are
artifacts/final_selection/segformer_current_champion.jsonandartifacts/tuning/20260328_wave2_tta12_ms075_100_125/best.json.