| Metric | Score |
|---|---|
| Architecture | DeepLabV3+ with ResNet-101 encoder |
| Best mIoU (val) | 61.93% |
| Pixel Accuracy (val) | 86.10% |
| Best Epoch | 77 / 80 (trained 80 of 150 configured) |
| Image Size | 768 × 768 |

| Class | IoU (%) |
|---|---|
| Trees | 85.51 |
| Lush Bushes | 67.93 |
| Dry Grass | 69.04 |
| Dry Bushes | 48.39 |
| Ground Clutter | 39.83 |
| Flowers | 46.91 |
| Logs | 47.16 |
| Rocks | 49.93 |
| Landscape | 66.13 |
| Sky | 98.48 |
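The reported mIoU is the unweighted mean of the ten per-class IoUs above, which can be verified directly:

```python
# Sanity check: mIoU = unweighted mean of the per-class IoUs from the table.
per_class_iou = {
    "Trees": 85.51, "Lush Bushes": 67.93, "Dry Grass": 69.04,
    "Dry Bushes": 48.39, "Ground Clutter": 39.83, "Flowers": 46.91,
    "Logs": 47.16, "Rocks": 49.93, "Landscape": 66.13, "Sky": 98.48,
}
miou = sum(per_class_iou.values()) / len(per_class_iou)
print(f"{miou:.2f}")  # 61.93
```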
```
Final Submission/
├── README.md                  # This file
├── requirements.txt           # Python dependencies (pip)
├── environment.yml            # Conda environment spec
├── configs/
│   ├── deeplab_boost.yaml     # Best performing config (used for final model)
│   ├── deeplab_config.yaml    # Baseline DeepLabV3+ config
│   ├── base_config.yaml       # Base config template
│   └── unet_config.yaml       # Alternate U-Net config
├── src/
│   ├── train.py               # Training script
│   ├── test.py                # Testing & evaluation script
│   ├── model.py               # Model architecture builder
│   ├── dataset.py             # Dataset & augmentation pipeline
│   ├── losses.py              # Combined loss functions (CE + Dice + Focal)
│   ├── metrics.py             # IoU & pixel accuracy metrics
│   ├── utils.py               # Utilities (checkpointing, logging, scheduler)
│   ├── inference.py           # Single-image inference script
│   └── __init__.py
├── data/
│   └── class_mapping.json     # Class ID → index mapping with colors
├── checkpoints/
│   └── best.pth               # Best model weights (epoch 77, 61.93% mIoU)
└── outputs/
    ├── iou_scores.txt         # Detailed per-class IoU scores
    ├── confusion_matrix.png   # Normalized confusion matrix
    ├── loss_curve.png         # Training/validation loss over epochs
    ├── miou_curve.png         # mIoU progression over epochs
    ├── predictions/           # Test set predictions (segmentation masks)
    └── visualizations/        # Side-by-side visual comparisons
```
```bash
# Create and activate a virtual environment, then install dependencies
python -m venv venv
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate
pip install -r requirements.txt
```

Alternatively, with conda:

```bash
conda env create -f environment.yml
conda activate segmentation
```

Prerequisites:
- Python 3.10+
- CUDA-capable GPU (6GB+ VRAM recommended)
- PyTorch 2.0+ with CUDA support
Download the competition dataset and place it so the directory structure is:
```
data/
├── train/
│   ├── Color_Images/      # Training RGB images
│   └── Segmentation/      # Training ground-truth masks
├── val/
│   ├── Color_Images/      # Validation RGB images
│   └── Segmentation/      # Validation ground-truth masks
├── test/
│   └── Colour_Images/     # Test RGB images (no masks)
└── class_mapping.json
```
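`class_mapping.json` maps class colors to indices; a sketch of how the dataset pipeline might use it to convert RGB ground-truth masks to index masks (the colors and function name below are assumptions for illustration, not values from the actual file):

```python
import numpy as np

# Hypothetical color -> index table; the real values live in data/class_mapping.json.
COLOR_TO_INDEX = {
    (34, 139, 34): 0,     # e.g. Trees (assumed color)
    (135, 206, 235): 9,   # e.g. Sky (assumed color)
}

def rgb_mask_to_indices(mask, color_to_index, ignore_index=255):
    """Map each RGB pixel of a ground-truth mask to its class index."""
    out = np.full(mask.shape[:2], ignore_index, dtype=np.uint8)
    for color, idx in color_to_index.items():
        out[np.all(mask == np.array(color), axis=-1)] = idx
    return out
```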
Train the final model:

```bash
python src/train.py --config configs/deeplab_boost.yaml
```

To resume from a checkpoint:

```bash
python src/train.py --config configs/deeplab_boost.yaml --resume checkpoints/best.pth
```

Evaluate on the validation set:

```bash
python src/test.py --config configs/deeplab_boost.yaml --checkpoint checkpoints/best.pth --eval_val
```

Run inference on the test set:

```bash
python src/test.py --config configs/deeplab_boost.yaml --checkpoint checkpoints/best.pth
```

Predictions are saved to `outputs/predictions/`.

Test-Time Augmentation (TTA) — multi-scale + flip averaging for a ~0.5% mIoU boost:

```bash
python src/test.py --config configs/deeplab_boost.yaml --checkpoint checkpoints/best.pth --eval_val --tta
```

CRF Post-Processing — edge-aware refinement for sharper boundaries:

```bash
python src/test.py --config configs/deeplab_boost.yaml --checkpoint checkpoints/best.pth --eval_val --crf
```

- Evaluation: prints mIoU, pixel accuracy, and per-class IoU to the console; saves scores to `outputs/iou_scores.txt` and the confusion matrix to `outputs/confusion_matrix.png`.
- Test inference: saves predicted segmentation masks as PNG files in `outputs/predictions/` and visual overlays in `outputs/visualizations/`.
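The core idea behind the `--tta` flag, reduced to flip averaging only (a simplified sketch; the real script also averages over multiple scales, and `model_fn` is a stand-in for the actual model forward pass):

```python
import numpy as np

def predict_with_flip_tta(model_fn, image):
    """Average logits over the identity and a horizontal flip.

    image: (H, W, 3) array; model_fn returns (H, W, C) logits.
    The flipped prediction is un-flipped before averaging so that
    logits from both passes align pixel-for-pixel.
    """
    logits = model_fn(image)
    flipped = model_fn(image[:, ::-1, :])   # flip along the width axis
    logits = logits + flipped[:, ::-1, :]   # un-flip, then accumulate
    return logits / 2.0
```

For a flip-equivariant model this returns exactly the plain prediction; for a real network the two passes disagree slightly, and averaging smooths the result.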
| Parameter | Value | Purpose |
|---|---|---|
| Image size | 768×768 | Larger crops capture more context |
| Batch size | 3 (×4 gradient accumulation = effective 12) | Fits in 6GB VRAM with grad accumulation |
| Encoder | ResNet-101 (ImageNet pretrained) | Strong feature extractor |
| Optimizer | AdamW (lr=3e-4, wd=0.01) | Stable convergence |
| Scheduler | OneCycleLR (cos anneal) | Warm-up + aggressive LR decay |
| Loss | CE(0.4) + Dice(0.4) + Focal(0.2) | Balanced per-pixel + per-class learning |
| Augmentations | Flip, rotate(45°), scale(0.4), elastic, color jitter, random crop | Robustness to desert variation |
| Mixed precision | Enabled | 2× faster training, lower VRAM |
| Early stopping | 20 epochs patience | Prevents overfitting |
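The loss combination from the table can be sketched as follows. This is a hedged NumPy reconstruction written against `(C, H, W)` softmax probabilities and one-hot targets, not the actual `src/losses.py` (which operates on PyTorch logits); `gamma=2.0` for the focal term is an assumed default, while the 0.4/0.4/0.2 weights come from the table above:

```python
import numpy as np

EPS = 1e-6

def ce_loss(probs, onehot):
    pt = (probs * onehot).sum(axis=0)          # probability of the true class per pixel
    return -np.log(pt + EPS).mean()

def dice_loss(probs, onehot):
    inter = (probs * onehot).sum(axis=(1, 2))  # per-class overlap
    union = probs.sum(axis=(1, 2)) + onehot.sum(axis=(1, 2))
    return 1.0 - ((2.0 * inter + EPS) / (union + EPS)).mean()

def focal_loss(probs, onehot, gamma=2.0):      # gamma=2.0 is an assumed default
    pt = (probs * onehot).sum(axis=0)
    return (-((1.0 - pt) ** gamma) * np.log(pt + EPS)).mean()

def combined_loss(probs, onehot):
    # Weights from the hyperparameter table: CE(0.4) + Dice(0.4) + Focal(0.2)
    return (0.4 * ce_loss(probs, onehot)
            + 0.4 * dice_loss(probs, onehot)
            + 0.2 * focal_loss(probs, onehot))
```

CE and Focal push per-pixel accuracy while Dice directly optimizes per-class overlap, which helps the rarer classes (Ground Clutter, Flowers) that plain cross-entropy tends to neglect.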