This repository contains the full implementation of our vision-based navigation and intersection detection project using Webots digital twins, along with real-world deployment on the Crazyflie nano-drone.
The project leverages deep learning models, simulation environments, and automated data collection to explore robust navigation strategies.
Beyond simulation, we deployed the navigation pipeline on a Crazyflie nano-drone.
The workflow consists of:
- Streaming camera frames from the onboard Crazyflie camera.
- Real-time processing on an external computer using the trained CNN models.
- Automated control commands sent back to the drone for navigation.
- Logging the data into the standard left_right_forward folder format for further analysis.

In a test environment, the system detected intersections and navigated autonomously.
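One step of the workflow above can be sketched in a few lines of Python. This is only an illustration, not the code in this repository: the function names, the 0.5 threshold, the command strings, the priority order, and the reading of `left_right_forward` as a folder name built from three 0/1 flags are all assumptions made for the example.

```python
# Hypothetical sketch of one control step: CNN output -> labels -> command + log folder.
# Names, threshold, and folder naming are assumptions, not this repository's API.

THRESHOLD = 0.5  # assumed decision threshold for each direction


def to_labels(probs, threshold=THRESHOLD):
    """Turn per-direction probabilities into 0/1 flags (multi-label output)."""
    return {name: int(p >= threshold) for name, p in probs.items()}


def folder_name(labels):
    """Build a folder name in left_right_forward order, e.g. '1_0_1' (assumed format)."""
    return "_".join(str(labels[k]) for k in ("left", "right", "forward"))


def choose_command(labels):
    """Pick one command; the priority forward > left > right is an arbitrary choice."""
    for direction in ("forward", "left", "right"):
        if labels[direction]:
            return direction
    return "stop"  # no open direction detected


def is_intersection(labels):
    """Treat a frame as an intersection when more than one direction is open."""
    return sum(labels.values()) > 1


probs = {"left": 0.91, "right": 0.12, "forward": 0.77}  # made-up CNN output
labels = to_labels(probs)
print(folder_name(labels), choose_command(labels), is_intersection(labels))
```

In a real deployment the probabilities would come from the trained CNN for each streamed frame, and the chosen command would be translated into a setpoint for the drone.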
For more details, please visit the Webots page.
🗂️ Dataset is available at this page.
To understand and visualize the decision-making of our CNN models, we used Grad-CAM (Gradient-weighted Class Activation Mapping).
Grad-CAM highlights regions of the input image that contribute most to the network’s predictions, providing insight into which visual features the network focuses on during navigation and intersection detection.
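The core Grad-CAM computation is small enough to show on toy data. The sketch below uses plain nested lists instead of framework tensors and made-up values. It is not the script in gradCAM/, only an illustration of the published formula: each feature map is weighted by the spatial average of its gradient, the weighted maps are summed, and a ReLU keeps the positive evidence.

```python
# Grad-CAM on plain nested lists: activations[k][i][j] and gradients[k][i][j]
# for K feature maps of size H x W taken from one convolutional layer.
# In practice both come from a deep learning framework; the toy values here are made up.


def grad_cam(activations, gradients):
    """Return the H x W Grad-CAM map: ReLU(sum_k alpha_k * A_k)."""
    height, width = len(activations[0]), len(activations[0][0])
    # alpha_k: global average of the class-score gradient over feature map k
    alphas = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * fmap[i][j]
    # ReLU keeps only the regions that push the class score up
    return [[max(v, 0.0) for v in row] for row in cam]


def normalize(cam):
    """Scale the map to [0, 1] so it can be overlaid on the input image."""
    peak = max(max(row) for row in cam)
    return [[v / peak if peak > 0 else 0.0 for v in row] for row in cam]


activations = [
    [[1.0, 0.0], [0.0, 2.0]],  # feature map 0
    [[0.0, 3.0], [1.0, 0.0]],  # feature map 1
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # alpha_0 = 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # alpha_1 = -1.0
]
heatmap = normalize(grad_cam(activations, gradients))
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

Feature map 1 has a negative weight, so the locations it activates are suppressed by the ReLU. Only the regions supported by feature map 0 remain in the heatmap.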
- ResNet50 – High-capacity model used for detailed feature extraction. Grad-CAM analysis shows strong attention on lane markings and intersection cues.
- MobileNetV2 – Lightweight model suitable for real-time deployment on resource-limited hardware. Grad-CAM highlights key road features while maintaining efficiency.
| Left Grad-CAM | Forward Grad-CAM | Right Grad-CAM |
|---|---|---|
| ![]() | ![]() | ![]() |
🔹 Test Parameters on EgoCart Dataset (ImageNet Model).
| Metric | Our Test Data | EgoCart Data |
|---|---|---|
| Hamming Loss | 0.08 | 0.15 |
| Precision | 0.88 | 0.89 |
| Recall | 0.97 | 0.83 |
| F1-score | 0.92 | 0.86 |
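For readers unfamiliar with these multi-label metrics, the sketch below computes them from 0/1 label vectors. The example labels are made up, and micro-averaging (pooling every label of every image) is an assumption, since the averaging mode behind the table is not stated here.

```python
# Multi-label metrics on 0/1 label vectors, e.g. one [left, forward, right] vector per image.
# Micro-averaging is an assumption; the example labels below are made up.


def multilabel_metrics(y_true, y_pred):
    tp = fp = fn = wrong = total = 0
    for truth, pred in zip(y_true, y_pred):
        for t, p in zip(truth, pred):
            total += 1
            wrong += int(t != p)               # any mismatch counts toward Hamming loss
            tp += int(t == 1 and p == 1)
            fp += int(t == 0 and p == 1)
            fn += int(t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "hamming_loss": wrong / total,  # fraction of individual labels predicted wrongly
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)
```

As a consistency check, the F1-scores in the table match the harmonic mean of the reported precision and recall: 2 × 0.88 × 0.97 / (0.88 + 0.97) ≈ 0.92 and 2 × 0.89 × 0.83 / (0.89 + 0.83) ≈ 0.86.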
| Parameter | ResNet50 | MobileNetV2 | PULP-DroNetv3 (Quantized) |
|---|---|---|---|
| FLOPs | 4109470720 | 312917056 | 50445696 |
| Param size (MB) | 94.06 | 8.91 | 1.28 |
| Forward/backward pass size (MB) | 4173.73 | 1709.60 | 136.48 |
| Metric | ResNet50 (FP32) | ResNet50, Static Quantized (INT8) |
|---|---|---|
| Hamming Loss | 0.2500 | 0.2533 (+1.32%) |
| Precision | 0.6667 | 0.6633 (-0.51%) |
| Recall | 0.9362 | 0.9362 |
| F1-score | 0.7788 | 0.7765 (-0.3%) |
| Model size (MB) | 94.06 | 24.1 (-74.4%) |
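The percentages in parentheses are relative changes of the quantized model with respect to the FP32 baseline. A short sketch of that arithmetic, using the values from the table (the helper name `relative_change` is just for illustration):

```python
# Relative change of the INT8 model with respect to the FP32 baseline, in percent.
# Values are copied from the table above.


def relative_change(baseline, quantized):
    return (quantized - baseline) / baseline * 100.0


rows = {
    "Hamming Loss": (0.2500, 0.2533),
    "Precision": (0.6667, 0.6633),
    "Recall": (0.9362, 0.9362),
    "F1-score": (0.7788, 0.7765),
    "Model size (MB)": (94.06, 24.1),
}
for name, (fp32, int8) in rows.items():
    print(f"{name}: {relative_change(fp32, int8):+.2f}%")
```

Rounded to the table's precision, these values reproduce the reported changes. The size reduction of roughly 74% is close to the 4x shrink expected when 32-bit weights are stored as 8-bit integers.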
- webots/ – All scripts used in the Webots experiments.
- real_world/ – All scripts used in the real-world experiments.
- gradCAM/ – Grad-CAM script used to generate the results for ResNet50 and MobileNetV2.
- test_script/ – Scripts for validation on unseen data.
- Videos/ – Simulation and real-world demo videos.
- Trained Models – To access all the trained models, please visit this page.
- Supplementary Materials – To access, please visit this page.
⭐ If you find this project useful, please give it a star!
- Grad-CAM results are crucial for interpreting model behavior and validating that the network focuses on meaningful road and intersection features.
- Real-world experiments demonstrate that the simulation-to-reality pipeline works reliably, bridging the gap between Webots digital twins and physical hardware.
🎥 More simulation and real-world demo videos can be found in the Videos/ folder.







