Urban Maintenance Reporting and Monitoring System Utilizing Zero-Shot Learning Models and Vehicle Cameras
- Overview
- Key Features
- System Architecture
- Technology Stack
- Installation & Setup
- Usage
- API Documentation
- Performance Metrics
- Deployment
- Contributing
- License
StreetGuard is an advanced AI-powered urban maintenance monitoring system that leverages cutting-edge computer vision and zero-shot learning models to autonomously detect and report urban infrastructure issues in real-time. The system addresses critical challenges in modern urban maintenance by providing automated, data-driven insights for city officials to prioritize repairs and allocate resources efficiently.
Urban infrastructure maintenance faces persistent challenges:
- Road Safety Hazards: Potholes, road damage, and traffic obstructions
- Public Safety Issues: Illegal encampments and vandalism
- Resource Inefficiency: Traditional manual inspection methods are slow and labor-intensive
- Lack of Real-time Data: Delayed response times due to insufficient monitoring
StreetGuard introduces a novel hybrid detection pipeline combining:
- Open-Set Models: GroundingDINO and OWL-ViT for zero-shot anomaly detection
- Specialized Models: Fine-tuned YOLOv8 for high-speed, high-accuracy detection
- Real-time Processing: Vehicle-mounted camera simulation via Google Street View API
- Intelligent Analytics: LLM-powered insights and predictive maintenance recommendations
- GroundingDINO + BLIP + CrossEncoder: Zero-shot detection with semantic validation
- OWL-ViT + BLIP + CrossEncoder: Alternative open-vocabulary detection approach
- YOLOv8 Specialized Models: High-speed detection for road damage, graffiti, and encampments
- Real-time Google Maps integration with custom markers
- Dynamic heatmaps showing anomaly density across urban areas
- Live streaming simulation with multi-directional camera views
- Comprehensive analytics dashboard with charts and trend analysis
- Google Gemini LLM integration for automated report generation
- Natural language query interface for anomaly analysis
- Predictive analytics for maintenance prioritization
- Semantic understanding of urban maintenance contexts
- Real-time detection event tracking
- Historical trend analysis and reporting
- Task assignment and progress monitoring
- Performance metrics and system health monitoring

Interactive map with markers and live stream window.

Detection events with detailed anomaly information.

LLM chatbot assistant, heatmap visualization, and analytics graphs.

Task assigning, tracking, and completion workflow.
Watch a quick demo showcasing StreetGuard in action:
Note: Clicking the badge will open the demo video in Google Drive.
Our detection workflow combines multiple AI models in a hybrid pipeline to balance flexibility, precision, and speed. The process begins with GroundingDINO or OWL-ViT, which perform zero-shot anomaly detection using natural language prompts. Detected regions are then passed to BLIP, which generates captions, and a CrossEncoder validates the semantic alignment between the caption and the target label. This semantic filtering step significantly reduces false positives by discarding ambiguous detections. In parallel, specialized YOLOv8 models handle common recurring issues such as graffiti, road damage, and encampments, providing high-speed and accurate results. Together, this workflow ensures reliable detection of both known and novel hazards, enabling real-time and scalable urban maintenance monitoring:contentReference[oaicite:0]{index=0}.
Our evaluation shows that the open-set hybrid pipeline (GroundingDINO + BLIP + CrossEncoder) consistently outperforms standalone models and the YOLOv8 baseline in terms of precision and balanced effectiveness. By adding a semantic validation layer, the pipeline dramatically reduces false positivesβachieving precision of ~0.72 for tents and ~0.68 for graffitiβwhile still maintaining competitive F1 scores. Although recall drops due to stricter filtering, the overall reliability and consistency of this hybrid approach make it more practical for real-world deployment, where accurate and trustworthy alerts are critical for municipal operations.
- React.js 18+: Modern component-based UI framework
- Google Maps API: Interactive mapping and visualization
- Socket.IO Client: Real-time communication
- Recharts: Data visualization and analytics
- Framer Motion: Smooth animations and transitions
- Python 3.8+: Core application logic
- Flask 2.0+: RESTful API framework
- Flask-SocketIO: Real-time bidirectional communication
- SQLAlchemy: Database ORM and management
- Google Gemini API: Large language model integration
- GroundingDINO: Zero-shot object detection
- OWL-ViT: Open-vocabulary object detection
- YOLOv8: Specialized anomaly detection models
- BLIP: Image captioning and understanding
- CrossEncoder: Semantic alignment validation
- AWS EC2: Scalable cloud computing
- AWS S3: Object storage for images and data
- MySQL: Relational database management
- Redis: High-performance caching layer
- Docker: Containerization and deployment
- Python 3.8+
- Node.js 16+
- MySQL 8.0+
- Redis 6.0+
- Google Cloud Platform Account (for Street View API)
- AWS Account (for S3 and EC2 deployment)
-
Clone the repository
git clone https://github.com/your-username/CMPE295-Autonomous-Detection.git cd CMPE295-Autonomous-Detection/backend -
Create virtual environment
python -m venv venv # Windows venv\Scripts\activate # macOS/Linux source venv/bin/activate
-
Install dependencies
pip install -r requirements.txt
-
Configure environment variables
cp .env.example .env # Edit .env with your configuration -
Initialize database
python -c "from app import db; db.create_all()" -
Start the backend server
python app.py
-
Navigate to frontend directory
cd ../frontend -
Install dependencies
npm install
-
Configure environment variables
cp .env.example .env # Edit .env with your API keys and configuration -
Start development server
npm start
- URL:
http://localhost:3000(Frontend) - API Base:
http://localhost:5000/api(Backend) - Admin Panel: Available for authorized users with administrative privileges
# Example: Start detection stream
POST /api/stream/start
{
"start_coords": {"lat": 37.7749, "lng": -122.4194},
"end_coords": {"lat": 37.7849, "lng": -122.4094},
"model": "grounding_dino",
"detection_types": ["graffiti", "road_damage", "tent"]
}- Live Stream: Monitor real-time detection results
- Interactive Map: View detected anomalies with detailed information
- Heatmap: Visualize anomaly density across urban areas
- Dashboard: Comprehensive overview of system performance
- Charts: Trend analysis and statistical insights
- LLM Reports: AI-generated summaries and recommendations
StreetGuard provides a comprehensive RESTful API for urban anomaly detection, task management, and analytics. All endpoints are prefixed with /api unless otherwise specified.
| Endpoint | Method | Description | Request Body | Response |
|---|---|---|---|---|
/auth/signup/ |
POST | User registration | {"email": "string", "password": "string", "firstname": "string", "lastname": "string"} |
{"message": "Signup successful"} |
/auth/login/ |
POST | User authentication | {"email": "string", "password": "string"} |
{"id": int, "message": "Login successful", "email": "string", "firstname": "string", "lastname": "string", "role": "string", "token": "string"} |
| Endpoint | Method | Description | Query Parameters | Response |
|---|---|---|---|---|
/anomalies |
GET | Retrieve all detection events with metadata | None | Array of events with nested images and metadata |
/anomalies/<event_id> |
DELETE | Delete specific detection event | event_id (path) |
{"message": "Event and its related tasks deleted"} |
/anomalies/images/<image_id> |
DELETE | Delete specific detection image | image_id (path) |
{"message": "Image and its related tasks deleted"} |
/anomalies/metadata/<metadata_id> |
DELETE | Delete specific detection metadata | metadata_id (path) |
{"message": "Metadata and its related tasks deleted"} |
/anomalies/stats |
GET | Get anomaly type statistics | None | {"road_damage": int, "graffiti": int, "tent": int} |
| Endpoint | Method | Description | Query Parameters | Response |
|---|---|---|---|---|
/heatmap/data |
GET | Get heatmap data for visualization | type (optional): graffiti, tent, road damage |
Array of {"lat": float, "lng": float, "weight": int, "type": "string"} |
| Endpoint | Method | Description | Query Parameters | Response |
|---|---|---|---|---|
/chart-data |
GET | Get time-series chart data | type: anomaly type, start: start date, end: end date |
Array of {"date": "string", "count": int} |
| Endpoint | Method | Description | Request Body | Response |
|---|---|---|---|---|
/llm-query |
POST | AI-powered data analysis | {"context": "string", "message": "string"} |
{"reply": "string"} |
/llm-summarize |
POST | Generate automated anomaly summaries | None | {"reply": "string"} |
| Endpoint | Method | Description | Query Parameters | Request Body | Response |
|---|---|---|---|---|---|
/tasks |
GET | Get paginated task list | page, per_page, status, worker_id, sort, order |
None | Paginated tasks with metadata |
/tasks/<task_id> |
PUT | Update specific task | task_id (path) |
{"status": "string", "worker_id": int} |
{"message": "Task updated successfully", "task_id": int} |
/tasks/bulk |
PUT | Bulk update multiple tasks | None | Array of task updates | Array of update results |
/tasks/getWorkers |
GET | Get available workers | None | None | Array of worker objects |
/tasks/<task_id>/label |
PUT | Update task label | task_id (path) |
{"label": "string"} |
{"message": "Label updated successfully", "task_id": int, "new_label": "string"} |
| Endpoint | Method | Description | Query Parameters | Request Body | Response |
|---|---|---|---|---|---|
/getAssignedTasks |
GET | Get tasks assigned to worker | user_id |
None | Array of assigned tasks |
/startTasks |
POST | Mark tasks as in progress | None | {"task_ids": [int], "user_id": int} |
Summary of updated tasks |
/completeTasks |
POST | Mark tasks as completed | None | {"task_ids": [int], "user_id": int} |
Summary of completed tasks |
| Event | Direction | Description | Data Payload | Response |
|---|---|---|---|---|
start_stream |
Client β Server | Initialize detection stream | {"userId": int, "startLatInput": float, "startLngInput": float, "endLatInput": float, "endLngInput": float, "num_points": int, "model": "string"} |
Stream of detection results with images and metadata |
| Code | Description |
|---|---|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 404 | Not Found |
| 409 | Conflict |
| 500 | Internal Server Error |
{
"id": "int",
"latitude": "string",
"longitude": "string",
"timestamp": "ISO 8601 string",
"street": "string",
"city": "string",
"state": "string",
"zipcode": "string",
"images": ["DetectionImage"]
}{
"id": "int",
"direction": "string (front|back|left|right)",
"image_url": "string (S3 URL)",
"metadatas": ["DetectionMetadata"]
}{
"id": "int",
"X1_loc": "float",
"Y1_loc": "float",
"X2_loc": "float",
"Y2_loc": "float",
"label": "string",
"score": "float",
"type": "string (graffiti|tent|road_damage)",
"caption": "string"
}{
"task_id": "int",
"status": "string (created|assigned|in_progress|completed)",
"worker_id": "int",
"notes": "string",
"scheduled_time": "ISO 8601 string",
"created_at": "ISO 8601 string",
"updated_at": "ISO 8601 string"
}page: Page number (default: 1)per_page: Items per page (default: 10)
status: Task status filterworker_id: Filter by assigned workertype: Anomaly type filter for heatmaps
sort: Sort field (created_at,updated_at,scheduled_time)order: Sort order (ascordesc)
- Authentication: JWT tokens required for protected endpoints
- Token Expiry: 24 hours from issuance
- Rate Limiting: Currently no rate limiting implemented
- CORS: Enabled for cross-origin requests
For complete API documentation and examples, see API_REFERENCE.md
| Model | Precision | Recall | F1-Score | Inference Time |
|---|---|---|---|---|
| YOLOv8 | 0.47 | 0.78 | 0.49 | 0.055s |
| GroundingDINO | 0.10 | 0.83 | 0.30 | 0.551s |
| OWL-ViT | 0.12 | 0.75 | 0.42 | 0.157s |
- Throughput: Up to 18.06 images/second (YOLOv8)
- Latency: 0.055s end-to-end processing (YOLOv8)
- Accuracy: 85%+ for trained anomaly categories
- Scalability: Supports concurrent processing of multiple detection streams
-
Launch EC2 Instances
- Frontend: t3.large (React application)
- Backend: g5.xlarge (Flask + AI models)
- Database: t3.micro (MySQL)
-
Configure Security Groups
- Allow HTTP/HTTPS traffic
- Configure database access rules
- Set up VPC for internal communication
-
Deploy Application
# Backend deployment git clone <repository> pip install -r requirements.txt gunicorn -w 4 -b 0.0.0.0:5000 app:app # Frontend deployment npm run build serve -s build -l 3000
# Build and run with Docker Compose
docker-compose up -dWe welcome contributions to improve StreetGuard! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
- Follow PEP 8 for Python code
- Use ESLint for JavaScript/React code
- Write comprehensive tests for new features
- Update documentation for API changes
This project is licensed under the MIT License - see the LICENSE file for details.
CMPE 295 - Autonomous Detection Project Team
- Haoming Chen - AI/ML Pipeline Development
- Jiajun Dai - Backend Architecture & API Development
- Rachel Fan - Frontend Development & UI/UX
- Vinh Tran - System Integration & Deployment
Project Advisor: Professor Kaikai Liu, San Jose State University
- Documentation: Wiki
- Issues: GitHub Issues
- Email: project-email@example.com
Special thanks to:
- San Jose State University for academic support
- Google Cloud Platform for Street View API access
- Hugging Face for pre-trained AI models
- Open Source Community for foundational technologies
StreetGuard - Transforming urban maintenance through intelligent automation and real-time monitoring.
Built with β€οΈ by the CMPE 295 team at San Jose State University





