BennettSchwartz/APW-DTE

APW-DTE: Adaptive Public Works Dispatch & Triage Engine

Presidential AI Challenge Submission

A machine learning-powered optimization system for intelligent routing and prioritization of public works service requests. This system combines predictive modeling with constraint-based optimization to improve response times, resource utilization, and service equity.

Overview

APW-DTE addresses the challenge of efficiently dispatching municipal maintenance crews to service requests (potholes, streetlights, traffic signals, etc.) under real-world constraints including:

  • Skill matching requirements
  • Shift time windows
  • Geographic distribution
  • Priority-based urgency
  • Travel time optimization
  • Workload fairness across neighborhoods

The system uses a two-stage approach:

  1. ML Prediction: LightGBM models predict escalation risk and job duration
  2. Optimization: Google OR-Tools Vehicle Routing Problem (VRP) solver generates crew assignments and routes within a fixed search time limit

Key Features

  • Priority Scoring: Predict which requests are likely to escalate using gradient boosting classification
  • Duration Estimation: Predict job completion times with quantile regression for uncertainty bounds
  • Multi-Objective Optimization: Balance completion time, travel distance, and fairness
  • Skill Constraints: Ensure crews are assigned only to jobs they can handle
  • Rolling-Horizon Planning: Re-optimize every 30 minutes to handle dynamic arrivals
  • Statistical Validation: Compare against baseline strategies with t-tests
  • Interactive Dashboard: Streamlit UI for visualization and comparison
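
The Duration Estimation feature relies on quantile regression. As a minimal sketch of the idea (not the project's own code), the pinball loss below is the objective a quantile model minimizes; LightGBM exposes it through objective="quantile" together with an alpha parameter. The function name and the sample numbers are illustrative assumptions.

```python
def pinball_loss(y_true: list[float], y_pred: list[float], quantile: float) -> float:
    """Mean pinball (quantile) loss of predictions made for one quantile."""
    if not 0.0 < quantile < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = actual - predicted
        if error >= 0:
            # Under-prediction costs `quantile` per minute of error.
            total += quantile * error
        else:
            # Over-prediction costs `1 - quantile` per minute of error.
            total += (quantile - 1.0) * error
    return total / len(y_true)


# Hypothetical job durations (minutes) and a model's 90th-percentile estimates.
actual_minutes = [30.0, 45.0, 60.0]
p90_prediction = [40.0, 40.0, 70.0]
loss = pinball_loss(actual_minutes, p90_prediction, quantile=0.9)
print(f"pinball loss at q=0.9: {loss:.3f}")
```

At q=0.9 an under-estimate is penalized nine times as heavily as an over-estimate of the same size, which is what pushes the fitted curve toward an upper bound on job duration.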

Architecture

APW-DTE/
├── data/
│   ├── raw/              # Generated synthetic data
│   ├── models/           # Trained ML models
│   └── simulation_results/
├── src/apw_dte/
│   ├── data/
│   │   ├── schemas.py    # Pydantic data models
│   │   └── generators/   # Synthetic data generation
│   ├── models/
│   │   ├── features.py   # Feature engineering
│   │   ├── priority_scorer.py
│   │   └── duration_estimator.py
│   ├── optimization/
│   │   └── dispatcher.py # OR-Tools VRP solver
│   ├── simulation/
│   │   ├── engine.py     # Main simulation loop
│   │   ├── environment.py # State tracking
│   │   ├── baseline.py   # Comparison strategies
│   │   └── metrics.py    # Performance calculation
│   └── ui/
│       └── app.py        # Streamlit dashboard
├── scripts/
│   ├── generate_data.py  # Create synthetic dataset
│   ├── train_models.py   # Train ML models
│   └── run_simulation.py # Execute comparison
└── tests/

Installation

Requirements

  • Python 3.10+
  • Virtual environment (recommended)

Setup

# Clone repository
git clone <repository-url>
cd APW-DTE

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .

Dependencies

Core libraries:

  • pandas, numpy, scipy: Data manipulation and numerical computing
  • scikit-learn: ML utilities and metrics
  • lightgbm: Gradient boosting models
  • networkx: Road network graphs
  • ortools: Constraint programming and optimization
  • pydantic: Type-safe data validation

UI and visualization:

  • streamlit: Interactive dashboard
  • plotly: Charts and visualizations

Quick Start

1. Generate Synthetic Data

python scripts/generate_data.py

This creates:

  • 14,532 service requests over 60 days
  • 10 crews with varied skills and shifts
  • 10 km x 10 km road network (441 nodes, 844 edges)

Output location: data/raw/
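
The reported node count is consistent with a regular 21 x 21 grid at 500 m spacing. That spacing is an inference from the numbers, not something documented by the generator. The short calculation below shows the arithmetic; grid_counts is a hypothetical helper.

```python
def grid_counts(side_km: float, spacing_km: float) -> tuple[int, int]:
    """Node and edge counts of a square, 4-connected grid road network."""
    per_side = round(side_km / spacing_km) + 1  # intersections along one side
    nodes = per_side * per_side
    # Each row has (per_side - 1) horizontal edges, and likewise each column.
    edges = 2 * per_side * (per_side - 1)
    return nodes, edges


nodes, edges = grid_counts(side_km=10.0, spacing_km=0.5)
print(nodes, edges)
```

A plain grid of this size has 840 edges, so the 844 edges in the generated network presumably include a few additional links beyond the regular grid.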

2. Train ML Models

python scripts/train_models.py

Trains two models:

  • Priority Scorer: Binary classifier (AUC > 0.75 target)
  • Duration Estimator: Regression model (R² > 0.70 target)

Output location: data/models/

3. Run Simulation

python scripts/run_simulation.py

Executes a 7-day comparative simulation:

  • Baseline: FIFO dispatcher
  • AI: OR-Tools VRP optimizer

Outputs a statistical comparison and saves results to data/simulation_results/

4. Launch Dashboard

streamlit run src/apw_dte/ui/app.py

Or use the convenience script:

./scripts/run_ui.sh

Access at http://localhost:8501

Usage

Custom Simulation

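from datetime import datetime

from apw_dte.simulation.engine import SimulationEngine
from apw_dte.optimization.dispatcher import ORToolsDispatcher

# Assumes `network`, `crews`, `models`, and `requests` have already been
# loaded (for example from data/raw/ and data/models/ after the Quick Start).

# Initialize dispatcher
dispatcher = ORToolsDispatcher(
    road_network=network,
    priority_weight=10.0,
    search_time_limit=30  # seconds per optimization cycle
)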

# Create simulation
engine = SimulationEngine(
    dispatcher=dispatcher,
    crews=crews,
    start_time=datetime(2024, 1, 1, 6, 0),
    end_time=datetime(2024, 1, 8, 6, 0),
    road_network=network,
    ml_models=models
)

# Run
engine.load_requests(requests)
metrics = engine.run()

Baseline Dispatchers

Three baseline strategies are provided for comparison:

from apw_dte.simulation.baseline import (
    FIFODispatcher,
    PriorityDispatcher,
    NearestDispatcher
)

# First-In-First-Out
fifo = FIFODispatcher()

# Priority-based (safety-critical + escalation rate + age)
priority = PriorityDispatcher()

# Greedy nearest-neighbor
nearest = NearestDispatcher()
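
To illustrate how a priority-based baseline might rank its queue, here is a minimal sketch that combines the three signals named above (safety-critical flag, escalation rate, age). The PendingRequest class, the safety_critical field, and all weights are assumptions made for this example; they are not taken from PriorityDispatcher.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PendingRequest:
    request_id: str
    safety_critical: bool              # hypothetical flag
    historical_escalation_rate: float  # between 0.0 and 1.0
    time_reported: datetime


def priority_key(req: PendingRequest, now: datetime,
                 safety_bonus: float = 100.0,
                 escalation_weight: float = 50.0,
                 age_weight_per_hour: float = 1.0) -> float:
    """Higher value means the request should be dispatched sooner."""
    age_hours = (now - req.time_reported).total_seconds() / 3600.0
    score = escalation_weight * req.historical_escalation_rate
    score += age_weight_per_hour * age_hours
    if req.safety_critical:
        score += safety_bonus
    return score


now = datetime(2024, 1, 1, 12, 0)
queue = [
    PendingRequest("R1", False, 0.10, now - timedelta(hours=5)),
    PendingRequest("R2", True, 0.05, now - timedelta(hours=1)),
    PendingRequest("R3", False, 0.60, now - timedelta(hours=2)),
]
ordered = sorted(queue, key=lambda r: priority_key(r, now), reverse=True)
order_ids = [r.request_id for r in ordered]
print(order_ids)
```

With these weights the safety-critical request is served first even though it is the newest, and the age term keeps old low-risk requests from waiting indefinitely.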

Performance Metrics

The system tracks comprehensive performance indicators:

Time Metrics

  • Average wait time (minutes)
  • Median wait time
  • 95th percentile wait time
  • Average completion time

Efficiency Metrics

  • Total travel time
  • Average travel per request
  • Crew utilization rate
  • Completion rate

Fairness Metrics

  • Neighborhood disparity (std dev of wait times by area)
  • Priority adherence (% high-priority requests completed within 2 hours)
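
The time and fairness metrics above can be computed from per-request records. The sketch below uses only the standard library and makes two assumptions: neighborhood disparity is taken as the population standard deviation of per-neighborhood mean wait times, and the 95th percentile uses the nearest-rank method. CompletedRequest and summarize are hypothetical names, not the API of metrics.py.

```python
import math
import statistics
from collections import defaultdict
from typing import NamedTuple


class CompletedRequest(NamedTuple):
    neighborhood: str
    wait_minutes: float        # report -> crew arrival
    completion_minutes: float  # report -> job finished
    high_priority: bool


def nearest_rank_percentile(values: list[float], pct: float) -> float:
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(records: list[CompletedRequest]) -> dict:
    waits = [r.wait_minutes for r in records]
    by_area = defaultdict(list)
    for r in records:
        by_area[r.neighborhood].append(r.wait_minutes)
    area_means = [statistics.mean(v) for v in by_area.values()]
    urgent = [r for r in records if r.high_priority]
    on_time = sum(1 for r in urgent if r.completion_minutes <= 120)
    return {
        "avg_wait": statistics.mean(waits),
        "median_wait": statistics.median(waits),
        "p95_wait": nearest_rank_percentile(waits, 95),
        "neighborhood_disparity": statistics.pstdev(area_means),
        "priority_adherence": on_time / len(urgent) if urgent else None,
    }


records = [
    CompletedRequest("North", 30.0, 70.0, True),
    CompletedRequest("North", 50.0, 100.0, False),
    CompletedRequest("South", 90.0, 150.0, True),
    CompletedRequest("South", 110.0, 160.0, False),
]
summary = summarize(records)
print(summary)
```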

Statistical Tests

  • Independent two-sample t-test comparing wait times between dispatchers
  • p-value < 0.05 threshold for significance
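
As a rough illustration of the comparison, the sketch below computes a two-sample t statistic by hand. It uses Welch's variant, which does not assume equal variances; whether the project uses Welch's or the pooled-variance test is not stated here. The wait times are made-up numbers. For the p-value, scipy.stats.ttest_ind (scipy is already a dependency) is the usual route.

```python
import math
import statistics


def welch_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, float]:
    """Return (t statistic, approximate degrees of freedom) for two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical per-request wait times in minutes.
fifo_waits = [80.0, 90.0, 85.0, 95.0, 75.0]
optimized_waits = [70.0, 75.0, 65.0, 80.0, 70.0]
t_stat, dof = welch_t(fifo_waits, optimized_waits)
print(f"t = {t_stat:.3f}, dof = {dof:.2f}")
```

A positive t statistic here means the first sample (FIFO) has the longer average wait.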

Model Performance

Current model performance on synthetic data:

Model               Metric         Target   Achieved   Status
Priority Scorer     AUC            > 0.75   0.777      Pass
Priority Scorer     Precision@10%  -        0.52       -
Duration Estimator  R²             > 0.70   0.874      Pass
Duration Estimator  RMSE           -        9.6 min    -
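
Precision@10% is read here as the share of truly escalating requests among the 10% of requests with the highest predicted scores; that definition is an assumption about how the metric is computed. A minimal sketch with made-up scores and labels:

```python
import math


def precision_at_fraction(scores: list[float], labels: list[bool],
                          fraction: float = 0.10) -> float:
    """Precision among the top `fraction` of items ranked by score."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    k = max(1, math.ceil(fraction * len(scores)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, escalated in ranked[:k] if escalated)
    return hits / k


# 20 hypothetical requests with scores 0.00, 0.05, ..., 0.95.
scores = [i / 20 for i in range(20)]
escalated_indices = {19, 17, 12, 5}
labels = [i in escalated_indices for i in range(20)]
p_at_10 = precision_at_fraction(scores, labels)
print(p_at_10)
```

The top 10% of 20 requests is two requests (indices 19 and 18), of which one escalated.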

OR-Tools VRP Configuration

The optimization solver includes:

Constraints

  • Skill matching (electrical, asphalt, signage, general maintenance)
  • Shift time windows (8-hour capacity)
  • Service time at each location
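
Skill matching can be expressed as a list of allowed crews per request. The sketch below derives those lists in plain Python; the category-to-skill mapping and crew data are invented for illustration. In OR-Tools, lists like these are typically handed to the routing model as per-node vehicle restrictions (for example via SetAllowedVehiclesForIndex), though whether dispatcher.py does exactly this is an assumption.

```python
# Hypothetical crews, indexed the same way as vehicles in the routing model.
CREW_SKILLS: list[set[str]] = [
    {"electrical", "signage"},
    {"asphalt", "general maintenance"},
    {"electrical", "asphalt"},
]

# Hypothetical mapping from request category to the skill it requires.
REQUIRED_SKILL: dict[str, str] = {
    "streetlight": "electrical",
    "traffic_signal": "electrical",
    "pothole": "asphalt",
}


def allowed_vehicles(category: str) -> list[int]:
    """Indices of crews whose skill set covers the request category."""
    needed = REQUIRED_SKILL[category]
    return [i for i, skills in enumerate(CREW_SKILLS) if needed in skills]


pothole_crews = allowed_vehicles("pothole")
streetlight_crews = allowed_vehicles("streetlight")
print(pothole_crews, streetlight_crews)
```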

Objective Function

  • Minimize total travel time
  • Weighted priority penalties for urgent requests
  • Disjunction penalties for unserved requests
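
To make the three objective terms concrete, the sketch below scores a candidate plan as total travel time plus a penalty for every unserved request, with the penalty growing with the request's priority score. The penalty formula, base_penalty, and the toy data are assumptions for illustration; the solver's actual cost model may differ.

```python
def plan_cost(routes: list[list[str]],
              travel_minutes: dict[tuple[str, str], float],
              priority: dict[str, float],
              priority_weight: float = 10.0,
              base_penalty: float = 1000.0) -> float:
    """Travel time of all routes plus priority-weighted penalties for drops."""
    travel = 0.0
    served: set[str] = set()
    for route in routes:
        # route[0] is the crew's start location; the rest are requests.
        for origin, destination in zip(route, route[1:]):
            travel += travel_minutes[(origin, destination)]
        served.update(route[1:])
    dropped = set(priority) - served
    # Dropping an urgent request costs more than dropping a routine one.
    penalty = sum(base_penalty * (1.0 + priority_weight * priority[r]) for r in dropped)
    return travel + penalty


travel_minutes = {("depot", "A"): 10.0, ("A", "B"): 5.0}
priority = {"A": 0.2, "B": 0.9}

cost_serve_both = plan_cost([["depot", "A", "B"]], travel_minutes, priority)
cost_drop_b = plan_cost([["depot", "A"]], travel_minutes, priority)
print(cost_serve_both, cost_drop_b)
```

Because the drop penalty dwarfs any realistic travel time, the solver only leaves a request unserved when constraints make it infeasible, and it drops low-priority requests first.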

Search Strategy

  • First solution: PATH_CHEAPEST_ARC
  • Metaheuristic: Guided Local Search
  • Time limit: 30 seconds per optimization cycle

Testing

Run tests with:

python tests/test_simulation.py

Tests cover:

  • Data loading (crews, requests)
  • Dispatcher functionality (FIFO, OR-Tools)
  • Skill matching constraints
  • End-to-end simulation

Data Schema

ServiceRequest

request_id: str
category: RequestCategory  # pothole, streetlight, etc.
latitude: float
longitude: float
time_reported: datetime
neighborhood: str
historical_escalation_rate: float
weather_conditions: WeatherCondition
priority_score: Optional[float]
estimated_duration: Optional[float]
escalation_event: bool

Crew

crew_id: str
skills: Set[Skill]
equipment: Set[Equipment]
shift_start: time
shift_end: time
current_location: Tuple[float, float]

Configuration

Key parameters can be adjusted:

Data Generation (scripts/generate_data.py)

  • num_days: Simulation period length
  • requests_per_day: Request arrival rate
  • num_crews: Crew count
  • city_size_km: Geographic area

ML Training (scripts/train_models.py)

  • n_estimators: Number of boosting rounds
  • max_depth: Tree depth
  • learning_rate: Step size

Optimization (src/apw_dte/optimization/dispatcher.py)

  • priority_weight: Weight for priority in objective
  • search_time_limit: Solver time limit (seconds)

Simulation (src/apw_dte/simulation/engine.py)

  • reoptimize_interval_minutes: How often to re-optimize
  • time_step_minutes: Simulation granularity
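
The two simulation parameters interact as follows: the clock advances by time_step_minutes, and every reoptimize_interval_minutes the current backlog is handed to the dispatcher. The loop below is a simplified sketch of that rolling-horizon pattern, not the code in engine.py; the plan callback, the 5-minute step, and the request tuples are assumptions.

```python
from datetime import datetime, timedelta
from typing import Callable


def rolling_horizon(requests: list[tuple[str, datetime]],
                    start: datetime, end: datetime,
                    plan: Callable[[list[str]], list[str]],
                    reoptimize_interval_minutes: int = 30,
                    time_step_minutes: int = 5) -> list[tuple[datetime, list[str]]]:
    """Advance the clock and re-plan the visible backlog at fixed intervals."""
    pending = sorted(requests, key=lambda r: r[1])
    backlog: list[str] = []
    batches: list[tuple[datetime, list[str]]] = []
    now, next_plan = start, start
    while now <= end:
        # Reveal requests that have been reported by the current time.
        while pending and pending[0][1] <= now:
            backlog.append(pending.pop(0)[0])
        if now >= next_plan:
            assigned = plan(backlog)
            batches.append((now, assigned))
            backlog = [r for r in backlog if r not in assigned]
            next_plan += timedelta(minutes=reoptimize_interval_minutes)
        now += timedelta(minutes=time_step_minutes)
    return batches


start = datetime(2024, 1, 1, 6, 0)
requests = [
    ("R1", start + timedelta(minutes=5)),
    ("R2", start + timedelta(minutes=20)),
    ("R3", start + timedelta(minutes=40)),
]
# Trivial stand-in planner that assigns everything currently in the backlog.
batches = rolling_horizon(requests, start, start + timedelta(hours=1),
                          plan=lambda backlog: list(backlog))
assigned_per_cycle = [assigned for _, assigned in batches]
print(assigned_per_cycle)
```

Requests that arrive between planning cycles wait until the next cycle, which is the trade-off behind the re-optimization interval: shorter intervals react faster but call the solver more often.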

Design Decisions

Why LightGBM?

  • Fast training and inference
  • Native categorical feature support
  • Good performance without extensive tuning
  • Compact model size

Why OR-Tools VRP?

  • Production-ready constraint solver
  • Built-in support for time windows and capacity
  • Efficient local search metaheuristics
  • Handles skill constraints via vehicle restrictions

Why Synthetic Data?

  • Reproducible experiments
  • Controlled feature relationships
  • No privacy concerns
  • Realistic temporal and spatial patterns

Time-Based Train/Test Split

  • Prevents data leakage
  • Reflects real-world deployment (past -> future)
  • 82.5% train, 17.5% test split
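
A time-based split sorts records chronologically and cuts once, so every training row precedes every test row. The sketch below splits by row count at 82.5%; whether the project cuts by row count or by calendar date is not stated, so treat that choice (and the helper name) as an assumption.

```python
from datetime import datetime, timedelta


def time_based_split(rows: list[tuple[int, datetime]],
                     train_fraction: float = 0.825):
    """Split rows into (train, test) so that train is strictly earlier."""
    ordered = sorted(rows, key=lambda row: row[1])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# 40 hypothetical rows, one per hour, deliberately given in reverse order.
origin = datetime(2024, 1, 1)
rows = [(i, origin + timedelta(hours=i)) for i in reversed(range(40))]
train, test = time_based_split(rows)
print(len(train), len(test))
```

Unlike a random split, no information from later requests can leak into training, which matches how the models would be used after deployment.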

Performance Benchmarks

Typical simulation results (7-day period, ~1,750 requests):

Dispatcher   Avg Wait Time   Completion Rate   Travel Efficiency
FIFO         ~85 min         95%               Baseline
Priority     ~80 min         96%               +5%
OR-Tools     ~72 min         97%               +15%

Statistical significance: p < 0.001 for wait time improvement

Limitations

  • Synthetic data may not capture all real-world complexities
  • Road network is simplified grid (not actual street maps)
  • No crew fatigue or break modeling
  • No emergency/safety-critical preemption
  • Fixed shift schedules (no overtime)
  • Perfect information assumption (no delays or cancellations)

Future Enhancements

  • Integration with real municipal data APIs
  • Real-time traffic condition updates
  • Multi-day lookahead planning
  • Crew preference learning
  • Weather forecast integration
  • Mobile app for crew dispatch notifications

License

MIT License

Citation

If you use this system in your research or projects, please cite:

APW-DTE: Adaptive Public Works Dispatch & Triage Engine
https://github.com/<username>/APW-DTE

Contact

For questions, issues, or contributions, please open an issue on GitHub.

Acknowledgments

Built with:

  • Google OR-Tools for optimization
  • Microsoft LightGBM for machine learning
  • Streamlit for interactive visualization
