NYC Scout: Taxi Fare Predictor and AI Assistant

End-to-end ML system for NYC taxi fare prediction with multi-modal capabilities

A professional ML portfolio project featuring:

🎯 XGBoost fare prediction with hyperparameter tuning
🚀 Distributed training on Google Vertex AI
🤖 RAG-powered NYC attractions chatbot
🎙️ Multi-modal API (voice, text, chat)
📊 Comprehensive data analysis with Plotly

🚀 Quick Start

Prerequisites

Python 3.12+
Google Cloud Platform account
gcloud CLI installed and authenticated

Installation

# Clone repository and enter the project directory
git clone https://github.com/misran3/nyc-scout.git
cd nyc-scout

# Create virtual environment
python -m venv venv
source venv/bin/activate

# Install package
pip install -e .

# Copy the environment template, then edit .env with your project details
cp .env.example .env
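
The `.env` file created above holds project-specific settings. The repository's own loader (presumably in `src/config.py`) is not shown here, so the following is only a minimal sketch of how such a file could be parsed with the standard library. The function name `parse_env` and the keys `GCP_PROJECT_ID` and `GCP_REGION` are hypothetical; only `VERTEX_AI_ENDPOINT_ID` appears in this README.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Ignore blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical .env contents, for illustration only.
sample = """
# Project settings
GCP_PROJECT_ID=YOUR_PROJECT_ID
GCP_REGION=us-east1
VERTEX_AI_ENDPOINT_ID=
"""

config = parse_env(sample)
print(config)
```

An empty value such as `VERTEX_AI_ENDPOINT_ID=` is kept as an empty string, so later code can detect that the endpoint has not been deployed yet.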

Set Up GCP Infrastructure

# Authenticate with GCP
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# Run setup script to enable required APIs and create buckets
python scripts/setup_gcp_infrastructure.py \
  --project-id YOUR_PROJECT_ID \
  --region us-east1

Train a Model

# Local training (for testing)
python scripts/train.py \
  --max_depth 6 \
  --learning_rate 0.1 \
  --subsample 1.0 \
  --n_estimators 100

# Build the training package and upload it to GCS
python setup.py sdist
gsutil cp dist/*.tar.gz gs://YOUR_BUCKET/nyc-fare-predictor/dist/

# Launch Vertex AI hyperparameter tuning
gcloud ai hp-tuning-jobs create \
  --region=us-east1 \
  --display-name=nyc-fare-tuning-$(date +%m%d_%H%M) \
  --config=config/vertex_ai_training.yaml
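
The local training command passes four hyperparameters as flags, and Vertex AI hyperparameter tuning also hands trial values to the training code as command-line arguments. How `scripts/train.py` actually parses them is not shown in this README; the sketch below assumes `argparse` and uses the values from the example command as defaults. The helper names `build_parser` and `to_params` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the four flags used in the local training command."""
    parser = argparse.ArgumentParser(description="Train a fare model (sketch)")
    # Defaults mirror the example command; they are assumptions, not project defaults.
    parser.add_argument("--max_depth", type=int, default=6)
    parser.add_argument("--learning_rate", type=float, default=0.1)
    parser.add_argument("--subsample", type=float, default=1.0)
    parser.add_argument("--n_estimators", type=int, default=100)
    return parser


def to_params(args: argparse.Namespace) -> dict:
    """Validate the parsed flags and collect them into a parameter dict."""
    if args.max_depth < 1 or args.n_estimators < 1:
        raise ValueError("max_depth and n_estimators must be positive")
    if not 0.0 < args.subsample <= 1.0:
        raise ValueError("subsample must be in (0, 1]")
    return {
        "max_depth": args.max_depth,
        "learning_rate": args.learning_rate,
        "subsample": args.subsample,
        "n_estimators": args.n_estimators,
    }


# Parse the same flags as the local training example.
args = build_parser().parse_args(
    ["--max_depth", "6", "--learning_rate", "0.1",
     "--subsample", "1.0", "--n_estimators", "100"]
)
params = to_params(args)
print(params)
```

These four names match constructor parameters of `xgboost.XGBRegressor`, so a dict like `params` could be unpacked into it directly, assuming the project uses the scikit-learn style interface.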

Deploy Model

# Deploy trained model to Vertex AI endpoint
python scripts/deploy_model.py \
  --model-path gs://YOUR_BUCKET/path/to/model.bst \
  --model-name nyc-fare-xgboost \
  --endpoint-name nyc-fare-endpoint

# Update .env with endpoint ID
# VERTEX_AI_ENDPOINT_ID=<your-endpoint-id>
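
Once the endpoint ID is stored in `.env`, client code needs two things to call the deployed model: the endpoint's full resource name and a request body. The sketch below builds both with the standard library only. The feature order (`distance_km`, `passenger_count`) and the endpoint ID `1234567890` are placeholders, since the model's real input schema is not documented here.

```python
import json


def endpoint_resource_name(project_id: str, region: str, endpoint_id: str) -> str:
    """Build the fully qualified Vertex AI endpoint resource name."""
    return f"projects/{project_id}/locations/{region}/endpoints/{endpoint_id}"


def build_predict_body(rows: list[list[float]]) -> str:
    """Wrap feature rows in the {"instances": [...]} shape used by predict requests."""
    return json.dumps({"instances": [[float(v) for v in row] for row in rows]})


# Placeholder settings; real code would read these from .env or os.environ.
settings = {
    "project_id": "YOUR_PROJECT_ID",
    "region": "us-east1",
    "endpoint_id": "1234567890",
}

name = endpoint_resource_name(**settings)
# Hypothetical feature order: [distance_km, passenger_count].
body = build_predict_body([[8.5, 2]])
print(name)
print(body)
```

With the `google-cloud-aiplatform` package installed, the same instances could instead be sent through `aiplatform.Endpoint(name).predict(instances=...)`; whether this project does so is an assumption.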

Set Up RAG Knowledge Base

# Set up RAG corpus and upload knowledge base
python scripts/rag_pipeline.py --setup

# Test RAG system
python scripts/rag_pipeline.py --query "Tell me about museums in NYC"
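
For readers new to RAG, the query step boils down to two moves: retrieve the most relevant knowledge-base passages, then place them in the prompt sent to the language model. The toy sketch below illustrates that flow with a bag-of-words cosine similarity. This is a stand-in technique chosen to keep the example dependency-free, not the retrieval method used by `scripts/rag_pipeline.py`. The file names and passages are hypothetical placeholders, not contents of `knowledge_base/`.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    scores = {name: cosine(q, Counter(tokenize(text))) for name, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def build_prompt(query: str, docs: dict[str, str], k: int = 1) -> str:
    """Prepend the retrieved passages to the user's question."""
    context = "\n".join(docs[name] for name in retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical placeholder passages.
docs = {
    "museums.md": "Museums: the Metropolitan Museum of Art is in Manhattan.",
    "parks.md": "Parks: Central Park is in Manhattan.",
    "bridges.md": "Bridges: the Brooklyn Bridge connects Manhattan and Brooklyn.",
}

query = "Tell me about museums in NYC"
top = retrieve(query, docs)
prompt = build_prompt(query, docs)
print(top)
print(prompt)
```

A production pipeline would typically replace the word-count vectors with learned embeddings and a managed vector index, but the retrieve-then-prompt shape stays the same.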

📊 Architecture

Project Structure

nyc-scout/
├── src/                   # Core library code
│   ├── config.py          # Configuration management
│   ├── features/          # Feature engineering
│   ├── data/              # Data loading
│   ├── gcp/               # GCP utilities
│   └── rag/               # RAG pipeline
├── scripts/               # Executable scripts
│   ├── train.py           # Model training
│   ├── deploy_model.py    # Model deployment
│   ├── rag_pipeline.py    # RAG setup
│   └── setup_gcp_infrastructure.py
├── api/                   # Flask API (to be completed)
├── notebooks/             # Data analysis (to be completed)
├── knowledge_base/        # NYC attractions (17 files)
├── config/                # Configuration files
├── infrastructure/        # Deployment scripts (to be completed)
└── data/                  # Training data (Source: https://www.kaggle.com/competitions/new-york-city-taxi-fare-prediction)
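
As an illustration of the kind of logic that could live in `src/features/`, the sketch below computes the great-circle (haversine) distance between pickup and dropoff points, a common input for fare models. Whether this project uses that exact feature is an assumption. The column names follow the Kaggle dataset linked above; the coordinates are approximate values for Times Square and JFK Airport, used only as sample input.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def add_distance(ride: dict) -> dict:
    """Return a copy of the ride record with a distance_km feature added."""
    distance = haversine_km(
        ride["pickup_latitude"], ride["pickup_longitude"],
        ride["dropoff_latitude"], ride["dropoff_longitude"],
    )
    return {**ride, "distance_km": distance}


# Sample record with approximate coordinates (Times Square to JFK Airport).
ride = {
    "pickup_latitude": 40.758, "pickup_longitude": -73.9855,
    "dropoff_latitude": 40.6413, "dropoff_longitude": -73.7781,
    "passenger_count": 2,
}

features = add_distance(ride)
print(round(features["distance_km"], 1), "km")
```

Straight-line distance underestimates the driven route, so it works best alongside other features such as pickup hour or passenger count.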
