Abdulkadirklc/Graph-Theory-Project

Istanbul Metro Passenger Forecasting using Graph Neural Networks

This repository contains the implementation of Spatio-Temporal Graph Neural Networks (STGCN and Graph WaveNet) to forecast hourly passenger traffic across the Istanbul Metro network. The project aims to capture complex spatial and temporal dependencies in urban transit data to improve operational planning.

📂 Dataset

The data is sourced from the Istanbul Metropolitan Municipality (IBB) Open Data Portal.

How to Setup Data

  1. Download the rail system CSV files for August and September 2024 from the IBB Open Data Portal.
  2. Place the raw CSV files in the root directory.

🚀 Project Workflow

The project is structured into four sequential notebooks. Please run them in the following order:

1. Data Preparation

This is the initial cleaning pipeline that transforms raw, inconsistent logs into a unified format.

  • Data Ingestion: Loads raw rail system data for August and September 2024.
  • Station Normalization (Critical Step): Implements a robust normalize_station_name() function to unify inconsistent naming conventions.
    • Example: Merges "YENIKAPI 1" and "YENIKAPI 2" into a single "YENIKAPI" node.
    • Fixes: Corrects prefixes and typos (e.g., "M9 HALKALI GUNEY" → "HALKALI CAD").
  • Filtering: Removes inter-city TCDD lines to focus strictly on the urban Metro, Tram, and Marmaray network.
  • Line-Station Mapping: Extracts the hierarchy of lines and their respective stations.
  • Output: Generates cleaned datasets (cleaned_august_rail.csv, cleaned_september_rail.csv) and a station list (august_line_stations.csv).
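The normalization step can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the override table `MANUAL_OVERRIDES` and the trailing-digit rule are assumptions chosen to reproduce the two examples given above.

```python
import re

# Hypothetical override table. The single entry is the example from this README;
# the real notebook may handle prefixes and typos differently.
MANUAL_OVERRIDES = {"M9 HALKALI GUNEY": "HALKALI CAD"}


def normalize_station_name(raw: str) -> str:
    """Return a canonical station name for a raw log entry (illustrative rules only)."""
    # Collapse repeated whitespace and compare in upper case.
    name = " ".join(raw.strip().upper().split())
    # Explicit corrections take priority over generic rules.
    if name in MANUAL_OVERRIDES:
        return MANUAL_OVERRIDES[name]
    # Assumed rule: a trailing number marks an entrance or platform of the same
    # station, so "YENIKAPI 1" and "YENIKAPI 2" collapse into "YENIKAPI".
    return re.sub(r"\s+\d+$", "", name)
```

Applying the function to the station column before aggregation means that all entrances of a station contribute to a single graph node.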

2. Graph Topology Construction

Notebook: ordering_lines.ipynb

To utilize Graph Neural Networks, we must define the physical structure of the network.

  • Line Ordering: Organizes stations based on their actual sequence on the metro lines (e.g., M1, M2, Marmaray).
  • Adjacency Matrix: Constructs the binary adjacency matrix representing physical rail connections between stations. This matrix is essential for the STGCN model.
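As a rough sketch of how ordered station lists turn into a binary adjacency matrix, the example below links each station to its neighbor on the same line. The function name `build_adjacency` and the toy line and station names are hypothetical; the notebook may structure this differently.

```python
import numpy as np


def build_adjacency(lines):
    """Build a symmetric 0/1 adjacency matrix from ordered station lists.

    `lines` maps a line name to its stations in travel order.
    """
    # Stable node ordering: every distinct station gets one index.
    stations = sorted({s for seq in lines.values() for s in seq})
    index = {s: i for i, s in enumerate(stations)}
    adj = np.zeros((len(stations), len(stations)), dtype=np.int8)
    for seq in lines.values():
        # Consecutive stations on a line share a physical rail connection.
        for a, b in zip(seq, seq[1:]):
            adj[index[a], index[b]] = 1
            adj[index[b], index[a]] = 1
    return stations, adj


# Toy network: two lines that cross at the transfer station "S2".
toy_lines = {
    "LINE_A": ["S1", "S2", "S3"],
    "LINE_B": ["S4", "S2", "S5"],
}
stations, adj = build_adjacency(toy_lines)
```

Because stations are keyed by their normalized name, a transfer station shared by several lines becomes a single node that joins those lines in the graph.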

3. Baseline Modeling: STGCN

Notebook: STGCN.ipynb

Implementation of Spatio-Temporal Graph Convolutional Networks.

  • Uses the pre-defined physical adjacency matrix.
  • Serves as a lightweight, efficient baseline model.
  • Key Insight: While efficient, STGCN tends to struggle with "shock" events and exhibits an under-prediction bias during extreme peak hours.
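Graph convolution layers usually consume a normalized form of the adjacency matrix rather than the raw binary one. The sketch below shows the symmetric normalization with self-loops used in first-order graph convolutions. Whether STGCN.ipynb uses this variant or a Chebyshev polynomial approximation is not stated here, so treat it as an assumption.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a binary adjacency matrix A."""
    a_tilde = adj.astype(float) + np.eye(adj.shape[0])  # add self-loops
    degree = a_tilde.sum(axis=1)                         # always >= 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Scale rows and columns by the inverse square root of the degree.
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


# Two connected stations: every entry of the normalized matrix is 0.5.
a_hat = normalize_adjacency(np.array([[0, 1], [1, 0]]))
```

The normalization keeps feature magnitudes comparable between stations with many neighbors (transfer hubs) and stations with few (line ends).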

4. Advanced Modeling: Graph WaveNet

Notebook: Graph_WaveNet.ipynb

Implementation of Graph WaveNet, which outperformed STGCN on this dataset.

  • Adaptive Adjacency: Unlike STGCN, this model learns a self-adaptive adjacency matrix, uncovering hidden spatial dependencies (e.g., correlations between unconnected business districts) without relying solely on physical rail lines.
  • Dilated Convolutions: Captures long-range temporal patterns more effectively.
  • Results: Achieves lower MAE and RMSE than STGCN and captures high-amplitude traffic peaks without the under-prediction bias seen in STGCN.
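The self-adaptive adjacency idea can be illustrated with plain NumPy. In Graph WaveNet the matrix is computed as a row-wise softmax over the ReLU of the product of two node-embedding matrices, and the embeddings are trained with the rest of the model. Here they are random arrays, and the sizes `num_nodes` and `emb_dim` are arbitrary placeholders, not values from the notebook.

```python
import numpy as np


def adaptive_adjacency(e1, e2):
    """Row-wise softmax(ReLU(E1 @ E2^T)) over two (num_nodes, emb_dim) embeddings."""
    scores = np.maximum(e1 @ e2.T, 0.0)                  # ReLU removes weak links
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)  # each row sums to 1


rng = np.random.default_rng(0)
num_nodes, emb_dim = 5, 3  # placeholder sizes
source_emb = rng.normal(size=(num_nodes, emb_dim))
target_emb = rng.normal(size=(num_nodes, emb_dim))
a_adp = adaptive_adjacency(source_emb, target_emb)
```

Since nothing ties the result to the rail topology, two stations with no direct connection can still receive a large weight if their learned embeddings align.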

📊 Results Summary

| Model | MAE | RMSE | Key Observation |
|---|---|---|---|
| STGCN | 100.23 | 271.18 | Efficient, but suffers from under-prediction bias at peaks. |
| Graph WaveNet | 86.73 | 199.79 | Best of the two models. Captures peak amplitudes and hidden spatial correlations. |
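For reference, the two metrics can be computed as below, together with the relative error reduction implied by the reported scores. The helper names are illustrative and are not taken from the notebooks.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true))))


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large misses more than MAE does."""
    return float(np.sqrt(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2)))


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the `baseline` error."""
    return (baseline - improved) / baseline


# Scores reported above: STGCN first, Graph WaveNet second.
mae_gain = relative_reduction(100.23, 86.73)     # about 0.13
rmse_gain = relative_reduction(271.18, 199.79)   # about 0.26
```

The RMSE reduction is roughly twice the MAE reduction. Because RMSE weights large errors more heavily, this is consistent with Graph WaveNet making fewer large misses at peak hours.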

🛠 Requirements

  • Python 3.8+
  • PyTorch
  • Pandas & NumPy
  • Matplotlib & Seaborn
  • NetworkX (for graph visualization)
