VasiaMata/AirQualityMonitoring

GDP Growth and Air Pollution (1700–2016)

Team Members: Sofia Zanti, Vasilisa Matafonova, Tim Marchenko

Introduction

This project explores the relationship between GDP growth and air pollution levels, specifically focusing on Suspended Particulate Matter (SPM) over the period from 1700 to 2016.

We aimed to understand how economic development has historically influenced air pollution and whether this relationship can be captured through various machine learning techniques.


Approach

  • Collected and cleaned historical data (1700–2016)
  • Used scatter plots and cubic regressions for visualization
  • Calculated correlation coefficients
  • Tested multiple machine learning models and identified the best-performing one
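
As a minimal sketch of the correlation and cubic-regression steps above, the snippet below fits a cubic polynomial with NumPy and computes the Pearson correlation coefficient. The data is synthetic and stands in for the project's dataset; the variable names (`gdp`, `spm`) and the shape of the curve are assumptions made for illustration only.

```python
import numpy as np

# Synthetic stand-in data (NOT the project's dataset): an inverted-U
# relationship between GDP and SPM with a little Gaussian noise.
rng = np.random.default_rng(0)
gdp = np.linspace(1.0, 10.0, 50)
spm = -2.0 * (gdp - 5.0) ** 2 + 60.0 + rng.normal(0.0, 1.0, gdp.size)

# Pearson correlation coefficient: measures only the linear association.
pearson_r = np.corrcoef(gdp, spm)[0, 1]

# Cubic regression: least-squares fit of a degree-3 polynomial.
coeffs = np.polyfit(gdp, spm, deg=3)
spm_fit = np.polyval(coeffs, gdp)

# Coefficient of determination (R^2) of the cubic fit.
ss_res = np.sum((spm - spm_fit) ** 2)
ss_tot = np.sum((spm - spm.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"Pearson r: {pearson_r:.3f}")
print(f"Cubic fit R^2: {r_squared:.3f}")
```

On a curved relationship like this one, a modest Pearson coefficient can coexist with a close cubic fit, which is one reason to look at both. The fitted curve `spm_fit` can be drawn over a scatter plot of `gdp` against `spm` with Matplotlib.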

Expected Outcome

By the end of this project, we aim to achieve the following:

  • Data Visualization: Generate graphs and data frames illustrating historical air quality changes.
  • Analysis Report: Document methodologies, results, and insights.

Strategy

  1. Research: Understand the problem and gather background information.
  2. Data Collection: Identify and acquire relevant datasets, revising the project idea only if absolutely necessary.
  3. Data Processing: Clean, structure, and preprocess data (handling missing values, formatting).
  4. Data Visualization: Use matplotlib and/or seaborn to create linear regression graphs and other visualizations.
  5. Report Writing: Document findings, methodologies, and analysis.
  6. Prediction Modeling: Use historical data to forecast future air quality trends.
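
The data processing step (step 3) might look roughly like the following pandas sketch. The column names (`Year`, `GDP`, `SPM`) and all values are hypothetical, and linear interpolation is just one possible way to handle missing values, not necessarily the one used in this project.

```python
import numpy as np
import pandas as pd

# Hypothetical raw table; column names and values are made up for illustration.
raw = pd.DataFrame(
    {
        "Year": ["1700", "1750", "1800", "1850", "1900"],
        "GDP": [1.0, np.nan, 3.0, 4.0, 5.0],
        "SPM": [10.0, 12.0, np.nan, 20.0, 18.0],
    }
)

clean = raw.copy()

# Formatting: years arrive as strings, so convert them to integers
# and use them as a sorted index.
clean["Year"] = clean["Year"].astype(int)
clean = clean.sort_values("Year").set_index("Year")

# Missing values: fill interior gaps by linear interpolation between
# neighbouring rows, then drop any rows that are still incomplete.
clean = clean.interpolate(method="linear").dropna()

print(clean)
```

Interpolation suits slowly changing historical series, but it invents values between observations, so it is worth recording which rows were filled.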

Tools and Techniques

  • Matplotlib + Seaborn: Visualization of data
  • Pandas: Data manipulation and structuring
  • NumPy: Data calculations and predictions
  • Testing Metrics: Documenting how well the models perform
  • Google Docs: Report writing and documentation
  • GitHub: Version control and project tracking

Key Findings

  • Pollution follows the Environmental Kuznets Curve, meaning it rises with GDP up to a point and then declines
  • Linear models are inadequate for complex, non-linear relationships
  • Ensemble models, particularly Random Forest, perform best
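
To illustrate why a linear model struggles with an inverted-U relationship while a tree ensemble can follow it, here is a minimal scikit-learn sketch. It uses synthetic data rather than the project's dataset, and the model settings are assumptions; it shows how such a comparison could be set up, not the exact experiment behind the findings above.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic inverted-U data (an assumption, loosely mimicking a
# Kuznets-style curve): pollution rises with GDP, peaks, then falls.
rng = np.random.default_rng(42)
gdp = rng.uniform(0.0, 10.0, 300)
spm = -(gdp - 5.0) ** 2 + 25.0 + rng.normal(0.0, 1.0, gdp.size)
X = gdp.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix

X_train, X_test, y_train, y_test = train_test_split(
    X, spm, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each model and score it on the held-out data with R^2.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

A straight line cannot rise and then fall, so its R^2 on this kind of data stays low, whereas the forest approximates the curve piecewise.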

Main Factors and Solutions

  • Early Pollution: Coal use. Solution: Switch to solar/wind
  • Industrial Era: More coal with GDP growth. Solution: Promote clean tech
  • Modern Era: CO2 from cars. Solution: Encourage public transport, bike-friendly cities
  • Regulatory Gaps. Solution: Use AI for monitoring & enforcement

Possibilities for Future Improvements

  • Add more features
  • Apply time-series oriented models
  • Use hyperparameter tuning
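
One way the last two ideas could be combined is a grid search whose cross-validation folds respect time order. The sketch below uses scikit-learn's `GridSearchCV` with `TimeSeriesSplit` on synthetic yearly data; the parameter grid, the data, and the choice of Random Forest are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic yearly series (an assumption): one feature, ordered in time.
rng = np.random.default_rng(1)
years = np.arange(1900, 2000)
gdp = np.linspace(0.0, 10.0, years.size)
spm = -(gdp - 5.0) ** 2 + 25.0 + rng.normal(0.0, 1.0, years.size)
X = gdp.reshape(-1, 1)

# TimeSeriesSplit always trains on earlier rows and validates on later
# ones, so no fold sees the future during training.
cv = TimeSeriesSplit(n_splits=3)

# Small illustrative grid; a real search would likely be wider.
param_grid = {"n_estimators": [20, 50], "max_depth": [2, 4]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid=param_grid,
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(X, spm)

print("Best parameters:", search.best_params_)
```

Tree-based models cannot extrapolate beyond the range seen in training, so forward-in-time validation scores may look worse than a random split would suggest; that gap is itself useful information for forecasting.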

About

This is the repository for our Python course project.
