Earthquake Magnitude Prediction Model

🔹 Problem Statement

Predict the magnitude of earthquakes using historical earthquake event data.
Accurate magnitude prediction helps assess potential risks, improve disaster preparedness, and support seismological research.


Dataset Description

Source: Kaggle – Earthquake Dataset
Target Variable: magnitude
Features used:

  • longitude, latitude – location of the earthquake
  • year, month – time-based features
  • depth, sig, nst, dmin, rms, gap, tsunami – seismological attributes
  • magType, type – categorical indicators: the magnitude scale used and the event type
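A minimal sketch of how the features listed above could be split into a feature matrix and a target. The rows below are made-up placeholder values, not records from the Kaggle dataset, and one-hot encoding of magType and type is an assumption about preprocessing rather than the notebook's confirmed approach.

```python
import pandas as pd

# Placeholder rows standing in for the real CSV; every value here is invented.
df = pd.DataFrame({
    "longitude": [140.0, -70.0, 95.0, -122.0],
    "latitude": [38.0, -20.0, 3.0, 37.0],
    "year": [2011, 2014, 2004, 2020],
    "month": [3, 4, 12, 7],
    "depth": [30.0, 25.0, 10.0, 8.5],
    "sig": [900, 750, 600, 310],
    "nst": [120, 0, 80, 45],
    "dmin": [0.4, 0.6, 1.2, 0.05],
    "rms": [1.1, 0.8, 0.9, 0.12],
    "gap": [20.0, 23.0, 35.0, 40.0],
    "tsunami": [1, 1, 0, 0],
    "magType": ["mww", "mww", "mw", "ml"],
    "type": ["earthquake"] * 4,
    "magnitude": [7.5, 6.9, 6.2, 4.5],
})

# Assumption: the categorical columns are one-hot encoded before modelling.
categorical = ["magType", "type"]
X = pd.get_dummies(df.drop(columns=["magnitude"]), columns=categorical)
y = df["magnitude"]

print(list(X.columns))
```

With the real data, the DataFrame would come from `pd.read_csv` on the downloaded file instead of the inline dictionary.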

Exploratory Data Analysis (EDA)

Performed detailed analysis to understand relationships and data trends:

  • Distribution of Magnitudes: Most earthquakes have magnitudes between 3.0 and 6.0.
  • Depth vs Magnitude Correlation: Deeper earthquakes show slightly lower magnitudes on average.
  • Regional Patterns: Higher activity clusters observed near specific latitude-longitude ranges.
  • Temporal Trends: Magnitude patterns analyzed over time using year and month.

Key visualizations include:

  • Magnitude distribution histogram
  • Scatter plot of depth vs magnitude
  • Heatmap of feature correlations
  • Map of earthquake locations (longitude vs latitude)
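The numbers behind the histogram and the correlation heatmap can be computed before any plotting. The sketch below uses randomly generated data (not the project's dataset) purely to show the calls; the weak negative depth effect is built into the synthetic data as an assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data: magnitude decreases slightly with depth, plus noise.
depth = rng.uniform(0, 300, n)
magnitude = np.clip(4.5 - 0.002 * depth + rng.normal(0, 0.6, n), 1.0, 9.5)
df = pd.DataFrame({
    "depth": depth,
    "magnitude": magnitude,
    "latitude": rng.uniform(-90, 90, n),
    "longitude": rng.uniform(-180, 180, n),
})

# Counts per one-unit magnitude bin: the data behind a distribution histogram.
counts, edges = np.histogram(df["magnitude"], bins=np.arange(1.0, 10.5, 1.0))

# Pearson correlation matrix: the data behind a feature-correlation heatmap.
corr = df.corr()
depth_mag_r = corr.loc["depth", "magnitude"]

print(counts)
print(round(depth_mag_r, 3))
```

The `corr` matrix can be passed directly to `seaborn.heatmap` to draw the heatmap.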

Machine Learning Models

Model               RMSE    MAE     R² Score
Gradient Boosting   0.233   0.0299  0.9184
Random Forest       0.238   0.0277  0.9150
Ridge Regression    0.268   0.130   0.8920
Linear Regression   0.268   0.130   0.8920
Lasso Regression    0.272   0.131   0.8889
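For reference, the three metrics in the table can be computed by hand as follows. This is a plain-Python sketch of the standard definitions on four invented values, not the notebook's evaluation code.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R² for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}

# Invented example values, chosen so the arithmetic is easy to follow.
metrics = regression_metrics([4.0, 5.0, 6.0, 7.0], [4.5, 5.0, 5.5, 7.0])
print(metrics)
```

RMSE penalizes large errors more heavily than MAE, so MAE can never exceed RMSE on the same predictions.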

Best Model: Gradient Boosting Regressor (R² = 0.918)

It achieved the highest R² (0.918) and the lowest RMSE (0.233) of the five models. Random Forest was close behind and had a slightly lower MAE (0.0277 vs 0.0299).
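A comparison like this one can be produced with a short loop over scikit-learn estimators. The sketch below runs on synthetic data, and the split ratio, random seeds, and hyperparameters are assumptions for illustration, so its scores will not match the table.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem with one linear and one nonlinear effect.
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 5))
y = 5.0 + 0.8 * X[:, 0] + 0.5 * np.sin(2 * X[:, 1]) + rng.normal(0, 0.1, 400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters are illustrative defaults, not the project's settings.
models = {
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Ridge Regression": Ridge(alpha=1.0),
    "Linear Regression": LinearRegression(),
    "Lasso Regression": Lasso(alpha=0.01),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }

# Pick the model with the highest R² on the held-out split.
best = max(results, key=lambda name: results[name]["r2"])
for name, scores in results.items():
    print(name, scores)
print("best:", best)
```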


Feature Importance (from Gradient Boosting)

Top influential features for magnitude prediction:

  1. sig (significance factor)
  2. dmin (distance to nearest station)
  3. tsunami
  4. rms (root mean square of travel time residuals)
  5. magType

Visualizations:

  • Actual vs Predicted Magnitude (Scatter Plot)
  • Residual Distribution (Prediction Errors)
  • Feature Importance (Gradient Boosting)
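A ranking like the one above can be read from a fitted model's `feature_importances_` attribute. The sketch below fits a Gradient Boosting model on synthetic data in which the first column is deliberately made dominant; the feature names are reused from this README only as labels.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Labels borrowed from the README; the data itself is random.
feature_names = ["sig", "dmin", "rms", "latitude", "longitude"]

rng = np.random.default_rng(0)
X = rng.normal(size=(300, len(feature_names)))
# Assumption for the demo: the target depends mostly on the first column.
y = 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.1, 300)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances describe how the model uses each feature, not whether a feature causes the target, so a dominant feature is worth checking for overlap with the target's definition.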

Model Insights

  • The sig feature dominates prediction importance, indicating event significance is a strong indicator of magnitude.
  • Geographical coordinates (latitude, longitude) contribute less, suggesting local variations are less dominant than seismic parameters.
  • Ensemble models (Random Forest, Gradient Boosting) outperform linear models due to their ability to capture nonlinear relationships.

Evaluation: Actual vs Predicted Magnitude

The actual vs predicted plot shows strong alignment:

  • Points are closely scattered around the diagonal line (ideal prediction).
  • Gradient Boosting demonstrates minimal deviation and low residual error.
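An actual vs predicted plot with its diagonal reference line, plus a residual histogram, can be drawn as sketched below. The actual and predicted arrays are synthetic placeholders; with a real model they would be the test targets and `model.predict(X_test)`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import numpy as np

# Placeholder values: predictions are the actual values plus small noise.
rng = np.random.default_rng(1)
actual = rng.uniform(3.0, 8.0, 200)
predicted = actual + rng.normal(0, 0.2, 200)
residuals = actual - predicted

fig, (ax_scatter, ax_resid) = plt.subplots(1, 2, figsize=(10, 4))

# Left panel: points near the dashed diagonal are accurate predictions.
ax_scatter.scatter(actual, predicted, s=10, alpha=0.6)
lims = [actual.min(), actual.max()]
ax_scatter.plot(lims, lims, color="red", linestyle="--", label="ideal prediction")
ax_scatter.set_xlabel("Actual magnitude")
ax_scatter.set_ylabel("Predicted magnitude")
ax_scatter.legend()

# Right panel: residuals centered on zero indicate no systematic bias.
ax_resid.hist(residuals, bins=30)
ax_resid.set_xlabel("Residual (actual - predicted)")
ax_resid.set_ylabel("Count")

fig.tight_layout()
```

Calling `fig.savefig` with a file name writes the figure to disk; in a notebook the figure renders inline without the `Agg` backend line.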

Results & Insights

  • Gradient Boosting performed best with lowest RMSE (0.23) and highest R² (0.92).
  • Event significance (sig) and station-related parameters (dmin, rms) were the most influential features; geographic location (longitude, latitude) contributed less.
  • Linear models underperformed compared to ensemble methods.

Technologies Used

  • Language: Python 3.x
  • Libraries: pandas, numpy, matplotlib, seaborn, scikit-learn

How to Run

  1. Clone this repository:
    git clone https://github.com/<your-username>/earthquake-prediction.git
  2. Run the Jupyter Notebook:
    jupyter notebook earthquake_magnitude_predictions.ipynb
