Earthquake Magnitude Prediction Model

🔹 Problem Statement

Predict the magnitude of earthquakes using historical earthquake event data.
Accurate magnitude prediction helps assess potential risks, improve disaster preparedness, and support seismological research.

Dataset Description

Source: Kaggle – Earthquake Dataset
Target Variable: magnitude
Features used:

longitude, latitude – location of the earthquake
year, month – time-based features
depth, sig, nst, dmin, rms, gap, tsunami – seismological attributes
magType, type – event type indicators
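As a rough illustration of how these columns could be turned into a model-ready feature matrix, the sketch below one-hot encodes the two categorical columns with pandas. The four rows are made-up placeholder values, not records from the Kaggle file, and one-hot encoding is an assumption here; the notebook may encode `magType` and `type` differently.

```python
import pandas as pd

# Placeholder rows for illustration only; these are NOT real catalog entries.
df = pd.DataFrame({
    "longitude": [10.0, 20.0, 30.0, 40.0],
    "latitude": [1.0, 2.0, 3.0, 4.0],
    "year": [2001, 2002, 2003, 2004],
    "month": [1, 4, 7, 10],
    "depth": [10.0, 35.0, 70.0, 5.0],
    "sig": [100, 250, 400, 60],
    "nst": [20, 45, 80, 12],
    "dmin": [0.5, 1.0, 2.0, 0.1],
    "rms": [0.3, 0.6, 0.9, 0.2],
    "gap": [40.0, 60.0, 30.0, 90.0],
    "tsunami": [0, 0, 1, 0],
    "magType": ["mb", "ml", "mb", "mww"],
    "type": ["earthquake"] * 4,
    "magnitude": [4.0, 4.8, 5.6, 3.2],
})

numeric_cols = ["longitude", "latitude", "year", "month", "depth",
                "sig", "nst", "dmin", "rms", "gap", "tsunami"]
categorical_cols = ["magType", "type"]

# One-hot encode the categorical columns; numeric columns pass through unchanged.
X = pd.get_dummies(df[numeric_cols + categorical_cols], columns=categorical_cols)
y = df["magnitude"]

print(X.columns.tolist())
```

With the real dataset, the placeholder frame would be replaced by a `pd.read_csv(...)` call on the downloaded file.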

Exploratory Data Analysis (EDA)

The analysis examined relationships and trends in the data:

Distribution of Magnitudes: Most earthquakes have magnitudes between 3.0 and 6.0.
Depth vs Magnitude Correlation: Deeper earthquakes show slightly lower magnitudes on average.
Regional Patterns: Higher activity clusters observed near specific latitude-longitude ranges.
Temporal Trends: Magnitude patterns analyzed over time using year and month.

Key visualizations include:

Magnitude distribution histogram
Scatter plot of depth vs magnitude
Heatmap of feature correlations
Map of earthquake locations (longitude vs latitude)
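The four plots listed above could be produced along the following lines. This is a minimal sketch using plain matplotlib on randomly generated stand-in data, so the shapes it draws say nothing about the real dataset; the column subset and the `coolwarm` colormap are arbitrary choices, and the notebook may well use seaborn instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Random stand-in data; replace with the loaded earthquake DataFrame.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "longitude": rng.uniform(-180, 180, n),
    "latitude": rng.uniform(-90, 90, n),
    "depth": rng.uniform(0, 700, n),
    "magnitude": rng.uniform(3.0, 6.0, n),
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# 1. Magnitude distribution histogram
axes[0, 0].hist(df["magnitude"], bins=30)
axes[0, 0].set_title("Magnitude distribution")

# 2. Scatter plot of depth vs magnitude
axes[0, 1].scatter(df["depth"], df["magnitude"], s=5)
axes[0, 1].set_title("Depth vs magnitude")

# 3. Heatmap of pairwise Pearson correlations
corr = df.corr()
im = axes[1, 0].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[1, 0].set_xticks(range(len(corr.columns)))
axes[1, 0].set_xticklabels(corr.columns, rotation=45, ha="right")
axes[1, 0].set_yticks(range(len(corr.columns)))
axes[1, 0].set_yticklabels(corr.columns)
axes[1, 0].set_title("Feature correlations")
fig.colorbar(im, ax=axes[1, 0])

# 4. Earthquake locations (longitude vs latitude)
axes[1, 1].scatter(df["longitude"], df["latitude"], s=5)
axes[1, 1].set_title("Event locations")

fig.tight_layout()
```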

Machine Learning Models

Model	RMSE	MAE	R² Score
Gradient Boosting	0.233	0.0299	0.9184
Random Forest	0.238	0.0277	0.9150
Ridge Regression	0.268	0.130	0.8920
Linear Regression	0.268	0.130	0.8920
Lasso Regression	0.272	0.131	0.8889

Best Model: Gradient Boosting Regressor (R² = 0.918)

It achieved the highest R² (0.918) and the lowest RMSE (0.233) of the five models. Random Forest had a slightly lower MAE (0.0277 vs 0.0299), so the two ensemble models are close.
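A comparison like the one in the table can be sketched as follows. The data here is synthetic, and the 80/20 split, the hyperparameters, and the `random_state` values are all assumptions, so the printed numbers will not match the table. RMSE is computed as the square root of `mean_squared_error` so the snippet does not depend on a particular scikit-learn version.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression problem standing in for the earthquake features.
rng = np.random.default_rng(42)
X = rng.normal(size=(600, 5))
y = 4.5 + 0.8 * X[:, 0] + 0.5 * np.sin(3 * X[:, 1]) + rng.normal(scale=0.1, size=600)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hyperparameters are illustrative defaults, not the notebook's settings.
models = {
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Ridge Regression": Ridge(alpha=1.0),
    "Linear Regression": LinearRegression(),
    "Lasso Regression": Lasso(alpha=0.01),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
        "MAE": float(mean_absolute_error(y_test, pred)),
        "R2": float(r2_score(y_test, pred)),
    }

# Print the models from lowest to highest RMSE.
for name, m in sorted(results.items(), key=lambda kv: kv[1]["RMSE"]):
    print(f"{name:<18} RMSE={m['RMSE']:.3f}  MAE={m['MAE']:.3f}  R2={m['R2']:.3f}")
```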

Feature Importance (from Gradient Boosting)

Top influential features for magnitude prediction:

sig (significance factor)
dmin (distance to nearest station)
tsunami
rms (root mean square of travel time residuals)
magType

Visualized below:

Actual vs Predicted Magnitude (Scatter Plot)
Residual Distribution (Prediction Errors)
Feature Importance (Gradient Boosting)
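For reference, a ranking like the one above can be read from a fitted scikit-learn `GradientBoostingRegressor` through its `feature_importances_` attribute. The sketch below uses a synthetic target that is deliberately built so the column named `sig` dominates; that is a property of the demo data, not evidence about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
feature_names = ["sig", "dmin", "rms", "depth", "latitude"]
X = pd.DataFrame(rng.normal(size=(400, 5)), columns=feature_names)

# Synthetic target: constructed so "sig" matters most (demo assumption).
y = 2.0 * X["sig"] + 0.5 * X["dmin"] + rng.normal(scale=0.1, size=400)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = pd.Series(
    model.feature_importances_, index=feature_names
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances can favor features that correlate with others, so a permutation-based check on held-out data is a reasonable complement.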

Model Insights

The sig feature dominates prediction importance, indicating event significance is a strong indicator of magnitude.
Geographical coordinates (latitude, longitude) contribute less, suggesting local variations are less dominant than seismic parameters.
Ensemble models (Random Forest, Gradient Boosting) outperform linear models due to their ability to capture nonlinear relationships.
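The last point can be seen on a toy problem. In the sketch below a single synthetic feature drives the target through a sine curve; a linear model can only fit a straight line through it, while gradient boosting can follow the curve. The data and settings are invented for illustration and are unrelated to the earthquake features.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# One feature with a purely nonlinear effect on the target.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * x[:, 0]) + rng.normal(scale=0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=1
)

linear = LinearRegression().fit(X_train, y_train)
boosted = GradientBoostingRegressor(random_state=1).fit(X_train, y_train)

linear_r2 = r2_score(y_test, linear.predict(X_test))
boosted_r2 = r2_score(y_test, boosted.predict(X_test))

print(f"Linear R2:  {linear_r2:.3f}")
print(f"Boosted R2: {boosted_r2:.3f}")
```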

Evaluation: Actual vs Predicted Magnitude

The actual vs predicted plot shows strong alignment:

Points are closely scattered around the diagonal line (ideal prediction).
Gradient Boosting demonstrates minimal deviation and low residual error.
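An actual-vs-predicted plot and the residuals behind it can be sketched as below. The five value pairs are hypothetical and exist only to make the snippet runnable; in the notebook they would be the test targets and the model's predictions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical actual and predicted magnitudes (illustrative values only).
y_true = np.array([4.2, 5.1, 3.8, 6.0, 4.9])
y_pred = np.array([4.0, 5.3, 3.9, 5.7, 5.0])

# Residual = actual minus predicted; points on the diagonal have residual 0.
residuals = y_true - y_pred
rmse = float(np.sqrt(np.mean(residuals ** 2)))
mae = float(np.mean(np.abs(residuals)))

fig, (ax_scatter, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter of actual vs predicted with the ideal y = x line.
lo = min(y_true.min(), y_pred.min())
hi = max(y_true.max(), y_pred.max())
ax_scatter.scatter(y_true, y_pred)
ax_scatter.plot([lo, hi], [lo, hi], linestyle="--", color="gray")
ax_scatter.set_xlabel("Actual magnitude")
ax_scatter.set_ylabel("Predicted magnitude")

# Distribution of prediction errors.
ax_hist.hist(residuals, bins=5)
ax_hist.set_xlabel("Residual (actual - predicted)")

fig.tight_layout()
print(f"RMSE={rmse:.3f}  MAE={mae:.3f}")
```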

Results & Insights

Gradient Boosting performed best with lowest RMSE (0.23) and highest R² (0.92).
The sig feature and seismic network parameters (dmin, rms) were most influential; geographic location (longitude, latitude) contributed less.
Linear models underperformed compared to ensemble methods.

Technologies Used

Language: Python 3.x
Libraries:
pandas, numpy
matplotlib, seaborn
scikit-learn

How to Run

Clone this repository:

git clone https://github.com/<your-username>/earthquake-prediction.git

Run the Jupyter Notebook:

jupyter notebook earthquake_magnitude_predictions.ipynb
