Final Report Peer Review #100

@cmh342

This project aims to predict the level of earthquake damage to buildings in Nepal, using a dataset called “Richter’s Predictor: Modeling Earthquake Damage.” The goals of the project are to perform EDA, examine the correlation between damage grade (on a scale of 1 to 3) and the other variables, and fit multiple regression models and decision trees.

The group standardized some variables and converted others to binary. In doing so, they gained a clear understanding of which variables played important roles in the predictions and which did not. Using a correlation heatmap, the group identified which variables had the highest correlation with damage grade. They then generated scatter plots and histograms to present the data in a clear and orderly fashion, and from these plots they were able to identify skews in the distributions. Though their regularized regression models were not as accurate as they had hoped, they were able to identify and address the reasons behind the shortcomings. The multinomial linear regression model used hyperparameter tuning on 5 validation sets and, after being refit with the optimal hyperparameters, reached an accuracy of 74.1%. The decision trees and random forests had an accuracy of 88% on the training data and 72% on the testing data.
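To make the correlation step concrete, here is a minimal sketch of how features might be ranked by their correlation with damage grade. It is not the group's code: the feature names (`building_age`, `floor_count`, `has_mud_mortar`) and all values are made up for illustration, and Pearson correlation is an assumption about what the heatmap showed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy data: names and values are illustrative only.
toy_features = {
    "building_age": [5, 10, 20, 40, 60, 80],
    "floor_count": [1, 2, 2, 3, 2, 3],
    "has_mud_mortar": [0, 0, 1, 1, 1, 1],  # binary-encoded variable
}
toy_damage_grade = [1, 1, 2, 2, 3, 3]

ranking = rank_by_correlation(toy_features, toy_damage_grade)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Sorting by absolute value keeps strongly negative correlations near the top, which matters when reading a heatmap for the most informative variables.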

What I liked:
All methods are well explained, and the methods used show deep understanding of the class material and the implications of using the specific models.
I appreciated the inclusion of some of the equations, as they reminded me exactly what the functions represented and what values they were working with.
The depth in the discussions of results was great. I was able to clearly understand how all of the identified features affected the overall outcomes of the models.

What could use improvement:
This is a tough section since I think this project is very well done!
Including some of the plots off to the side instead of referring to the appendix could be helpful (though with the page limit I see why this decision was made).
For the confusion matrices in Figure 5, some of the colors might be hard for some readers to differentiate, so a different color scale might be useful.
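As one possible way to act on the color-scale suggestion, the sketch below builds a confusion matrix from toy labels and draws it with matplotlib's `cividis` colormap, which was designed with color-vision deficiency in mind. The labels, the helper names, and the use of matplotlib are all assumptions; the report's actual plotting code is not known.

```python
def confusion_matrix(actual, predicted, labels):
    """Count matrix with rows = actual class and columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def plot_confusion(matrix, labels, path):
    """Save a heatmap of the matrix to `path` (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    image = ax.imshow(matrix, cmap="cividis")
    fig.colorbar(image, ax=ax)
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_yticks(range(len(labels)))
    ax.set_yticklabels(labels)
    ax.set_xlabel("Predicted damage grade")
    ax.set_ylabel("Actual damage grade")
    # Write the count in every cell so the figure does not rely on color alone.
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            ax.text(j, i, str(count), ha="center", va="center",
                    bbox=dict(facecolor="white", edgecolor="none"))
    fig.savefig(path)
    plt.close(fig)


# Hypothetical toy labels on the 1 to 3 damage scale.
toy_actual = [1, 1, 2, 2, 2, 3, 3, 3]
toy_predicted = [1, 2, 2, 2, 3, 3, 3, 2]
toy_matrix = confusion_matrix(toy_actual, toy_predicted, labels=[1, 2, 3])
```

Calling `plot_confusion(toy_matrix, [1, 2, 3], "confusion.png")` would write the figure to disk; the file name is arbitrary.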

Overall, I think this is a great project and it was very fun to read and see how your models performed. Good job and enjoy your break!
