Final Report Peer Review #99

@JudyZ98

This project aims to find the most suitable hospital for NY citizens seeking treatment for septicemia. The team uses two types of data: inpatient data and hospital data.

Things I like:

  1. They combined three features to create a new feature in the data preprocessing part.
  2. For categorical data transformations, they use different encoding methods based on the type of data. They also deal with miscellaneous data in a clever way.
  3. I like the way they visualize how “fair” their model is by showing the cost distribution for different groups.

Suggestions:

  1. I’m confused by this sentence: “some features were selected by ourselves according to the assumptions on features which could have hi effects on the cost of visiting hospitals for each patient”. What method did your team actually use to choose these features? Was it the correlation coefficient, data visualization, or simply a guess that some features are bound to be significant in predicting y? I think the report would be better if you elaborated on this.
  2. TYPO: Fairness “Matrics” should be “Metrics”.
  3. I got lost in the part where your team uses random forest for feature selection. How were the features chosen? Why are you showing Fig. 4? The report does not seem to explain what Fig. 4 shows.
  4. I’d like more interpretation of your final models. You compare your models with the assumptions made at the very beginning, which is good, but I’m still unsure which features are the most important. Maybe you could use a plot to visualize the feature importance.
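
To make the feature-importance suggestion concrete, here is a minimal sketch of how a ranking could be produced with scikit-learn's `RandomForestRegressor` and drawn as a bar chart. It is not your team's pipeline: the data is synthetic, and the feature names (`length_of_stay`, `severity`, `age_group`, `noise`) and the helper functions are hypothetical placeholders I made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_synthetic_data(n_rows=500, seed=0):
    """Synthetic stand-in for the cost data; feature names are hypothetical."""
    rng = np.random.default_rng(seed)
    names = ["length_of_stay", "severity", "age_group", "noise"]
    X = rng.normal(size=(n_rows, len(names)))
    # Assumed relationship: cost depends strongly on the first feature,
    # weakly on the second, and not at all on the remaining two.
    y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n_rows)
    return X, y, names


def rank_features(X, y, names, seed=0):
    """Fit a random forest and return (name, importance) pairs, highest first."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X, y)
    pairs = zip(names, model.feature_importances_)
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def plot_importances(ranked, path="feature_importance.png"):
    """Save a horizontal bar chart of the ranking (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    labels = [name for name, _ in ranked]
    values = [score for _, score in ranked]
    fig, ax = plt.subplots()
    # Reverse so the most important feature appears at the top of the chart.
    ax.barh(labels[::-1], values[::-1])
    ax.set_xlabel("Impurity-based importance")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


if __name__ == "__main__":
    X, y, names = make_synthetic_data()
    for name, score in rank_features(X, y, names):
        print(f"{name}: {score:.3f}")
```

Calling `plot_importances(rank_features(X, y, names))` would write the chart to a PNG file. Impurity-based importances can favor high-cardinality features, so a sentence in the report on how the ranking was computed would help readers interpret a figure like this.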
