UCSC Engineering 2020

This project uses machine learning to predict diagnoses of congestive heart failure or myocardial infarction in patients. While it focuses primarily on these two conditions, the code can be generalized to any number of diseases or conditions within Synthea.

Data Generation

Synthea is a tool for generating synthetic patient records as JSON files in FHIR format. Each patient's medical history is randomly generated using the modules supplied by Synthea. These modules can be viewed and edited with the Synthea Module Builder; however, we left them unchanged when generating our dataset.
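To give a feel for the generated data, here is a minimal sketch of reading one FHIR Bundle and listing the conditions it records. The function name `condition_names` and the hand-written `sample_bundle` are illustrative assumptions, not part of this repository; a real Synthea file holds many more resource types and fields.

```python
import json


def condition_names(bundle):
    """Return the names of all Condition resources in a FHIR Bundle dict."""
    names = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Condition":
            continue
        code = resource.get("code", {})
        codings = code.get("coding", [])
        # Prefer the human-readable text, fall back to the first coding's display.
        name = code.get("text") or (codings[0].get("display") if codings else None)
        if name:
            names.append(name)
    return names


# A tiny hand-written bundle standing in for a Synthea patient file.
# The "code" value is a placeholder, not a real terminology code.
sample_bundle = json.loads("""
{
  "resourceType": "Bundle",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "example"}},
    {"resource": {
      "resourceType": "Condition",
      "code": {
        "coding": [{"code": "EXAMPLE", "display": "Myocardial Infarction"}],
        "text": "Myocardial Infarction"
      }
    }}
  ]
}
""")

print(condition_names(sample_bundle))
```

For a file on disk, the same function can be applied to the result of `json.load` on the opened file.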

Data Pre-processing & Transformation

Files

synthea_data_pipeline.ipynb

Brief Summary

synthea_data_pipeline.ipynb is a Jupyter notebook that takes a folder of Synthea-generated JSON files as input. It extracts all relevant medical data for each patient and transforms it into embeddings, a format that machine learning models can read. The embeddings are exported as CSV and NPY files, ready to be passed to Model Testing & Training.
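The idea of turning per-patient records into a numeric, model-readable table can be sketched as follows. This is not the notebook's embedding method, which is not described here; it swaps in a plain multi-hot encoding over condition names, written with the standard library only. The names `build_vocabulary`, `multi_hot`, `to_csv`, and the example patients are all hypothetical.

```python
import csv
import io


def build_vocabulary(patients):
    """Map each distinct condition name to a column index, in sorted order."""
    names = sorted({name for conditions in patients.values() for name in conditions})
    return {name: index for index, name in enumerate(names)}


def multi_hot(conditions, vocabulary):
    """Encode one patient's conditions as a 0/1 vector over the vocabulary."""
    vector = [0] * len(vocabulary)
    for name in conditions:
        vector[vocabulary[name]] = 1
    return vector


def to_csv(patients, vocabulary):
    """Render one row per patient: the patient id followed by the 0/1 vector."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["patient_id"] + list(vocabulary))
    for patient_id, conditions in patients.items():
        writer.writerow([patient_id] + multi_hot(conditions, vocabulary))
    return buffer.getvalue()


# Made-up patients; in practice these would come from the extracted FHIR data.
example_patients = {
    "patient-a": ["Hypertension", "Myocardial Infarction"],
    "patient-b": ["Congestive heart failure"],
}
vocabulary = build_vocabulary(example_patients)
print(to_csv(example_patients, vocabulary))
```

A multi-hot table is easy to inspect by eye, which makes it a reasonable baseline to compare against learned embeddings.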

Overview

Model Testing & Training

Files

training.py
- python3 training.py <folder with csvs> <label>
w2vtrain.py
- python3 w2vtrain.py <folder containing npy files>

Brief Summary

The CSV and NPY output folders are used as input for training.py and w2vtrain.py, respectively. Both scripts use various machine learning models to produce a prediction accuracy.
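As a rough illustration of what "prediction accuracy" means here, the sketch below splits labeled rows into training and test sets and scores a majority-class baseline. The models used by training.py and w2vtrain.py are not listed in this README, so the baseline is only a stand-in; `split_rows`, `majority_label`, `accuracy`, and the example rows are hypothetical.

```python
import random
from collections import Counter


def split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, int(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


def majority_label(rows):
    """Return the most common label among (features, label) rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Made-up (feature vector, label) rows; label 1 means the condition is present.
example_rows = [
    ([0, 1, 1], 1), ([1, 0, 0], 0), ([0, 1, 0], 0), ([1, 1, 1], 1),
    ([0, 0, 0], 0), ([1, 0, 1], 1), ([0, 0, 1], 0), ([1, 1, 0], 0),
]
train_rows, test_rows = split_rows(example_rows)
baseline = majority_label(train_rows)
test_labels = [label for _, label in test_rows]
print(accuracy([baseline] * len(test_labels), test_labels))
```

Any trained model should beat this baseline on held-out data; if it does not, the features or labels likely need another look.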



anthem-ai/ucsc-engineering-2020
