lisaguim/PredictiveModel_BreastCancer

Predictive Model for Breast Cancer

This project, developed in a Jupyter notebook, builds a predictive model that classifies whether a person has cancer from a microRNA sequencing exam, using data from TCGA. It also serves as a study of dimensionality reduction (PCA), logistic regression, and model training.
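
As a rough illustration of how PCA and logistic regression can be chained, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the data are synthetic, and the feature scaling step, the number of components (10), the 0/1 label encoding, and the 75/25 split are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (NOT the TCGA data).
rng = np.random.default_rng(0)
n_samples, n_features = 200, 50
y = rng.integers(0, 2, size=n_samples)  # assumed encoding: 1 = tumor, 0 = normal
X = rng.gamma(shape=2.0, scale=1.0, size=(n_samples, n_features))  # non-negative values
X[:, :5] += 3.0 * y[:, None]  # make the first five features depend on the label

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize, project onto 10 principal components, then fit the classifier.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping the steps in a pipeline means the scaler and PCA are fitted on the training split only, which avoids leaking information from the test samples into the projection.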

Data Description

Data were collected from The Cancer Genome Atlas Program (TCGA), an international, world-class program that has characterized 33 types of cancer. The data are real and have been properly anonymized. Each row represents a sample taken from a person. Each column is a type of microRNA, and each entry is the intensity with which that microRNA is expressed. Expression values lie in [0, ∞): values close to zero indicate low expression, while larger values indicate high expression. The data are also labeled (see the class attribute), with TP (primary solid tumor) indicating a tumor and NT (normal tissue) indicating no tumor.
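
The layout described above can be sketched with pandas as follows. The table is a tiny made-up example, not real TCGA values; the column names `mir_a` and `mir_b` are hypothetical, the mapping of NT to 0 and TP to 1 is an assumed encoding, and the `log1p` transform is a common choice for non-negative expression data rather than something the project is known to use.

```python
import numpy as np
import pandas as pd

# Tiny made-up table: one row per sample, one column per microRNA,
# plus a "class" column holding the TP/NT label.
df = pd.DataFrame(
    {
        "mir_a": [0.0, 12.5, 300.0, 1.2],  # hypothetical microRNA columns
        "mir_b": [5.0, 0.0, 48.0, 7.5],
        "class": ["NT", "TP", "TP", "NT"],
    }
)

# Separate the expression features from the label column.
X = df.drop(columns="class")

# Assumed encoding: NT (normal tissue) -> 0, TP (primary solid tumor) -> 1.
y = df["class"].map({"NT": 0, "TP": 1})

# Optional: compress the [0, inf) range with log(1 + x); zero stays zero.
X_log = np.log1p(X)
```

Here `X_log` keeps the same rows and columns as `X`, and `y` is an integer series aligned with them, which is the shape most scikit-learn estimators expect.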

Source dataset and Reference

Language Used

Python

Libraries

  • Pandas
  • NumPy
  • Seaborn
  • Matplotlib
  • scikit-learn (sklearn)
