cristina86cristina/stability_selection_example

---
title: "README"
author: "Cristina Venturini"
date: "`r Sys.Date()`"
output:
  prettydoc::html_pretty:
    theme: architect
    highlight: github
---

This repository contains scripts for running stability selection.

The scripts/ folder contains the main script, stability_selection_example.py, and stability_shuffle_labels.py, which runs the stability selection with shuffled labels. Both scripts must be edited before you run them on your own data.
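To illustrate what shuffling labels means here, the following is a minimal sketch that permutes class labels across samples so that any link between features and labels is broken. The `shuffle_labels` helper, the seed, and the toy labels are hypothetical, and stability_shuffle_labels.py may do this differently.

```python
import numpy as np


def shuffle_labels(label, seed=0):
    """Return a randomly permuted copy of the label vector.

    Hypothetical helper: it keeps the class counts but reassigns
    the labels to samples at random.
    """
    rng = np.random.RandomState(seed)
    return rng.permutation(np.asarray(label))


# Toy labels for six samples (made up for illustration).
label = np.array([0, 0, 0, 1, 1, 1])
shuffled = shuffle_labels(label)
```

Running the selection on permuted labels gives a baseline for how often a feature is selected by chance alone.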

## Parameters:
- iter_num: number of iterations (we use 10000 on our own datasets)
- sampled_features: number of features (genes) to include in each iteration (we usually use 10% of the total number of genes)
- k_subsample: k in k-fold cross-validation (usually 5)
- data_tot: name of your input file in CSV format (an example input file can be found in the data/ folder)
- `label = np.squeeze(label.reshape(x,1))`: replace `x` with the number of samples
- `data = data_tot.loc[:,'A2M':].values`: replace `A2M` with the first gene/feature in your dataset
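As a rough sketch of how these settings fit together, the snippet below loads a small in-memory table shaped like the example input and derives the label vector and feature matrix. The gene names other than A2M, the toy values, and the `load_inputs` helper are hypothetical, and the actual scripts may organize this differently. The sketch is written for Python 3 with pandas and NumPy, whereas the scripts were tested with Python 2.7.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical toy input: samples as rows, "Patient" and "Class" first,
# then one column per gene. Only A2M comes from the README; the other
# gene names and all values are made up for illustration.
EXAMPLE_CSV = """Patient,Class,A2M,GENE_B,GENE_C,GENE_D
P1,0,1.2,0.4,3.1,2.2
P2,1,0.8,1.9,2.7,0.3
P3,0,1.5,0.2,3.3,1.8
P4,1,0.7,2.1,2.5,0.6
"""

iter_num = 100     # the README reports 10000 for real datasets
k_subsample = 5    # k in k-fold cross-validation


def load_inputs(csv_source, first_feature):
    """Return (label, data, feature_names) from a CSV laid out as above."""
    data_tot = pd.read_csv(csv_source)
    label = data_tot["Class"].values
    # Same as np.squeeze(label.reshape(x, 1)) with x = number of samples.
    label = np.squeeze(label.reshape(len(label), 1))
    features = data_tot.loc[:, first_feature:]
    return label, features.values, list(features.columns)


label, data, feature_names = load_inputs(io.StringIO(EXAMPLE_CSV), "A2M")

# Roughly 10% of the features per iteration, but never fewer than one.
sampled_features = max(1, int(round(0.1 * data.shape[1])))
```

Deriving the number of samples from the data with `len(label)` avoids editing `x` by hand for each dataset.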

Run from the command line (tested with Python 2.7):
```{bash}
python scripts/stability_selection_example.py 
```

## Data input:
An example data file can be found in data/. The input must be a .csv file with samples as rows and genes/features as columns. The first column is the sample ID ("Patient") and the second column is the classification label ("Class").
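A quick layout check before a long run can catch a malformed file early. The sketch below uses only the standard library and assumes the header names are exactly "Patient" and "Class". The `check_input_layout` helper and the sample rows are hypothetical and not part of the repository.

```python
import csv
import io


def check_input_layout(csv_text):
    """Return the feature names if the CSV text matches the expected layout."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    if header[:2] != ["Patient", "Class"]:
        raise ValueError("first two columns must be 'Patient' and 'Class'")
    features = header[2:]
    if not features:
        raise ValueError("no gene/feature columns found")
    # Every sample row must have one value per header column.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            raise ValueError(
                "row %d has %d fields, expected %d"
                % (line_number, len(row), len(header))
            )
    return features


# Made-up two-sample file for illustration.
sample_text = "Patient,Class,A2M,GENE_B\nP1,0,1.0,2.0\nP2,1,0.5,1.5\n"
feature_names = check_input_layout(sample_text)
```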
