deep-kapadia-6/LSC-Identification-Model

🧬 LSC Identification Model: Automated Training & Feature Engineering Pipeline

ML-Driven Precision Oncology: Automated Leukemic Stem Cell (LSC) Identification

📌 Project Overview

Distinguishing Leukemic Stem Cells (LSCs) from non-leukemic cells is a critical bottleneck in treating T-cell acute lymphoblastic leukemia (T-ALL). This project implements a high-precision classification model that integrates multi-omic signatures (chromatin opening, RNA, and surface epitopes) to automate LSC identification and accelerate therapeutic pathfinding.

🚀 Key Results

  • High-Fidelity Accuracy: Achieved >95% accuracy in identifying LSCs from novel transcriptomic inputs, reaching 99.9% on the test set.

  • Clinical Impact: Successfully identified driver mutations in Minimal Residual Disease (MRD) datasets, providing a scalable framework for personalized medicine.

  • Multi-Platform Robustness: Validated across diverse high-throughput sequencing technologies, including TEA-seq, CITE-seq, and scRNA-seq.
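To make the accuracy figures above concrete, here is a minimal sketch of how accuracy and per-class recall can be computed from predicted and true labels. The label names and the toy predictions are hypothetical and are not taken from the project's data. Recall for the LSC class is shown alongside accuracy because LSCs are typically a small fraction of cells, so accuracy alone can hide missed LSCs.

```python
def accuracy(y_true, y_pred):
    """Fraction of cells whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall_for(y_true, y_pred, label):
    """Fraction of cells with true label `label` that were predicted as `label`."""
    relevant = [(t, p) for t, p in zip(y_true, y_pred) if t == label]
    if not relevant:
        return 0.0
    return sum(p == label for _, p in relevant) / len(relevant)


# Hypothetical toy labels for eight cells (illustration only).
y_true = ["LSC", "LSC", "non-LSC", "non-LSC", "non-LSC", "LSC", "non-LSC", "non-LSC"]
y_pred = ["LSC", "non-LSC", "non-LSC", "non-LSC", "non-LSC", "LSC", "non-LSC", "non-LSC"]

acc = accuracy(y_true, y_pred)
lsc_recall = recall_for(y_true, y_pred, "LSC")
print(f"accuracy={acc:.3f}, LSC recall={lsc_recall:.3f}")
```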

🛠️ Technical Implementation

  • Core Architecture: Stochastic Gradient Descent (SGD)-based classifier built with the CellTypist framework.

  • Signature Extraction: Automated feature selection using global chromatin opening as a primary molecular marker.

  • Pre-processing: Implemented custom normalization and scaling (MinMaxScaler) pipelines for high-dimensional genomic data.
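The scaling and SGD steps listed above can be sketched in plain Python. This is an illustration of the two techniques, not the project's pipeline: the repository uses scikit-learn's MinMaxScaler and CellTypist, whereas the helper names (`min_max_scale`, `train_sgd`, `predict`), the hyperparameters, and the two-feature toy data below are all assumptions chosen so the example runs without dependencies.

```python
import math
import random


def min_max_scale(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_sgd(X, y, epochs=200, lr=0.5, seed=0):
    """Logistic regression fitted with per-sample SGD updates on the log loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            err = p - y[i]  # gradient of the log loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def predict(X, w, b):
    """Label a sample 1 when its predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0
        for x in X
    ]


# Hypothetical toy data: two features per cell, label 1 = LSC, 0 = non-LSC.
raw = [
    [120.0, 3.0], [150.0, 2.5], [135.0, 4.0], [160.0, 3.5],
    [20.0, 3.2], [35.0, 2.8], [10.0, 3.9], [40.0, 3.0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

X = min_max_scale(raw)
w, b = train_sgd(X, labels)
preds = predict(X, w, b)
print("training accuracy:", sum(p == t for p, t in zip(preds, labels)) / len(labels))
```

Scaling first matters here because SGD updates are proportional to feature magnitude; without it, the first feature (values in the tens to hundreds) would dominate every step. In a real pipeline the scaler's minimum and range should be learned on the training split only and reused on new data.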

📂 Repository Structure

    ├── LSC_Model_Training.ipynb   # Full training pipeline and feature extraction
    ├── requirements.txt           # Environment dependencies
    └── README.md                  # Project documentation

⚙️ Getting Started

Prerequisites:

  1. Python 3.9+
  2. High-Performance Computing (HPC) environment recommended for multi-terabyte datasets
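A quick way to confirm the Python prerequisite before installing is a small version check. This is only a sketch; the helper name `meets_minimum` is hypothetical and not part of the repository.

```python
import sys

# Minimum interpreter version stated in the prerequisites.
MIN_PYTHON = (3, 9)


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if meets_minimum():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.9 or newer is required; found", sys.version.split()[0])
```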

Installation:

    git clone https://github.com/yourusername/LSC-Identification-Model.git
    cd LSC-Identification-Model
    pip install -r requirements.txt

About

High-accuracy ML model for leukemic stem cell (LSC) identification from single-cell multi-omics data (TEA‑seq, CITE‑seq, scRNA‑seq).
