This tool automatically extracts Lung RADS (Lung Imaging Reporting and Data System) scores from medical notes using AI. It reads a CSV file containing medical notes and outputs a CSV file with the extracted scores.
Before you begin, make sure you have the following installed:
- Python 3.8 or higher - Download Python
- Ollama - Download Ollama
- Git (if cloning from GitHub) - Download Git
- Go to https://ollama.ai/download
- Download Ollama for your operating system (Windows, Mac, or Linux)
- Install Ollama by running the installer
- Verify installation by opening a terminal/command prompt and running:
ollama --version
- Open a terminal/command prompt
- Run the following command to download the recommended model:
ollama pull llama3.2:3b
Note: This may take a few minutes depending on your internet connection. The model is approximately 2GB in size.
Alternative models (if you want to use a different one):
- `ollama pull llama3.1:8b` (more accurate, but larger and slower)
- `ollama pull phi3:latest` (good alternative)
- `ollama pull mistral:latest` (another good option)
Option A: Using Git (Recommended)
git clone https://github.com/vikram0230/Lung-RADS--Extractor.git
cd ollama
Option B: Download as ZIP
- Click the "Code" button on GitHub
- Select "Download ZIP"
- Extract the ZIP file to a folder
- Open a terminal/command prompt in that folder
- Open a terminal/command prompt in the project folder
- Install the required Python packages:
pip install -r requirements.txt
If you encounter permission errors, try:
pip install --user -r requirements.txt
- Create a folder named `data` in the project directory (if it doesn't exist)
- Place your CSV file in the `data` folder
- Important: Your CSV file must have a column named `NOTE_CONTENTS` containing the medical notes

Example CSV structure:
PATIENT_ID,ENCOUNTER_ID,NOTE_DATE,NOTE_CONTENTS
12345,67890,2022-01-01,"Medical note text here..."
12346,67891,2022-01-02,"Another medical note..."
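
Before starting a long run, it can help to confirm that the input file has the required column. The following is a minimal sketch using only the Python standard library; the helper name `has_required_column` is hypothetical and is not part of the tool, which uses pandas for its own data handling.

```python
import csv
import io

REQUIRED_COLUMN = "NOTE_CONTENTS"  # column name the tool expects


def has_required_column(csv_file, required=REQUIRED_COLUMN):
    """Return True if the CSV header row contains the required column."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return required in header


# Example with an in-memory CSV that follows the structure shown above.
sample = io.StringIO(
    "PATIENT_ID,ENCOUNTER_ID,NOTE_DATE,NOTE_CONTENTS\n"
    '12345,67890,2022-01-01,"Medical note text here..."\n'
)
print(has_required_column(sample))
```

To check a real file, pass an open file handle instead, for example `open("data/notes_filtered.csv", newline="", encoding="utf-8")`.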
Basic usage (using default file paths):
python lung_rads_extractor.py
This will:
- Read from: `data/notes_filtered.csv`
- Write to: `data/lung_rads_extracted.csv`
Custom input/output files:
python lung_rads_extractor.py --input data/my_notes.csv --output data/my_results.csv
Using a different AI model:
python lung_rads_extractor.py --model llama3.1:8b
Get help:
python lung_rads_extractor.py --help
The script will create a new CSV file with all the original columns plus a new column called `Lung Rad Score`. This column will contain:
- The extracted score (e.g., "4B", "3", "4A", "1") if found
- Empty/blank if no score was found in the note
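
Once a run finishes, a quick tally of filled versus blank scores shows how many notes produced a result. This is a small sketch using the standard library; `count_scores` is a hypothetical helper, and the only thing it assumes about the output file is the `Lung Rad Score` column described above.

```python
import csv
import io

SCORE_COLUMN = "Lung Rad Score"  # output column added by the script


def count_scores(csv_file, column=SCORE_COLUMN):
    """Return (found, blank): rows with a score and rows without one."""
    found = blank = 0
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip()
        if value:
            found += 1
        else:
            blank += 1
    return found, blank


# Example with an in-memory CSV; use open("data/lung_rads_extracted.csv") for a real file.
sample = io.StringIO(
    "PATIENT_ID,NOTE_CONTENTS,Lung Rad Score\n"
    "12345,first note,4B\n"
    "12346,second note,\n"
)
found, blank = count_scores(sample)
print(f"{found} with a score, {blank} blank")
```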
The script automatically creates log files in the logs/lung_rads_extraction/ folder. Each run creates a new log file with a timestamp. These logs contain detailed information about the extraction process.
| Option | Short | Description | Default |
|---|---|---|---|
| `--input` | `-i` | Input CSV file path | `data/notes_filtered.csv` |
| `--output` | `-o` | Output CSV file path | `data/lung_rads_extracted.csv` |
| `--model` | `-m` | Ollama model name | `llama3.2:3b` |
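
For readers wrapping the tool in their own script, the options in the table could be declared with `argparse` roughly as follows. This is an illustrative sketch that mirrors the documented flags and defaults; it is not taken from the source of `lung_rads_extractor.py`, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented command-line options."""
    parser = argparse.ArgumentParser(
        description="Extract Lung-RADS scores from medical notes."
    )
    parser.add_argument("-i", "--input", default="data/notes_filtered.csv",
                        help="Input CSV file path")
    parser.add_argument("-o", "--output", default="data/lung_rads_extracted.csv",
                        help="Output CSV file path")
    parser.add_argument("-m", "--model", default="llama3.2:3b",
                        help="Ollama model name")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--model", "llama3.1:8b"])
print(args.input, args.output, args.model)
```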
The script automatically saves progress every 20 rows, so if the process is interrupted, you won't lose all your work. The output file will contain all processed rows up to the point of interruption.
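
The periodic-save behavior can be pictured with a short sketch. It assumes a hypothetical `extract` callable standing in for the model call and rewrites the output file after every 20 rows; the script's actual implementation may differ.

```python
import csv

SAVE_EVERY = 20  # checkpoint interval described above


def write_rows(path, fieldnames, rows):
    """Write all rows processed so far, replacing any earlier checkpoint."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def process_with_checkpoints(rows, extract, out_path, fieldnames,
                             save_every=SAVE_EVERY):
    """Process rows one by one, saving to out_path every `save_every` rows."""
    done = []
    for i, row in enumerate(rows, start=1):
        # `extract` is a hypothetical stand-in for the call to the AI model.
        done.append({**row, "Lung Rad Score": extract(row["NOTE_CONTENTS"])})
        if i % save_every == 0:
            write_rows(out_path, fieldnames, done)
    write_rows(out_path, fieldnames, done)  # final save with every row
    return len(done)
```

With this pattern, an interruption after row 30 would leave the first 20 rows on disk, which matches the behavior described above.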
Solution: Make sure Ollama is running. Open a terminal and run:
ollama serve
Then run the script again in a different terminal window.
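
To check from Python whether the Ollama server is reachable before launching a long run, a sketch like the one below may help. It assumes Ollama is listening on its default local address, `http://localhost:11434`, and queries the `/api/tags` endpoint that lists downloaded models; `ollama_is_running` is a hypothetical helper, not part of the tool.

```python
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default local address


def ollama_is_running(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an Ollama server answers at base_url, else False."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: treat as not running.
        return False


if __name__ == "__main__":
    if ollama_is_running():
        print("Ollama is reachable.")
    else:
        print("Ollama is not reachable; try running `ollama serve`.")
```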
Solution: Download the model by running:
ollama pull llama3.2:3b
Solution:
- Check that your input file exists at the specified path
- Make sure you're running the script from the correct directory
- Use the full path:
python lung_rads_extractor.py --input /full/path/to/your/file.csv
Solution:
- Make sure your CSV file has a column named exactly `NOTE_CONTENTS`
- Check for typos or extra spaces in the column name
- The column name is case-sensitive
Solution: Install the required packages:
pip install -r requirements.txt
Solutions:
- Use a smaller model: `--model llama3.2:3b` (default, fastest)
- Make sure Ollama is running locally (not over network)
- Close other applications to free up system resources
If you encounter issues:
- Check the log files in `logs/lung_rads_extraction/` for detailed error messages
- Make sure all prerequisites are installed correctly
- Verify your input CSV file format matches the requirements
- Check that Ollama is running and the model is downloaded
This tool uses Ollama for local AI processing and pandas for data handling.