Lung RADS Score Extractor

This tool extracts Lung RADS (Lung Imaging Reporting and Data System) scores from medical notes using a language model that runs locally through Ollama. It reads a CSV file of medical notes and writes a CSV file with the extracted scores added.

📋 Prerequisites

Before you begin, make sure you have the following installed:

- Python 3.8 or higher
- Ollama
- Git (only if cloning from GitHub)

🚀 Step-by-Step Setup Instructions

Step 1: Install Ollama

1. Go to https://ollama.ai/download
2. Download Ollama for your operating system (Windows, Mac, or Linux)
3. Install Ollama by running the installer
4. Verify the installation by opening a terminal/command prompt and running:
```
ollama --version
```
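
The same check can be done from Python, which is handy if you want to confirm that the interpreter you will run the script with can see Ollama. This is a minimal sketch using only the standard library; the function name `ollama_version` is made up for illustration and is not part of this repository.

```python
import shutil
import subprocess


def ollama_version(executable="ollama"):
    """Return the output of `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # check=False: report whatever the command prints instead of raising.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    version = ollama_version()
    print(version if version else "Ollama was not found on PATH")
```

If this prints the "not found" message, reopen your terminal after installing Ollama so that the updated PATH is picked up.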

Step 2: Download the Required AI Model

1. Open a terminal/command prompt
2. Run the following command to download the recommended model:
```
ollama pull llama3.2:3b
```
Note: This may take a few minutes depending on your internet connection. The model is approximately 2GB in size.

Alternative models (if you want to use a different one):
- `ollama pull llama3.1:8b` (larger and slower, but generally more accurate)
- `ollama pull phi3:latest` (a good alternative)
- `ollama pull mistral:latest` (another good option)

Step 3: Clone or Download This Repository

Option A: Using Git (Recommended)

git clone https://github.com/vikram0230/Lung-RADS--Extractor.git
cd Lung-RADS--Extractor

Option B: Download as ZIP

1. Click the "Code" button on GitHub
2. Select "Download ZIP"
3. Extract the ZIP file to a folder
4. Open a terminal/command prompt in that folder

Step 4: Install Python Dependencies

1. Open a terminal/command prompt in the project folder
2. Install the required Python packages:
```
pip install -r requirements.txt
```
If you encounter permission errors, try:
```
pip install --user -r requirements.txt
```

Step 5: Prepare Your Input File

1. Create a folder named `data` in the project directory (if it doesn't exist)
2. Place your CSV file in the `data` folder

Important: Your CSV file must have a column named `NOTE_CONTENTS` containing the medical notes.

Example CSV structure:

PATIENT_ID,ENCOUNTER_ID,NOTE_DATE,NOTE_CONTENTS
12345,67890,2022-01-01,"Medical note text here..."
12346,67891,2022-01-02,"Another medical note..."
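
Before starting a long run, it can save time to confirm that the header row really contains `NOTE_CONTENTS`. The sketch below uses only Python's standard `csv` module; `has_required_column` is a hypothetical helper, not a function from `lung_rads_extractor.py`.

```python
import csv
import io

REQUIRED_COLUMN = "NOTE_CONTENTS"


def has_required_column(csv_file, required=REQUIRED_COLUMN):
    """Return True if the header row of an open CSV file contains `required` exactly."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    return required in header


# In-memory sample that mirrors the example structure above.
sample = io.StringIO(
    "PATIENT_ID,ENCOUNTER_ID,NOTE_DATE,NOTE_CONTENTS\n"
    '12345,67890,2022-01-01,"Medical note text here..."\n'
)
print(has_required_column(sample))
```

For a real file, pass `open("data/notes_filtered.csv", newline="", encoding="utf-8")` instead of the in-memory sample.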

Step 6: Run the Script

Basic usage (using default file paths):

python lung_rads_extractor.py

This will:

- Read from: `data/notes_filtered.csv`
- Write to: `data/lung_rads_extracted.csv`

Custom input/output files:

python lung_rads_extractor.py --input data/my_notes.csv --output data/my_results.csv

Using a different AI model:

python lung_rads_extractor.py --model llama3.1:8b

Get help:

python lung_rads_extractor.py --help

📊 Understanding the Output

The script creates a new CSV file with all the original columns plus a new column named `Lung Rad Score`. This column contains:

- The extracted score (e.g., "4B", "4A", "3", "1") if one was found
- An empty value if no score was found in the note
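
Once a run has finished, a quick tally of the `Lung Rad Score` column shows how many notes produced a score and whether any values look unusual. The sketch below is an illustration only: `summarize_scores` is a hypothetical helper, and the set of accepted values is an assumption based on the published Lung-RADS categories (0, 1, 2, 3, 4A, 4B, 4X). Anything outside that set is counted for manual review, not treated as definitely wrong.

```python
import csv
import io
from collections import Counter

SCORE_COLUMN = "Lung Rad Score"
# Assumed category set; modifiers such as "S" are not handled here.
KNOWN_CATEGORIES = {"0", "1", "2", "3", "4A", "4B", "4X"}


def summarize_scores(csv_file, column=SCORE_COLUMN):
    """Count blank, recognised, and unrecognised values in the score column."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        value = (row.get(column) or "").strip().upper()
        if not value:
            counts["blank"] += 1
        elif value in KNOWN_CATEGORIES:
            counts[value] += 1
        else:
            counts["unrecognised"] += 1
    return counts


sample = io.StringIO(
    "NOTE_CONTENTS,Lung Rad Score\n"
    "first note,4B\n"
    "second note,\n"
    "third note,3\n"
)
print(dict(summarize_scores(sample)))
```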

📝 Log Files

The script automatically creates log files in the logs/lung_rads_extraction/ folder. Each run creates a new log file with a timestamp. These logs contain detailed information about the extraction process.
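
To open the log from the most recent run without browsing the folder, you can pick the newest file by modification time. This sketch makes no assumption about how the log files are named; `newest_log` is a hypothetical helper.

```python
from pathlib import Path


def newest_log(log_dir="logs/lung_rads_extraction"):
    """Return the most recently modified file in log_dir, or None if there is none."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return None
    files = [p for p in directory.iterdir() if p.is_file()]
    # default=None covers an existing but empty directory.
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    print(newest_log())
```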

⚙️ Command-Line Options

| Option | Short | Description | Default |
|--------|-------|-------------|---------|
| `--input` | `-i` | Input CSV file path | `data/notes_filtered.csv` |
| `--output` | `-o` | Output CSV file path | `data/lung_rads_extracted.csv` |
| `--model` | `-m` | Ollama model name | `llama3.2:3b` |
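
For readers who want to see how options like these are typically wired up, here is a minimal `argparse` sketch that mirrors the table. It is an illustration under the assumption that the script uses `argparse`; the actual parser in `lung_rads_extractor.py` may be written differently.

```python
import argparse


def build_parser():
    """Build a parser with the three options and defaults listed in the table."""
    parser = argparse.ArgumentParser(
        description="Extract Lung RADS scores from medical notes"
    )
    parser.add_argument("-i", "--input", default="data/notes_filtered.csv",
                        help="Input CSV file path")
    parser.add_argument("-o", "--output", default="data/lung_rads_extracted.csv",
                        help="Output CSV file path")
    parser.add_argument("-m", "--model", default="llama3.2:3b",
                        help="Ollama model name")
    return parser


# Short and long forms can be mixed; omitted options fall back to their defaults.
args = build_parser().parse_args(["-i", "data/my_notes.csv", "--model", "llama3.1:8b"])
print(args.input, args.output, args.model)
# prints: data/my_notes.csv data/lung_rads_extracted.csv llama3.1:8b
```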

🔄 Progress Saving

The script automatically saves progress every 20 rows, so if the process is interrupted, you won't lose all your work. The output file will contain all processed rows up to the point of interruption.
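
The idea behind periodic saving can be sketched as follows. This is not the script's actual code: the function names are hypothetical, `extract` stands in for the model call, and rewriting the whole output file at each checkpoint is an assumption made to keep the example short.

```python
import csv

SAVE_EVERY = 20  # matches the interval described above


def write_rows(path, fieldnames, rows):
    """Rewrite the output CSV with every row processed so far."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def process_with_checkpoints(rows, extract, output_path, fieldnames,
                             save_every=SAVE_EVERY):
    """Apply `extract` to each note and save the results every `save_every` rows."""
    done = []
    for index, row in enumerate(rows, start=1):
        result = dict(row)
        result["Lung Rad Score"] = extract(row["NOTE_CONTENTS"])
        done.append(result)
        if index % save_every == 0:
            write_rows(output_path, fieldnames, done)  # checkpoint
    write_rows(output_path, fieldnames, done)  # final save after the last row
    return done
```

With this layout, an interruption between checkpoints leaves the file as it was at the last checkpoint, so the output is never left without its earlier rows.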

❓ Troubleshooting

Problem: "Cannot connect to Ollama"

Solution: Make sure Ollama is running. Open a terminal and run:

ollama serve

Then run the script again in a different terminal window.
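
If you are unsure whether the server is up, a short standard-library check can tell you before you start a long run. The sketch assumes Ollama is listening on its default local address, `http://localhost:11434`; adjust `base_url` if your setup differs. `ollama_is_running` is a hypothetical helper.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=3.0):
    """Return True if a server answers with HTTP 200 at base_url, False otherwise."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_running())
```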

Problem: "Model 'llama3.2:3b' not found"

Solution: Download the model by running:

ollama pull llama3.2:3b
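
To see which models are already downloaded, you can ask the local Ollama server for its model list. The sketch below assumes the server's `/api/tags` endpoint returns JSON of the form `{"models": [{"name": ...}, ...]}` and that the server runs on the default port; both function names are hypothetical. Parsing is kept separate from the network call so it can be tried on a sample payload.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return [entry.get("name", "") for entry in tags_payload.get("models", [])]


def fetch_tags(base_url="http://localhost:11434", timeout=5.0):
    """Fetch the model list from a running Ollama server (raises if unreachable)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)


# Sample payload with the assumed shape; real responses carry more fields.
sample = {"models": [{"name": "llama3.2:3b"}, {"name": "mistral:latest"}]}
print("llama3.2:3b" in model_names(sample))
```

With the server running, `model_names(fetch_tags())` gives the names to compare against the value you pass to `--model`.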

Problem: "Input file not found"

Solution:

- Check that your input file exists at the specified path
- Make sure you're running the script from the correct directory
- Use the full path: `python lung_rads_extractor.py --input /full/path/to/your/file.csv`

Problem: "Required column 'NOTE_CONTENTS' not found"

Solution:

- Make sure your CSV file has a column named exactly `NOTE_CONTENTS`
- Check for typos or extra spaces in the column name
- The column name is case-sensitive
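
Stray spaces and wrong letter case are hard to spot by eye. The sketch below lists header names that would match `NOTE_CONTENTS` once trimmed and upper-cased, which points directly at the column that needs renaming. `find_near_matches` is a hypothetical helper, not part of the script.

```python
def find_near_matches(header, required="NOTE_CONTENTS"):
    """Return header names that match `required` only after trimming spaces or ignoring case."""
    return [
        name for name in header
        if name != required and name.strip().upper() == required
    ]


print(find_near_matches(["PATIENT_ID", " note_contents ", "NOTE_CONTENTS"]))
# prints: [' note_contents ']
```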

Problem: "ModuleNotFoundError: No module named 'ollama'"

Solution: Install the required packages:

pip install -r requirements.txt

Problem: Script runs very slowly

Solutions:

- Use a smaller model: `--model llama3.2:3b` (default, fastest)
- Make sure Ollama is running locally (not over a network)
- Close other applications to free up system resources

📞 Getting Help

If you encounter issues:

- Check the log files in `logs/lung_rads_extraction/` for detailed error messages
- Make sure all prerequisites are installed correctly
- Verify that your input CSV file format matches the requirements
- Check that Ollama is running and the model is downloaded

🙏 Acknowledgments

This tool uses Ollama for local AI processing and pandas for data handling.
