This project demonstrates a simple, extensible pipeline for LLM-based classification using OpenRouter (via LiteLLM) and DSPy.
© 2025 auura inc. All rights reserved.
- Intro Example: Loads a CSV dataset of simple math addition questions (e.g., "What is 2 + 3?") and uses an LLM to answer and evaluate them. This serves as a minimal working example for LLM-based classification.
- Extensible Design: The same pipeline can be adapted to classify any other content, mapping inputs to classes or labels using LLMs.
- Uses OpenRouter LLMs (via LiteLLM) to answer/classify each input.
- Compares the model's answer to the reference/class label in the CSV.
- Prints results and calculates accuracy.
- `data/`: Contains datasets, e.g., `math_addition_questions.csv` (math example), `html_content_classification.csv` (HTML classification), `md_content_classification.csv` (Markdown classification), etc.
- `src/main.py`: Loads the dataset, queries the LLM, and evaluates results. Supports math, website, HTML, and Markdown classification.
- `src/generate_math_dataset.py`: Script to generate a CSV of random math addition questions (intro example).
- `src/generate_html_content_classification_csv.py`: Converts HTML `.txt` files in `data/html/` to a classification CSV for LLMs.
- `src/html2md.py`: Converts HTML `.txt` files in a directory to Markdown `.md` files in another directory.
- `src/generate_md_content_classification_csv.py`: Converts Markdown `.md` files in `data/md/` to a classification CSV for LLMs.
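For orientation, here is a minimal sketch of the answer-and-evaluate loop in `src/main.py`. It is illustrative only: the helper name `ask_openrouter()` and the `question`/`answer` CSV columns come from the notes later in this README, while the default model string and the exact prompt handling are assumptions.

```python
# Sketch of the answer-and-evaluate loop (the real src/main.py may differ).
import csv
import os

from litellm import completion  # LiteLLM routes the request to OpenRouter


def ask_openrouter(question: str, model: str = "openrouter/openai/gpt-4o-mini") -> str:
    """Send a single question to an OpenRouter-hosted model via LiteLLM."""
    response = completion(
        model=model,  # model string is an assumption; see customization notes below
        messages=[{"role": "user", "content": question}],
        api_key=os.environ["OPENROUTER_API_KEY"],
    )
    return response.choices[0].message.content.strip()


def evaluate(csv_path: str) -> float:
    """Compare model answers against the reference labels in the CSV."""
    correct = total = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            answer = ask_openrouter(row["question"])
            is_correct = answer == row["answer"]
            correct += is_correct
            total += 1
            print(f"Q: {row['question']}\nAnswer: {answer}\n"
                  f"Reference: {row['answer']}\nCorrect: {is_correct}\n---")
    accuracy = correct / total if total else 0.0
    print(f"Accuracy: {correct}/{total} = {accuracy:.2f}")
    return accuracy


if __name__ == "__main__":
    evaluate("data/math_addition_questions.csv")
```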
- Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Create a `.env` file in the project root with your OpenRouter API key:
OPENROUTER_API_KEY=your_api_key_here
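How the key is picked up is an implementation detail; assuming the project uses python-dotenv (a common choice, not confirmed by this README), loading it looks like:

```python
import os

from dotenv import load_dotenv  # provided by the python-dotenv package

load_dotenv()  # reads .env from the project root into the process environment
api_key = os.environ["OPENROUTER_API_KEY"]
```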
- Build and run the app using Docker Compose:
docker-compose up --build
This will build the image and run the main script. To run a different task, edit the `command:` in `docker-compose.yml` or override it:
docker-compose run --rm dspy-app python3 src/main.py --task math
- Pass your OpenRouter API key as an environment variable (recommended):
  - Create a `.env` file in the project root with:
    OPENROUTER_API_KEY=your_api_key_here
  - Or export it in your shell before running Docker Compose:
    export OPENROUTER_API_KEY=your_api_key_here
    docker-compose up
- (Optional) Generate the math addition dataset (intro example):
python src/generate_math_dataset.py
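Generating such a dataset takes only a few lines. The following is a hedged sketch of what `src/generate_math_dataset.py` might do; the actual operand ranges, row count, and output path may differ.

```python
import csv
import random

# Sketch only: write random addition questions with reference answers
# using the question/answer column format expected by the main script.
rows = []
for _ in range(10):
    a, b = random.randint(0, 20), random.randint(0, 20)
    rows.append({"question": f"What is {a} + {b}?", "answer": str(a + b)})

with open("data/math_addition_questions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(rows)
```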
- Run the main script to answer/classify and evaluate:
python src/main.py
By default, this runs the math example. To classify other types of data, pass a different task argument to the parser when calling the script (e.g., `--task`, as in the Docker example above).
The script will print each question/input, the model's answer/classification, the reference/class label, and whether it was correct. At the end, it prints the overall accuracy.
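The `--task math` override in the Docker section suggests an argparse-style flag for task selection. A sketch of such a parser is below; the exact task names are assumptions based on the supported classification types (math, website, HTML, Markdown).

```python
import argparse

parser = argparse.ArgumentParser(description="LLM-based classification pipeline")
parser.add_argument(
    "--task",
    choices=["math", "website", "html", "md"],  # assumed task names
    default="math",
    help="Which dataset/classification task to run (default: math)",
)
args = parser.parse_args()
print(f"Running task: {args.task}")
```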
- (Optional) Generate HTML or Markdown classification datasets:
  - To convert HTML `.txt` files to a CSV:
    python src/generate_html_content_classification_csv.py
  - To convert HTML `.txt` files to Markdown (a sketch of this step follows this list):
    python src/html2md.py html md
  - To convert Markdown `.md` files to a CSV:
    python src/generate_md_content_classification_csv.py
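A rough sketch of the HTML-to-Markdown step is shown below. The real `src/html2md.py` may use a different converter; `markdownify` is just one common choice and is an assumption here.

```python
import sys
from pathlib import Path

from markdownify import markdownify  # assumption: any HTML->Markdown converter works


def convert_dir(src_dir: str, dst_dir: str) -> None:
    """Convert every .txt file of HTML in src_dir to a .md file in dst_dir."""
    out = Path(dst_dir)
    out.mkdir(parents=True, exist_ok=True)
    for txt_file in Path(src_dir).glob("*.txt"):
        markdown = markdownify(txt_file.read_text(encoding="utf-8"))
        (out / f"{txt_file.stem}.md").write_text(markdown, encoding="utf-8")


if __name__ == "__main__":
    convert_dir(sys.argv[1], sys.argv[2])  # e.g. python src/html2md.py html md
```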
- (Optional) Count the number of rows in the Markdown classification CSV:
python src/count_md_rows.py
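The counting itself is a one-liner; this is a guess at what `src/count_md_rows.py` does, with the CSV path taken from the `data/` listing above.

```python
import csv

# Count data rows (header excluded) in the Markdown classification CSV.
with open("data/md_content_classification.csv", newline="", encoding="utf-8") as f:
    print(sum(1 for _ in csv.DictReader(f)))
```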
- To use a different model, edit the `model` argument in `ask_openrouter()` in `src/main.py` or the `model_name` parameter.
- To use your own dataset, place a CSV in the `data/` folder with columns: `question`, `answer`, and optionally `context` (a minimal loading sketch follows this list).
- To classify HTML or Markdown content, use the provided scripts to generate the appropriate CSVs, then run the main script with the corresponding function.
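If you bring your own dataset, loading it can look like the sketch below. The column names come from the note above; folding `context` into the prompt is an assumption about how the main script uses it.

```python
import csv


def load_dataset(csv_path: str):
    """Yield (prompt, reference) pairs, folding optional context into the prompt."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            prompt = row["question"]
            if row.get("context"):  # the context column is optional
                prompt = f"Context:\n{row['context']}\n\nQuestion: {prompt}"
            yield prompt, row["answer"]
```

Running the math example prints output like the following: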
Q: What is 2 + 3?
OpenRouter Answer: 5
Reference: 5
Correct: True
---
...
Accuracy: 9/10 = 0.90
- Add more complex question types or classification datasets.
- Integrate DSPy for prompt optimization or advanced pipelines.
- Implement more sophisticated evaluation metrics.