A lightweight research toolkit for multimodal classification using GPT (text/image/both) and CLIP (zero-shot or fine-tuned). Built for quick, reproducible experiments and paper appendices.
- Try the Colab Demo — Run the full tutorial in your browser
- Case Studies — Complete replication scripts in the `Case_analysis/` folder
- GitHub Repository
LabelGenius helps you:
- Label with GPT — Classify text, images, or multimodal data using zero-shot or few-shot prompting
- Classify with CLIP — Run zero-shot classification or fine-tune a lightweight model
- Evaluate results — Get accuracy, F1 scores, and confusion matrices automatically
- Estimate costs — Calculate API expenses before running large-scale experiments
From PyPI (recommended):

```bash
pip install labelgenius
```

Pinned version:

```bash
pip install labelgenius==0.1.6
```

From source:

```bash
git clone https://github.com/mediaccs/LabelGenius
cd labelgenius
pip install -e .
```

Requirements: Python 3.9+. Dependencies are installed automatically from pyproject.toml; see that file for the full list if you need to install them manually. For GPT features, get an API key from OpenAI.
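The GPT functions call the OpenAI API. Assuming the standard OpenAI SDK key lookup (an assumption; this README does not specify how the key is read), you can provide your key via the `OPENAI_API_KEY` environment variable before running any GPT example:

```python
import os

# Assumes the OpenAI SDK's default lookup of OPENAI_API_KEY;
# set this before calling any GPT-backed function.
os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your own key
```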
GPT functions:

| Function | Purpose |
|---|---|
| `classification_GPT` | Classify text, images, or multimodal data (zero-shot or few-shot) |
| `generate_GPT_finetune_jsonl` | Prepare training data for fine-tuning |
| `finetune_GPT` | Fine-tune a GPT model |
CLIP functions:

| Function | Purpose |
|---|---|
| `classification_CLIP_0_shot` | Zero-shot classification with CLIP |
| `classification_CLIP_finetuned` | Use a fine-tuned CLIP model |
| `finetune_CLIP` | Fine-tune CLIP on your dataset |
Utility functions:

| Function | Purpose |
|---|---|
| `auto_verification` | Calculate classification metrics |
| `price_estimation` | Estimate OpenAI API costs |
```python
from labelgenius import (
    classification_GPT,
    classification_CLIP_0_shot,
    finetune_CLIP,
    classification_CLIP_finetuned,
    auto_verification,
    price_estimation,
)
```

Supported input formats:

- Text only: CSV/Excel/JSON with your text columns
- Images only: Directory of images (with optional ID mapping table)
- Multimodal: Table with an `image_id` column + images at `image_dir/<image_id>.jpg`
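For instance, a minimal multimodal table might look like this (a sketch; `image_id` is the required column, the other column names are up to you):

```python
import pandas as pd

# One row per item; image_id maps to data/product_images/<image_id>.jpg
df = pd.DataFrame({
    "image_id": ["p001", "p002"],
    "product_name": ["Wireless mouse", "Denim jacket"],
    "description": ["2.4 GHz optical mouse", "Classic blue denim"],
})
df.to_csv("data/products.csv", index=False)
```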
LabelGenius supports both GPT-4.1 and GPT-5 with different control mechanisms:
With GPT-4.1, control randomness via `temperature` (0 = deterministic, 2 = creative):
```python
df = classification_GPT(
    text_path="data/headlines.xlsx",
    category=["sports", "politics", "tech"],
    prompt="Classify this headline into one category.",
    column_4_labeling=["headline"],
    model="gpt-4.1",
    temperature=0.7,
    mode="text",
    output_column_name="predicted_category"
)
```

GPT-5 supports both reasoning and non-reasoning modes, controlled by `reasoning_effort`:
```python
df = classification_GPT(
    text_path="data/headlines.xlsx",
    category=["sports", "politics", "tech"],
    prompt="Classify this headline into one category.",
    column_4_labeling=["headline"],
    model="gpt-5",
    reasoning_effort="medium",  # minimal | low | medium | high
    mode="text",
    output_column_name="predicted_category"
)
```

Reasoning modes:

- `"minimal"` — Non-reasoning mode (fast, standard responses)
- `"low"` / `"medium"` / `"high"` — Reasoning modes (increasing reasoning depth)
`classification_GPT` classifies text, images, or multimodal data using GPT models.
Parameters:
- `text_path` — Path to your data file (CSV/Excel/JSON)
- `category` — List of possible category labels
- `prompt` — Instruction prompt for classification
- `column_4_labeling` — Column(s) to use as input
- `model` — Model name (e.g., "gpt-4.1", "gpt-5")
- `temperature` — Randomness control for GPT-4.1 (0-2)
- `reasoning_effort` — Reasoning depth for GPT-5 ("minimal", "low", "medium", "high")
- `mode` — Input type: "text", "image", or "both"
- `output_column_name` — Name for the prediction column
- `num_themes` — Number of themes for multi-label tasks (default: 1)
- `num_votes` — Number of runs for majority voting (default: 1)
Example (Single-Label Text):
```python
df = classification_GPT(
    text_path="data/articles.csv",
    category=["positive", "negative", "neutral"],
    prompt="Classify the sentiment of this article.",
    column_4_labeling=["title", "content"],
    model="gpt-4.1",
    temperature=0.5,
    mode="text",
    output_column_name="sentiment"
)
```

Example (Multi-Label with Majority Voting):
prompt = """Label these 5 topics with 0 or 1. Return up to two 1s.
Answer format: [0,0,0,0,0]"""
df = classification_GPT(
text_path="data/posts.xlsx",
category=["0", "1"],
prompt=prompt,
column_4_labeling=["post_text"],
model="gpt-5",
reasoning_effort="low",
mode="text",
output_column_name="topic_labels",
num_themes=5,
num_votes=3 # Run 3 times and use majority vote
)Note: You can control the number of labels by adjusting the prompt. For example:
- "Return exactly one 1" — Single label
- "Return up to two 1s" — Up to 2 labels
- "Return up to three 1s" — Up to 3 labels
- "Return any number of 1s" — Unrestricted multi-label
Example (Multimodal):
```python
df = classification_GPT(
    text_path="data/products.csv",
    category=["electronics", "clothing", "food"],
    prompt="Classify this product based on text and image.",
    column_4_labeling=["product_name", "description"],
    model="gpt-4.1",
    temperature=0.3,
    mode="both",  # Uses both text and images
    image_dir="data/product_images",
    output_column_name="category"
)
```

`generate_GPT_finetune_jsonl` prepares training data in JSONL format for GPT fine-tuning.
Parameters:
- `df` — DataFrame containing your training data
- `output_path` — Where to save the JSONL file
- `system_prompt` — Instruction prompt for the model
- `input_col` — Column(s) to use as input
- `label_col` — Column(s) containing true labels
Example:
```python
generate_GPT_finetune_jsonl(
    df=training_data,
    output_path="finetune_data.jsonl",
    system_prompt="Classify the sentiment of this review.",
    input_col=["review_text"],
    label_col=["sentiment"]
)
```

`finetune_GPT` launches a GPT fine-tuning job using OpenAI's API.
Parameters:
- `training_file_path` — Path to your JSONL training file
- `model` — Base model to fine-tune (must be a snapshot model, i.e., a model name ending with a specific date, e.g., "gpt-4o-mini-2024-07-18")
- `hyperparameters` — Dict with batch_size, learning_rate_multiplier, etc.
Returns: Fine-tuned model identifier
Example:
```python
model_id = finetune_GPT(
    training_file_path="finetune_data.jsonl",
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "batch_size": 8,
        "learning_rate_multiplier": 0.01,
        "n_epochs": 3
    }
)
print(f"Fine-tuned model: {model_id}")
```

Note: GPT-5 fine-tuning availability varies; use GPT-4 snapshots if needed.
Using the fine-tuned model:
After fine-tuning completes, you'll receive a model ID via email or in the Python output. Use this ID with `classification_GPT` by passing it as the `model` parameter:
```python
# Use your fine-tuned model for classification
df = classification_GPT(
    text_path="data/test.csv",
    category=["positive", "negative", "neutral"],
    prompt="Classify the sentiment of this review.",
    column_4_labeling=["review_text"],
    model="ft:gpt-4o-mini-2024-07-18:your-org:model-name:abc123",  # Your fine-tuned model ID
    temperature=0.5,
    mode="text",
    output_column_name="sentiment"
)
```

`classification_CLIP_0_shot` performs zero-shot classification using CLIP, without any training.
Parameters:
- `text_path` — Path to your data file
- `img_dir` — Directory containing images (optional)
- `mode` — Input type: "text", "image", or "both"
- `prompt` — List of category names/descriptions for zero-shot matching
- `text_column` — Column(s) to use for text input
- `predict_column` — Name for the prediction column
Example (Text Only):
```python
df = classification_CLIP_0_shot(
    text_path="data/articles.csv",
    mode="text",
    prompt=["sports news", "political news", "technology news"],
    text_column=["headline", "summary"],
    predict_column="clip_category"
)
```

Example (Image Only):
```python
df = classification_CLIP_0_shot(
    text_path="data/image_ids.csv",
    img_dir="data/images",
    mode="image",
    prompt=["a photo of a cat", "a photo of a dog", "a photo of a bird"],
    predict_column="animal_type"
)
```

Example (Multimodal):
```python
df = classification_CLIP_0_shot(
    text_path="data/posts.csv",
    img_dir="data/post_images",
    mode="both",
    prompt=["advertisement", "personal photo", "news image"],
    text_column=["caption"],
    predict_column="image_category"
)
```

`finetune_CLIP` fine-tunes CLIP on your labeled dataset with a small classification head.
Parameters:
- `mode` — Input type: "text", "image", or "both"
- `text_path` — Path to training data
- `text_column` — Column(s) for text input (if using text)
- `img_dir` — Image directory (if using images)
- `true_label` — Column containing true labels
- `model_name` — Filename to save the trained model
- `num_epochs` — Number of training epochs
- `batch_size` — Training batch size
- `learning_rate` — Learning rate for the optimizer
Returns: Best validation accuracy achieved
Example:
```python
best_acc = finetune_CLIP(
    mode="both",
    text_path="data/train.csv",
    text_column=["headline", "description"],
    img_dir="data/train_images",
    true_label="category_id",
    model_name="my_clip_model.pth",
    num_epochs=10,
    batch_size=16,
    learning_rate=1e-5
)
print(f"Best validation accuracy: {best_acc:.2%}")
```

`classification_CLIP_finetuned` uses your fine-tuned CLIP model to classify new data.
Parameters:
- `mode` — Input type: "text", "image", or "both"
- `text_path` — Path to test data
- `img_dir` — Image directory (if using images)
- `model_name` — Path to your trained model file
- `text_column` — Column(s) for text input
- `predict_column` — Name for the prediction column
- `num_classes` — Number of categories in your task
Example:
```python
df = classification_CLIP_finetuned(
    mode="both",
    text_path="data/test.csv",
    img_dir="data/test_images",
    model_name="my_clip_model.pth",
    text_column=["headline", "description"],
    predict_column="predicted_category",
    num_classes=24
)
```

`auto_verification` calculates classification metrics, including accuracy, F1 scores, and a confusion matrix.
Parameters:
- `df` — DataFrame with predictions and true labels
- `predicted_cols` — List of prediction column names
- `true_cols` — List of true label column names
- `category` — List of possible categories
Example:
```python
auto_verification(
    df=results,
    predicted_cols=["gpt_pred", "clip_pred"],
    true_cols=["true_label", "true_label"],
    category=["0", "1"]
)
```

Output: Prints accuracy and F1 scores, and displays a confusion matrix visualization.
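For reference, these are standard metrics; here is a minimal sketch of computing them yourself with scikit-learn (an illustration, not auto_verification's internals; the exact averaging it uses may differ):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Compare one prediction column against the true labels by hand;
# labels are strings here, matching the category list above.
y_true = results["true_label"]
y_pred = results["gpt_pred"]
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))
print(confusion_matrix(y_true, y_pred, labels=["0", "1"]))
```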
`price_estimation` estimates the cost of OpenAI API calls.
Parameters:
- `response` — OpenAI API response object
- `num_rows` — Number of data rows processed
- `input_cost_per_million` — Input token cost (in USD per 1M tokens)
- `output_cost_per_million` — Output token cost (in USD per 1M tokens)
- `num_votes` — Number of API calls per row (for majority voting)
Returns: Estimated total cost in USD
Example:
```python
cost = price_estimation(
    response=api_response,
    num_rows=1000,
    input_cost_per_million=5.0,
    output_cost_per_million=15.0,
    num_votes=3
)
print(f"Estimated total cost: ${cost:.2f}")
```

Demo Scale:
The Colab demo uses ~20 samples per task for speed. Expect high variance in small-sample results.
Training Data Requirements:
Fine-tuning typically requires hundreds of examples per class. Small datasets (~20 examples) may lead to overfitting or majority-class predictions.
Image Requirements:
For multimodal classification, ensure every value in your `image_id` column matches an image filename in `image_dir` (i.e., `image_dir/<image_id>.jpg`).
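A quick pre-flight check can catch mismatches before you spend API calls (a sketch; assumes the `.jpg` layout described above):

```python
from pathlib import Path
import pandas as pd

# Flag rows whose image file is missing from image_dir
df = pd.read_csv("data/products.csv")
image_dir = Path("data/product_images")
missing = [i for i in df["image_id"] if not (image_dir / f"{i}.jpg").exists()]
print(f"{len(missing)} rows have no matching image:", missing[:10])
```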
Model Availability:
GPT-5 fine-tuning may not be available for all accounts. Use GPT-4 snapshots as alternatives.
We welcome contributions! Feel free to open an issue for bugs or feature requests, or submit a pull request to improve the codebase and documentation.
LabelGenius is released under the MIT License.