Language Models Uncertainty Quantification (LUQ)


Get Started

Install LUQ:

pip install luq

Use a LUQ estimator for uncertainty quantification:

import luq
from luq.models import MaxProbabilityEstimator
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Create a text-generation pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Sample multiple responses from the LLM
samples = luq.llm.generate_n_samples_and_answer(
    generator,
    prompt="A, B, C, or D"
)

# Estimate uncertainty from the sampled responses
mp_estimator = MaxProbabilityEstimator()
print(mp_estimator.estimate_uncertainty(samples))
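
The other estimators listed under Uncertainty Quantification Methods below live in luq.models and are assumed here to share the same estimate_uncertainty interface. A minimal sketch of swapping estimators; the class name is hypothetical, so check luq.models.predictive_entropy for the actual export:

from luq.models import PredictiveEntropyEstimator  # hypothetical class name

pe_estimator = PredictiveEntropyEstimator()
print(pe_estimator.estimate_uncertainty(samples))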

Tutorials

- Introductory Tutorial: getting started with LUQ (open in Colab)
- Working with LUQ Datasets (open in Colab)
- Using Predictive Entropy (open in Colab)

Uncertainty Quantification Methods

Generally, the uncertainty quantification methods in LUQ sample multiple responses from an LLM and analyse the distribution of the sampled answers and their probabilities:

| Method | Class in LUQ | Note | Reference |
| --- | --- | --- | --- |
| Max Probability | luq.models.max_probability | Estimates uncertainty as one minus the probability of the most likely sequence among the samples. | - |
| Top-K Gap | luq.models.top_k_gap | Estimates uncertainty as the gap between the probability of the most probable sequence and the k-th most probable one. | - |
| Predictive Entropy | luq.models.predictive_entropy | Estimates uncertainty as the entropy of the probabilities of the sampled sequences. | https://arxiv.org/abs/2002.07650 |
| p(true) | luq.models.p_true | Estimates uncertainty from the probability the model assigns to its own answer being true (P(True)). | https://arxiv.org/abs/2207.05221 |
| Semantic Entropy | luq.models.semantic_entropy | Estimates uncertainty by clustering LLM responses by meaning and computing the entropy across the semantic clusters. | https://arxiv.org/abs/2302.09664 |
| Kernel Language Entropy | luq.models.kernel_language_entropy | Estimates uncertainty as the von Neumann entropy of a semantic kernel built over LLM responses, a soft generalization of semantic entropy. | https://arxiv.org/abs/2405.20003 |
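
To make the first rows of the table concrete, here is a minimal sketch of the max-probability and predictive-entropy ideas, assuming each sample comes with a sequence log-probability. This illustrates the math only, not LUQ's internal implementation:

import numpy as np

def max_probability_uncertainty(log_probs: np.ndarray) -> float:
    # Normalize sequence log-probabilities into a distribution over samples
    probs = np.exp(log_probs - log_probs.max())
    probs /= probs.sum()
    # One minus the probability of the most likely sampled sequence
    return float(1.0 - probs.max())

def predictive_entropy(log_probs: np.ndarray) -> float:
    probs = np.exp(log_probs - log_probs.max())
    probs /= probs.sum()
    # Entropy of the normalized distribution over sampled sequences
    return float(-np.sum(probs * np.log(probs)))

# Example: five samples with made-up log-probabilities
log_probs = np.array([-1.2, -1.5, -2.0, -3.1, -3.3])
print(max_probability_uncertainty(log_probs), predictive_entropy(log_probs))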

Contributing

We appreciate all contributions. To learn more, please check CONTRIBUTING.md.

Use pre-commit:

pip install pre-commit
pre-commit install

Pipeline for dataset creation

Step 1. Create a processed version of a dataset:

mkdir data/coqa
python scripts/process_datasets.py \
    --dataset=coqa \
    --output=data/coqa/processed.json

Optionally, cut a small slice of the processed file for quick end-to-end runs:

import json

# Keep only the first two train and validation examples
data = json.load(open("data/coqa/processed.json", "r"))
new_data = {"train": data["train"][:2], "validation": data["validation"][:2]}
json.dump(new_data, open("data/coqa/processed_short.json", "w"))

Step 2. Generate answers from LLMs and augment the dataset with the generations:

python scripts/add_generations_to_dataset.py \
    --input-file=./data/coqa/processed_short.json \
    --output-file=./data/coqa/processed_gen_short.json
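
To sanity-check the result, it helps to inspect one augmented record. The top-level train/validation keys follow from Step 1; the per-record fields depend on the script's output schema:

import json

# Print the first augmented training record; field names depend on
# add_generations_to_dataset.py's output schema
data = json.load(open("data/coqa/processed_gen_short.json", "r"))
print(data["train"][0])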

Step 3. Evaluate the accuracy of the generated answers:

python scripts/eval_accuracy.py \
    --input-file=data/coqa/processed_gen_short.json \
    --output-file=data/coqa/processed_gen_acc_short.json \
    --model-name=gpt2 \
    --model-type=huggingface
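
A quick aggregate check over the evaluated file; the "accuracy" field name here is hypothetical, so verify it against eval_accuracy.py's actual output schema:

import json

# "accuracy" is a hypothetical per-record field; adjust to the real schema
data = json.load(open("data/coqa/processed_gen_acc_short.json", "r"))
scores = [rec["accuracy"] for rec in data["validation"]]
print(sum(scores) / len(scores))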

Step 4. Upload the dataset to Hugging Face:

python scripts/upload_dataset.py \
    --path=data/coqa/processed_gen_acc_short.json \
    --repo-id=your-username/dataset-name \
    --token=your-huggingface-token
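
Once uploaded, the dataset should be loadable with the datasets library, assuming upload_dataset.py pushes it in a format the Hub can auto-resolve:

from datasets import load_dataset

# Replace with the repo id used in the upload step
ds = load_dataset("your-username/dataset-name")
print(ds)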

Install from sources:

git clone https://github.com/AlexanderVNikitin/luq.git
cd luq
pip install -e .

Run tests:

python -m pytest

License

MIT
