Codebase for LLM-based tools in the BeesUp project
- `LLMPipeline`: An easy-to-use pipeline for loading models, interacting with them, and streaming tokens in real time.
- `EMBPipeline`: The same, for embedding models.
- `RAGPipeline`: Enables Retrieval-Augmented Generation for enhanced context-aware answers.
- `FinetuningExperiment`: A streamlined setup for parameter-efficient fine-tuning of quantized LLMs on medium-sized GPUs (48 GB), producing reusable LoRA adapters.
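The core RAG flow can be pictured as two steps: rank knowledge chunks by embedding similarity to the query, then assemble the top hits into an augmented prompt. The sketch below is a toy illustration of that idea, not the `RAGPipeline` API — the function names and the 2-d "embeddings" are invented:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, k=2):
    # Rank knowledge chunks by similarity to the query embedding.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

def build_prompt(question, contexts):
    # Assemble the augmented prompt handed to the LLM.
    ctx = "\n".join(f"- {c['text']}" for c in contexts)
    return f"Context:\n{ctx}\n\nQuestion: {question}\nAnswer:"

# Toy corpus with 2-d embeddings (illustrative only).
chunks = [
    {"text": "Osmia bicornis nests in cavities.", "vec": [1.0, 0.1]},
    {"text": "Bumblebees forage in cool weather.", "vec": [0.1, 1.0]},
    {"text": "Mason bees fly in early spring.",   "vec": [0.9, 0.2]},
]
top = retrieve([1.0, 0.0], chunks, k=2)
prompt = build_prompt("When do mason bees fly?", top)
```

In the real pipeline the embeddings would come from the `EMBPipeline` model and the prompt would be passed to the `LLMPipeline` for generation.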
- Custom evaluation metrics that compare LLM-generated JSON outputs to ground-truth labels using fuzzy string matching, convert the results into standard precision, recall, and F1 scores, and finally compute an overall extraction score (S_extract).
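The idea behind these metrics can be illustrated with `difflib` from the standard library: fuzzy-match predicted field values against gold values, then derive precision, recall, and F1. This is only a sketch — the similarity threshold, the greedy matching, and the example values are assumptions, not the project's actual matching rules:

```python
from difflib import SequenceMatcher

def fuzzy_equal(a, b, threshold=0.85):
    # Two strings count as a match if their similarity ratio clears the threshold.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def prf1(predicted, gold, threshold=0.85):
    # Greedy one-to-one fuzzy matching between predicted and gold values.
    unmatched = list(gold)
    tp = 0
    for p in predicted:
        for g in unmatched:
            if fuzzy_equal(p, g, threshold):
                unmatched.remove(g)
                tp += 1
                break
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

pred = ["Osmia bicornis", "Andrena fulva", "unknown bee"]
gold = ["Osmia bicornis", "Andrena fulva"]
p, r, f = prf1(pred, gold)  # one spurious prediction lowers precision
```

An overall score like S_extract would then aggregate such per-field F1 values across the extracted JSON schema.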
- `ExtractionPipeline`: Takes one or more report passages as input and returns structured data on the wild-bee observations. If labeled data is provided, automatic evaluation is included.
- `ExtractionExperiment`: Handles fine-tuning and evaluation of an LLM on labeled extraction datasets.
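A recurring implementation detail in such an extraction pipeline is recovering the JSON object from the model's raw reply, which often carries surrounding prose or markdown code fences. A defensive parse might look like the sketch below; the reply text and the schema fields (`species`, `count`) are invented for illustration:

```python
import json
import re

def extract_json(raw: str):
    # Strip markdown code fences the model may have wrapped around the JSON.
    raw = re.sub(r"`{3}(?:json)?", "", raw)
    # Take the outermost {...} span and parse it.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    return json.loads(raw[start:end + 1])

fence = "`" * 3  # literal ``` for the example reply
reply = f'Sure! Here is the result:\n{fence}json\n{{"species": "Osmia bicornis", "count": 3}}\n{fence}'
obs = extract_json(reply)
```

The parsed dictionary can then be scored against ground-truth labels with the fuzzy metrics described above.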
- `Taxomizer`: Automatically organizes unstructured knowledge chunks into a clear taxonomy (table of contents or tree structure) using hierarchical clustering and smart tree-cutting methods, while generating meaningful headers for each cluster.
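The clustering step can be sketched as naive single-linkage agglomeration over embedding distances, cut at a distance threshold — merging stops once the closest remaining pair is farther apart than the cut height. Everything below is a toy illustration; the real `Taxomizer` presumably uses more sophisticated tree-cutting:

```python
import math

def agglomerate(points, cut=0.5):
    # Start with singleton clusters; repeatedly merge the closest pair
    # (single linkage) until the closest pair is farther apart than `cut`.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut:  # cutting the dendrogram at this height
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Toy 2-d "embeddings" of four knowledge chunks: two natural groups.
points = [(0.0, 0.0), (0.1, 0.0), (2.0, 2.0), (2.1, 2.0)]
groups = agglomerate(points, cut=0.5)
```

Each resulting cluster would then be handed to the LLM to generate a meaningful header, yielding one level of the taxonomy.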
- Evaluator classes `MCQEvaluator`, `QDQEvaluator`, and `FFQEvaluator`: Used to assess an LLM's knowledge and factual accuracy before and after knowledge injection.
- `InjectionExperiment`: Injects knowledge into the LLM and tracks improvements over epochs with detailed evaluations.
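At its core, multiple-choice evaluation reduces to extracting the model's chosen option letter from a free-form answer and comparing it against the gold answer. The sketch below shows that accuracy bookkeeping for a before/after-injection comparison; the letter-extraction rule and the example answers are assumptions, not the `MCQEvaluator` implementation:

```python
import re

def first_choice(answer: str) -> str:
    # Pull the first standalone option letter (A-D) out of a free-form answer.
    m = re.search(r"\b([A-D])\b", answer.upper())
    return m.group(1) if m else ""

def mcq_accuracy(answers, gold):
    # Fraction of questions where the extracted letter matches the gold letter.
    hits = sum(first_choice(a) == g for a, g in zip(answers, gold))
    return hits / len(gold)

gold = ["B", "A", "C"]
before = ["I think B.", "Probably C", "C"]            # pre-injection answers
after = ["B", "The answer is A.", "C, because..."]    # post-injection answers
acc_before = mcq_accuracy(before, gold)
acc_after = mcq_accuracy(after, gold)
```

Tracking such scores per epoch is what lets an `InjectionExperiment` show whether the injected knowledge actually sticks.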
🐍 Python Environment: Specified in the provided `beesup_environment.yaml`. Note that a second environment, `beesup_nvembed_environment.yaml`, is necessary to run nvidia/NV-Embed-v2; experiments will not work in that environment.
🐳 Docker Image: `shafi.tu-ilmenau.de:30500/project-conda:latest`
- Clone this repository:

  ```shell
  git clone https://gitlab.tu-ilmenau.de/mase4201/beesup-llm-framework.git
  ```

- Set up the Conda environment:

  ```shell
  conda env create -f beesup_environment.yaml
  conda activate beesup
  ```

- Install:

  ```shell
  cd beesup_llm
  pip install -e .
  ```