This repository implements an end-to-end pipeline for claim extraction and triplet generation from speech transcripts, including model training, inference, and experiment management.
claim-analysis/
├── README.md # Project overview and setup
├── requirements.txt # Python dependencies
├── docker-compose.yml # Compose file for CPU-based work
├── docker-compose-gpu.yml # Compose file for GPU-based work
├── Dockerfile # Docker image for the experiment runner
├── run.sh # Detects available hardware and brings up the background utilities
├── .gitignore # Ignore data, outputs, checkpoints
│
├── db/ # Submodule: https://github.com/Fonzzy1/federal-hansard-db
├── data/ # Pipeline data
│ ├── clean.py # Script for preparing raw texts
│ ├── annotator.py # Script for adding gold-standard labels
│ └── annotations/ # Gold-standard labels (small files, versioned)
│ ├── train/ # Training set (80% of labels)
│ └── test/ # Test set (20% of labels)
│
├── models/ # Reusable ML code
│ ├── filter/ # Classes for the filter model
│ ├── extraction/ # Classes for extraction models
│ ├── deconstruction/ # Classes for deconstruction models
│ └── base_model.py # Base class shared by all model classes
│
├── pipelines/ # Orchestrator scripts
│ ├── filter/ # Train / inference for the filter
│ ├── extraction/ # Train / inference for extraction
│ ├── deconstruction/ # Train / inference for deconstruction
│ └── inferance.py # Full inference across the dataset, with evaluation
│
├── experiments/ # Training & analysis experiments
│ ├── configs/ # YAML configs for reproducibility
│ └── results/ # Each experiment gets a unique folder
│
├── analysis/ # Notebooks and scripts for inspection/visualization
│
└── dashboard/ # Additional GUI browsing tool
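
To illustrate how `experiments/configs/` and `experiments/results/` fit together, here is a sketch of what one versioned YAML config might look like. Every field name below is an assumption for illustration, not the repository's actual schema:

```yaml
# Hypothetical experiment config — all field names are illustrative assumptions.
experiment_name: extraction_baseline   # unique folder created under experiments/results/
stage: extraction                      # filter | extraction | deconstruction
data:
  train: data/annotations/train/       # 80% gold-standard split
  test: data/annotations/test/         # 20% gold-standard split
training:
  epochs: 5
  batch_size: 16
  learning_rate: 3.0e-5
  seed: 42                             # fixed seed for reproducibility
```

Keeping a config like this under version control alongside each run is what makes the per-experiment results folders reproducible.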