Fedora Docs RAG Database

A pre-built RAG (Retrieval-Augmented Generation) database of Fedora documentation, ready for use with local AI assistants.

What is this?

This repository provides a PostgreSQL database dump of vectorized Fedora documentation, suitable for semantic search and RAG-powered Q&A. The dump is built with docs2db.

Features

  • 🚀 Ready to use - Download the dump, restore it, and start querying
  • 📚 Comprehensive - Includes Quick Docs, Sysadmin Guide, CoreOS, Silverblue, and more
  • 🔄 Regularly updated - Rebuilt when upstream documentation changes
  • 🔓 Open source - Database content under CC-BY-SA 4.0, like the Fedora documentation it derives from; build scripts under Apache 2.0

Quick Start

1. Download the database dump

# Download the latest release
curl -LO https://github.com/Lifto/FedoraDocsRAG/releases/latest/download/fedora-docs.sql

2. Restore and query

# Start PostgreSQL (launched via Podman automatically), then restore the dump
uvx docs2db db-start
uvx docs2db db-restore fedora-docs.sql

# Query the database
uvx docs2db-api query "How do I install packages on Fedora?"
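
For scripting, the query command above can be wrapped in a small Python helper. This is a minimal sketch, assuming uvx is on PATH and the dump has already been restored. The function names build_query_command and ask are illustrative, not part of docs2db-api, and because the output format of the query command is not described here, the helper returns raw stdout without parsing it.

```python
import subprocess


def build_query_command(question: str) -> list[str]:
    """Return the argv for the query command shown in the Quick Start."""
    if not question.strip():
        raise ValueError("question must not be empty")
    # Passing the question as its own list element avoids shell quoting issues.
    return ["uvx", "docs2db-api", "query", question]


def ask(question: str) -> str:
    """Run the query and return whatever the CLI prints to stdout."""
    result = subprocess.run(
        build_query_command(question),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Calling ask("How do I install packages on Fedora?") would then run the same command as the Quick Start and return its output as a string.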

Building from Source

If you want to build the database yourself:

Prerequisites

  • Python 3.12
  • uv
  • Docker or Podman
  • Git
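
A quick way to confirm these prerequisites before building is to look each tool up on PATH. The sketch below is not part of this repository. It assumes the executables are named uv, git, podman, and docker, and it treats Python 3.12 as a minimum version, which is an assumption since the list above does not say whether newer versions are supported.

```python
import shutil
import sys
from typing import Callable, Optional

# Executable names assumed for the tools in the prerequisites list.
REQUIRED_TOOLS = ["uv", "git"]
CONTAINER_RUNTIMES = ["podman", "docker"]  # either one is enough


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    python_version: tuple[int, int] = sys.version_info[:2],
) -> list[str]:
    """Return a list of prerequisites that could not be found."""
    problems = []
    # Assumption: 3.12 is treated as the minimum supported version.
    if python_version < (3, 12):
        problems.append("Python 3.12")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(tool)
    # Only one container runtime needs to be installed.
    if not any(which(runtime) is not None for runtime in CONTAINER_RUNTIMES):
        problems.append("Docker or Podman")
    return problems
```

An empty list from missing_prerequisites() means every listed tool was found. The which and python_version parameters exist only so the check can be exercised without touching the real environment.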

Build

# Clone this repository
git clone https://github.com/Lifto/FedoraDocsRAG.git
cd FedoraDocsRAG

# Install dependencies and build
uv sync
uv run python build.py

The build script will:

  1. Clone all Fedora documentation repositories
  2. Build them with Antora (in a container)
  3. Ingest, chunk, and embed using docs2db
  4. Create a database dump in dist/fedora-docs.sql
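
The steps above form a sequential pipeline in which each stage depends on the one before it. The sketch below illustrates that control flow only. It is not the contents of build.py, the step callables are hypothetical placeholders, and stopping at the first failure is an assumption about how such a build would reasonably behave.

```python
from typing import Callable

# Step names mirror the list above; the actions paired with them are
# hypothetical placeholders, not the real build logic.
STEP_NAMES = [
    "clone documentation repositories",
    "build with Antora",
    "ingest, chunk, and embed with docs2db",
    "create dump at dist/fedora-docs.sql",
]


def run_pipeline(steps: list[tuple[str, Callable[[], None]]]) -> list[str]:
    """Run steps in order and return the names of those that completed.

    Stops at the first failing step, since later steps need its output.
    """
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            position = len(completed) + 1
            raise RuntimeError(f"build stopped at step {position} ({name})") from exc
        completed.append(name)
    return completed
```

Pairing each entry of STEP_NAMES with a real callable and passing the list to run_pipeline would execute the stages in the documented order and report which stage failed.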

Documentation Sources

This database includes documentation from:

Source            Description
Quick Docs        Common tasks and tutorials
Sysadmin Guide    Server administration
Release Notes     Version-specific changes
CoreOS            Container-focused OS
Silverblue        Immutable desktop
IoT               Internet of Things
And more...       See build.py for full list

License

Database Content (CC-BY-SA 4.0)

The database dump containing Fedora documentation is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

This is a derivative work of Fedora Documentation, which is licensed under CC-BY-SA by the Fedora Project.

Build Scripts (Apache 2.0)

The build scripts and tooling in this repository are licensed under the Apache License 2.0.

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request
