SILVIA is a modular Retrieval-Augmented Generation (RAG) system designed to build domain-specific chatbots using PDF documents as external knowledge sources.
To launch SILVIA, run `python SILVIA.py` and follow the CLI instructions.
You can configure the RAG chatbot via `config/parameters.yaml`.
By default, this configuration builds a RAG system using the PDF documents stored in `data/Accurate`.
Configuration folder containing parameters for a test RAG named "Accurate".
This folder contains all documents and derived artifacts used for prompt augmentation.
At the moment, it includes:
- AI Act documentation, taken from https://www.aiact-info.eu/full-text-and-pdf-download/
This folder will also contain:

- Chunk database (`chunks.json`)
  - Currently stored as `.json`
  - Format example:

    ```json
    [
      {
        "chunk_id": "AI_act.pdf_p1_c0",
        "text": "<chunk text>",
        "metadata": { "file": "AI_act.pdf", "page": 1 }
      }
    ]
    ```
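For illustration, a chunk database in this format could be loaded and indexed by `chunk_id` like so (`load_chunks` and `chunks_by_id` are hypothetical helpers, not part of SILVIA):

```python
import json

def load_chunks(path: str) -> dict:
    """Load chunks.json and index its records by chunk_id."""
    with open(path, encoding="utf-8") as f:
        return chunks_by_id(json.load(f))

def chunks_by_id(records: list) -> dict:
    # Map each chunk_id to its full record for O(1) lookup
    return {rec["chunk_id"]: rec for rec in records}
```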
- FAISS index (`faiss_index.idx`)
  - Stores vector embeddings for similarity search
Stores outputs produced during pipeline testing.
Includes:

- Query results
  - Retrieved chunks for each query, along with similarity scores
- Augmented prompts
  - Prompts resulting from chunk retrieval and prompt assembly
- Images
  - Histograms of:
    - Chunk length distribution for the tested database
    - Query distance distribution from retrieved entries
Utilities for creating a chunk index from a collection of PDF files.
Includes:
- `pdf_loader.py`
  Searches for all `.pdf` files and loads their content.
- `text_cleaner.py`
  Applies the following cleaning steps:
  - Replace null characters (`\x00`) with spaces
  - Collapse consecutive spaces and tabs into a single space
  - Reduce sequences of 3+ newlines to at most 2
  - Strip leading and trailing spaces and newlines
  - Return the cleaned text
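A minimal sketch of these cleaning steps (illustrative only, not the exact `text_cleaner.py` implementation):

```python
import re

def clean_text(text: str) -> str:
    # Replace null characters with spaces
    text = text.replace("\x00", " ")
    # Collapse consecutive spaces and tabs into a single space
    text = re.sub(r"[ \t]+", " ", text)
    # Reduce sequences of 3+ newlines to at most 2
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Strip leading and trailing spaces and newlines
    return text.strip()
```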
- `chunker.py`
  Implements a simple chunking strategy:
  - Fixed-length chunks
  - Fixed overlap between consecutive chunks

  Chunking options (defined in the config file): `chunk_size`, `overlap`
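A minimal sketch of this fixed-length, fixed-overlap strategy (illustrative; the actual `chunker.py` may differ):

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list:
    """Split text into fixed-length chunks with a fixed overlap."""
    # Step between consecutive chunk starts
    step = chunk_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```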
- `build_index.py`
  Manages creation of the chunk database (`chunks.json`)
Utilities for embedding creation and vector database management.
Includes:
- `build_db.py`
  Creates a FAISS index (`data/<RAG_name>/faiss_index.idx`) and an associated metadata file
- `query_db.py`
  Implements the `FAISSRetriever` class, which:
  - Handles similarity queries
  - Can store retrieved chunks and scores during testing
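For illustration only, the retrieval step can be approximated with plain NumPy cosine similarity (the real `FAISSRetriever` delegates this to the FAISS index):

```python
import numpy as np

def retrieve(query_vec: np.ndarray, index: np.ndarray, k: int = 3):
    """Return indices and cosine-similarity scores of the k nearest chunks.

    `index` holds one embedding per row; normalising both sides makes the
    dot product equal to cosine similarity.
    """
    q = query_vec / np.linalg.norm(query_vec)
    m = index / np.linalg.norm(index, axis=1, keepdims=True)
    scores = m @ q
    top = np.argsort(-scores)[:k]
    return top, scores[top]
```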
Utilities for assembling the final prompt provided to the language model.
Wrappers around Large Language Models.
Currently implemented:
- OpenAI models
General-purpose utility functions.
`RAG_name`: Unique identifier for the RAG setup.
IMPORTANT: `RAG_name` must match the name of the folder inside `data/` containing the documents used for prompt augmentation.
`layout`:
- `true`: preserve original PDF layout
- `false`: extract text linearly
`chunk_size`: Approximate number of characters per chunk.
`overlap`: Number of overlapping characters between consecutive chunks.
`model`: Embedding model. Supported models:
- `BAAI/bge-base-en-v1.5`
- `all-MiniLM-L6-v2`
`similarity`: Similarity metric:
- `CosSim` → cosine similarity
- `L2` → Euclidean distance
`retrieval_size`: Number of chunks retrieved for prompt augmentation.
All prompt fields are inserted verbatim into the final prompt passed to the LLM.
`tone`: Tone of the assistant's responses (e.g. formal, friendly, natural).
`verbosity`: Level of detail in responses (e.g. concise, detailed).
`preambolo`: System-level instructions prepended to every query.
`citation_style`: Citation format for retrieved chunks (e.g. `[chunk_3]`).
`api_key`: OpenAI API key (required to use online models).
`model`: LLM model (e.g. `gpt-4o-mini`, `gpt-4.1`, `gpt-3.5-turbo`).
`temperature`: Controls randomness in generation.
`max_tokens`: Maximum number of tokens in the output.
`top_p`: Nucleus sampling parameter.
`frequency_penalty`: Penalizes repeated tokens.
`presence_penalty`: Encourages new content.
`timeout`: Maximum wait time (seconds) for the API response.
`verbose`: Enable verbose logging for debugging and monitoring.
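Putting the parameters above together, a `config/parameters.yaml` might look like the sketch below. The grouping into sections and all example values are illustrative assumptions; check the shipped file for the exact schema:

```yaml
RAG_name: Accurate

layout: true

chunk_size: 1000
overlap: 200

model: BAAI/bge-base-en-v1.5
similarity: CosSim
retrieval_size: 5

tone: formal
verbosity: concise
preambolo: "You are an assistant answering questions about the AI Act."
citation_style: "[chunk_3]"

api_key: <YOUR_OPENAI_API_KEY>
llm_model: gpt-4o-mini
temperature: 0.2
max_tokens: 512
top_p: 1.0
frequency_penalty: 0.0
presence_penalty: 0.0
timeout: 30
verbose: false
```

Note that `llm_model` is a placeholder name used here to avoid clashing with the embedding `model` key; the real file may nest these under separate sections instead.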