[FEAT]: Automated External Dataset (RAG) connection for instant lookup of protocols and rules #312

@Cubix33

📝 Description

Currently, the AI only knows what is spoken in the audio transcript. This feature gives the AI an "open-book test" capability using Retrieval-Augmented Generation (RAG).

If a transcript mentions a specific situation (like a "Class 3 Chemical Leak"), the AI will:

  1. Automatically search local fire department manuals or HAZMAT guidelines.
  2. Retrieve the exact safety protocol.
  3. Attach that official protocol directly to the final report's notes.
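The three steps above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the `notes` field name, the `PROTOCOLS` dict, and both function names are hypothetical, and plain substring matching stands in for the vector search proposed in this issue. The protocol text is a placeholder, not real guidance.

```python
# Hypothetical stand-in for the manual database. In the proposed feature
# this lookup would be a vector search, not a hard-coded dict.
PROTOCOLS = {
    "chemical leak": "PLACEHOLDER: chemical leak protocol text from the local manual",
}


def find_protocols(transcript: str, protocols: dict[str, str]) -> list[str]:
    """Return protocol texts whose trigger phrase appears in the transcript."""
    lowered = transcript.lower()
    return [text for phrase, text in protocols.items() if phrase in lowered]


def attach_protocols(report: dict, transcript: str, protocols: dict[str, str]) -> dict:
    """Return a copy of the report with matched protocols appended to its notes."""
    updated = dict(report)
    notes = list(updated.get("notes", []))  # assumed field name: "notes"
    notes.extend(find_protocols(transcript, protocols))
    updated["notes"] = notes
    return updated
```

Calling `attach_protocols({"incident": "plant fire"}, transcript, PROTOCOLS)` leaves the input report untouched and returns a new dict, which keeps the step easy to test in isolation.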

💡 Rationale

  • Instant Expert Knowledge: First responders cannot memorize every single hazardous material code or standard operating procedure (SOP). FireForm will act as an intelligent incident partner, not just a form filler.
  • Fewer Hallucinations: Grounding the AI in factual, local documents reduces the risk of it inventing safety rules, although retrieved text should still be verified against the source manual.
  • Easy Customization: Any fire department can adapt FireForm by adding its own PDF manuals to the vector database.

🛠️ Proposed solution

I will implement a lightweight RAG pipeline to give the LLM access to external knowledge.

  • Database Setup: Integrate a local vector database (like ChromaDB or FAISS) to store embeddings of safety manuals.
  • Context Retrieval (src/llm.py): Add a pre-processing step to scan the transcript, search the vector database, and grab the most relevant chunks of text.
  • Prompt Augmentation (src/llm.py): Update the build_prompt function to pass the retrieved chunks to the Mistral model as "Official Context" so it can ground its answers in them.
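A rough sketch of the retrieval and prompt steps is shown below, using only the standard library. Several things here are assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for ChromaDB or FAISS, and the `build_prompt` signature and prompt layout are hypothetical, since the existing function in src/llm.py is not shown in this issue.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(transcript: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return up to k chunks most similar to the transcript, best first."""
    query = embed(transcript)
    scored = [(cosine(query, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no overlap so unrelated text never reaches the prompt.
    return [chunk for score, chunk in scored[:k] if score > 0]


def build_prompt(transcript: str, context_chunks: list[str]) -> str:
    """Hypothetical prompt layout: retrieved context first, then the transcript."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks) or "(none found)"
    return f"Official Context:\n{context}\n\nTranscript:\n{transcript}\n"
```

Swapping the toy pieces for a real vector store should only require changing `embed` and the body of `retrieve`; `build_prompt` just receives a list of strings either way.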

✅ Acceptance Criteria

  • Ingestion: The system successfully converts a sample PDF manual into searchable vector embeddings.
  • Retrieval: If the audio transcript mentions "Ammonia Spill," the final JSON output successfully includes the Ammonia handling protocol from the manual.
  • Stability: The new vector database runs inside the existing Docker container, and the image still builds and starts without errors.
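For the ingestion criterion, manual text is usually split into overlapping chunks before it is embedded. The sketch below assumes the PDF text has already been extracted to a plain string by some other step; `chunk_text` and its default sizes are hypothetical choices, not settings taken from FireForm.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so a protocol is not cut off at a boundary."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Each returned chunk would then be embedded and stored in the vector database, which is the part the acceptance test for ingestion needs to verify.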
