
TestiFYI 🧬

Your AI-Powered Evidence Copilot


📌 What is TestiFYI?

Imagine being an officer juggling body-cam footage, audio clips, witness statements, and scribbled reports, all while the clock’s ticking on paperwork and investigations. It’s chaotic, exhausting, and inefficient.

TestiFYI steps in as your AI-powered case co-pilot. It’s a multi-source investigation assistant built to:
🎯 Make sense of scattered evidence and help law enforcement focus more on the field, not the filing cabinet.

TestiFYI lets you upload videos, audios, text notes, and even scanned images — and intelligently processes them to:

  • 🎧 Transcribe what was said.
  • 🗣️ Detect who said it.
  • 👀 Extract key details.
  • 🤖 Run powerful LLM-based analysis to flag contradictions or gaps, like “According to the informer, the suspect is wearing a blue hoodie, but the person arrested is wearing a black one.”
  • 💬 Let officers query the case in natural language like “Where did the robbery happen?”

This project draws inspiration from the incredible work by the folks at Code Four, a groundbreaking platform that’s revolutionizing how cops handle digital evidence. Their tool auto-generates reports, summarizes videos in real time, and cuts desk time for officers by up to 60%. Their launch post is well worth a read. It’s 🔥.

TestiFYI is a smaller, simplified take on what Code Four is building, with a focus on linking insights across multiple evidence sources and enabling case-level reasoning through LLMs, all built using open-source and free-tier resources! ✌️


🛠 Tech Stack

| Area | Tool / Library |
| --- | --- |
| Frontend UI | streamlit |
| Audio Transcription | whisper |
| Diarization | pyannote.audio, torchaudio |
| OCR | pytesseract, Pillow |
| PDF/Text Parsing | PyMuPDF (fitz) |
| Video Processing | moviepy, pydub |
| LLM Integration | Groq API with llama-3.3-70b |
| Backend Utils | Python (json, os, datetime) |
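
For a flavor of how the OCR and PDF-parsing rows are typically wired up, here is a minimal sketch using pytesseract, Pillow, and PyMuPDF. The file names are placeholders, and ocr_utils.py / parse_texts.py may wrap these calls differently.

```python
# Sketch: pull text out of a scanned image and a PDF.
# File paths are placeholders; see ocr_utils.py / parse_texts.py
# for how the repo actually wraps these libraries.
import fitz  # PyMuPDF
import pytesseract
from PIL import Image

# OCR a scanned image (police report, handwritten note, ...)
image_text = pytesseract.image_to_string(Image.open("scanned_report.png"))

# Extract raw text from a PDF, page by page
with fitz.open("case_notes.pdf") as doc:
    pdf_text = "\n".join(page.get_text() for page in doc)

print(image_text[:200])
print(pdf_text[:200])
```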

🚀 Features

  • 👮 Upload body-cam video, audio recordings, scanned images, or text files.
  • 🧾 Automatic transcription using Whisper & speaker diarization with pyannote.audio.
  • 🖼️ Perform OCR on scanned documents like police reports or handwritten notes.
  • 📦 Generate a structured JSON aligning speech, speakers, and source metadata (see the sketch right after this list).
  • 🧠 LLM-powered analysis to uncover inconsistencies or red flags across evidence.
  • 💬 Ask natural-language questions and get answers based only on uploaded sources.
  • 🎛️ All features accessible via an interactive Streamlit-based UI.
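
As promised above, here is a rough sketch of what one aligned-evidence record could look like. The field names are illustrative assumptions; json_formatter.py defines the real schema.

```python
# Hypothetical shape of an aligned-evidence record (field names are
# illustrative; json_formatter.py defines the actual schema).
import json

record = {
    "source": "bodycam_01.mp4",
    "type": "video",
    "segments": [
        {"start": 12.4, "end": 15.9, "speaker": "SPEAKER_00",
         "text": "Suspect headed north on 5th."},
        {"start": 16.2, "end": 18.0, "speaker": "SPEAKER_01",
         "text": "Copy. Blue hoodie, right?"},
    ],
}
print(json.dumps(record, indent=2))
```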

📁 Repository Structure

  • app.py — Streamlit-based frontend for uploading, analyzing, and querying evidence.
  • requirements.txt — All Python dependencies.
  • prompts/ — 📄 Prompt templates for LLM interactions
  • src/ — 🔧 Core utility logic
    • diarize.py — Speaker diarization with pyannote.audio
    • json_formatter.py — Aligns and saves structured JSON from segments
    • llm_utils.py — Handles Groq LLM interactions with templates
    • ocr_utils.py — Extracts text from scanned images using OCR
    • parse_texts.py — Extracts raw text from .txt and .pdf files
    • transcribe.py — Audio transcription using Whisper
    • video_to_audio.py — Converts video files to WAV audio for transcription
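
To make that flow concrete, here is a condensed sketch of what these modules do end to end, written directly against the underlying libraries. The function names inside src/ may differ, and the pyannote checkpoint ID is an assumption.

```python
# Condensed pipeline sketch: video -> audio -> transcript -> speaker turns.
# The real src/ modules may expose different functions; the pyannote
# checkpoint ID below is an assumption.
import os
import whisper
from moviepy.editor import VideoFileClip  # moviepy 1.x import path
from pyannote.audio import Pipeline

# 1. video_to_audio.py: extract a WAV track from the body-cam video
VideoFileClip("bodycam_01.mp4").audio.write_audiofile("bodycam_01.wav")

# 2. transcribe.py: Whisper transcription with segment timestamps
model = whisper.load_model("base")
segments = model.transcribe("bodycam_01.wav")["segments"]  # start/end/text

# 3. diarize.py: who spoke when, via pyannote.audio
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token=os.environ["HUGGINGFACE_TOKEN"],
)
for turn, _, speaker in pipeline("bodycam_01.wav").itertracks(yield_label=True):
    print(f"{speaker}: {turn.start:.1f}s - {turn.end:.1f}s")

# 4. json_formatter.py would then align `segments` with the speaker turns
#    by timestamp overlap and save the combined record.
```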

Flow 🌱

(Flow diagram of the TestiFYI pipeline.)

⚙️ Getting Started

1. 🚀 Clone the Repository

git clone https://github.com/Keerthanareddy17/TestiFYI.git
cd TestiFYI

2. 🧪 Create a Virtual Environment

python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

3. 📦 Install Dependencies

pip install -r requirements.txt

4. 🔑 Set Environment Variables

  • Create a .env file at the project root and add your Hugging Face token (available from your Hugging Face account settings; pyannote.audio needs it to download the diarization model):
    HUGGINGFACE_TOKEN=your_huggingface_token_here

  • Add your Groq API key to the same .env file:
    GROQ_API_KEY=your_groq_api_key_here
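
At runtime, the usual pattern for picking these values up is python-dotenv. This is a sketch; app.py may load them differently.

```python
# Sketch: load .env values at startup (assumes python-dotenv is installed).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the project root
hf_token = os.environ["HUGGINGFACE_TOKEN"]
groq_key = os.environ["GROQ_API_KEY"]
```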

5. 🖥️ Run the Streamlit App

streamlit run app.py

🌟 Interface Overview

📂 Upload Evidence Tab

Upload any combination of supported evidence files:

  • Body-cam videos (.mp4, .avi, etc.)
  • Audio clips (.wav, .mp3)
  • Text documents (.txt)
  • Scanned images (.png, .jpg, .jpeg)

Uploading all types is not mandatory; a single file is enough to proceed.
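
Under the hood, a Streamlit multi-file uploader for these types looks roughly like this (a sketch, not necessarily the exact widget in app.py):

```python
# Sketch of a multi-type evidence uploader in Streamlit.
import streamlit as st

files = st.file_uploader(
    "Upload evidence",
    type=["mp4", "avi", "wav", "mp3", "txt", "png", "jpg", "jpeg"],
    accept_multiple_files=True,  # one file is also enough
)
for f in files or []:
    st.write(f"Received {f.name} ({f.size} bytes)")
```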

🧠 Explore & Investigate Tab

  • Click 🔍 Run Evidence Analysis
  • Let the LLM highlight inconsistencies, contradictions, or suspicious patterns across your evidence.
  • Ask natural language case questions like:
    • “What did the witness say about the hoodie?”
    • “Where was the robbery reported to happen?”

⚠️ All responses are generated solely based on your uploaded data.
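
A minimal version of that grounded Q&A call against the Groq API might look like the sketch below. The model ID and prompt wording are assumptions about what llm_utils.py does.

```python
# Sketch: answer a case question using only the uploaded evidence.
# Model ID and prompts are assumptions; llm_utils.py holds the real logic.
import os
from groq import Groq

client = Groq(api_key=os.environ["GROQ_API_KEY"])

evidence_json = "..."  # the structured JSON built from the uploads
question = "What did the witness say about the hoodie?"

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system",
         "content": "Answer ONLY from the evidence JSON below. "
                    "If the answer is not there, say so.\n" + evidence_json},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```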


📸 Snapshots of TestiFYI

(Screenshots of the TestiFYI interface.)

Feel free to reach out! 🤝

I'm always up for talking tech, feedback, collaborations, and weird LLM behavior stories 😮‍💨
