Hackathon Category: JavaScript/TypeScript
Demo Video (≤ 5 min): Watch on YouTube
ModelProof is a real-time, agent-driven AI chat system that cross-validates responses from multiple large language models and performs safety audits on each reply, flagging hallucinations, bias, toxicity, and misalignment. Built with modular agents, it delivers trustworthy AI interactions out of the box.
- Multi-Model Validation
  Query two LLMs in parallel (Meta-Llama-3 and AI21-Jamba-1.5), compare confidence and similarity, and automatically select the best answer, or present both when they diverge.
- Risk Analysis by Response
  Audit every response for hallucination, bias, toxicity, and intent alignment using the RiskAuditorAgent.
- Real-Time Audit Panel
  Color-coded ✅/⚠️/❌ indicators with scores and detailed explanations, all updated live as tokens stream in.
- Agentic Framework
  Clear separation of concerns:
  - CrossModelRAGAgent handles multi-model queries and validation
  - RiskAuditorAgent performs safety auditing
  - Coordinator orchestrates fallbacks, retries, and UI integration
- Built-In Fallbacks
  Automatically retry with alternate providers (GitHub AI → HuggingFace Gradio) on rate limits or failures.
- Extensible & Configurable
  Easily swap in new models or add custom audit agents via a unified ModelResponse interface.
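To make the multi-model validation step concrete, here is a minimal sketch of how two responses might be compared and one selected. The `ModelResponse` field names, the `selectAnswer` helper, and the Jaccard word-overlap measure are all assumptions made for illustration; the project may use a different similarity metric and a different response shape.

```typescript
// Hypothetical shape of a unified model response; the field names are assumptions.
interface ModelResponse {
  model: string;
  text: string;
  confidence: number; // assumed range 0..1, higher is better
}

type Selection =
  | { kind: "single"; answer: ModelResponse; similarity: number }
  | { kind: "divergent"; answers: [ModelResponse, ModelResponse]; similarity: number };

// Jaccard similarity over lowercase word sets. This is a simple stand-in,
// not necessarily the measure the project uses.
function jaccardSimilarity(a: string, b: string): number {
  const tokenize = (s: string) => new Set<string>(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const setA = tokenize(a);
  const setB = tokenize(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((word) => {
    if (setB.has(word)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// Below the threshold, both answers are returned so the UI can show them side by side.
// Otherwise the more confident answer wins (ties go to the first response).
function selectAnswer(
  first: ModelResponse,
  second: ModelResponse,
  similarityThreshold = 0.8,
): Selection {
  const similarity = jaccardSimilarity(first.text, second.text);
  if (similarity < similarityThreshold) {
    return { kind: "divergent", answers: [first, second], similarity };
  }
  const answer = second.confidence > first.confidence ? second : first;
  return { kind: "single", answer, similarity };
}
```

Returning a tagged union rather than a bare string lets the caller decide how to render the divergent case without re-running the comparison.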
sequenceDiagram
autonumber
participant CW as ChatWindow.tsx
participant AP as AuditPanel.tsx
participant OR as Agent Coordinator
participant AU as RiskAuditorAgent
participant RAG as CrossModelRAGAgent
participant MAI as Meta-Llama-3-Instruct
participant JMB as AI21-Jamba-1.5
participant PHI as phi-3-mini-128k
participant AZ as Azure AI Inference Client
CW->>OR: prompt
OR->>AU: audit request
OR->>RAG: forward prompt
RAG->>MAI: primary
RAG->>JMB: secondary
MAI->>AZ: inference call
JMB->>AZ: inference call
RAG->>CW: response
AU->>PHI: audit
PHI->>AZ: inference call
AZ->>AU: phi result
AU->>AP: display audit result
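The diagram above shows only the success path. As a rough sketch of the failure path the Coordinator is said to handle (retries, then fallback from GitHub AI to HuggingFace Gradio), the helper below tries each provider in order. The `Provider` type, the `queryWithFallback` name, the backoff delays, and the reading of `maxRetries` as "attempts per provider" are assumptions, not the project's actual implementation.

```typescript
// A provider is any async function that turns a prompt into text (hypothetical type).
type Provider = { name: string; call: (prompt: string) => Promise<string> };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try each provider in order. Each provider gets up to `maxRetries` attempts
// with exponential backoff before the next provider is tried.
async function queryWithFallback(
  prompt: string,
  providers: Provider[],
  maxRetries = 3,
  baseDelayMs = 250,
): Promise<{ provider: string; text: string; attempts: number }> {
  let attempts = 0;
  let lastError: unknown = new Error("no providers configured");
  for (const provider of providers) {
    for (let retry = 0; retry < maxRetries; retry++) {
      attempts++;
      try {
        const text = await provider.call(prompt);
        return { provider: provider.name, text, attempts };
      } catch (error) {
        lastError = error;
        // Wait before retrying the same provider; skip the wait on its last attempt.
        if (retry < maxRetries - 1) await sleep(baseDelayMs * 2 ** retry);
      }
    }
  }
  // Every provider failed on every attempt: surface the most recent error.
  throw lastError;
}
```

A real coordinator would probably distinguish rate-limit errors from other failures before retrying; this sketch treats every error the same way.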
- Frontend: React, TypeScript, Tailwind CSS
- Backend: Node.js, TypeScript
- AI Inference: Azure AI Client (Streaming), HuggingFace Gradio Client
- CI/CD: GitHub Actions
- APIs: GitHub AI, HuggingFace
- Build Tools: Vite, npm
- Diagramming: Mermaid
- Clone the repository:
  git clone https://github.com/hgenix20/modelproof.git
  cd modelproof
- Install dependencies:
  npm install
- Create a .env file in the root directory:
  VITE_GITHUB_TOKEN=my_github_api_token
  VITE_API_ENDPOINT=https://models.github.ai/inference
  VITE_HUGGINGFACE_TOKEN=my_huggingface_api_token
- Run locally:
  npm run dev
- Build for production:
  npm run build
  npm run preview
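A missing or empty entry in the .env file tends to surface later as a confusing authentication error. One way to fail early is to validate the three variables at startup. The `readRequiredEnv` helper below is a hypothetical sketch, not part of the repository; only the variable names come from the setup steps above.

```typescript
// The three variables named in the setup steps.
const REQUIRED_ENV_KEYS = [
  "VITE_GITHUB_TOKEN",
  "VITE_API_ENDPOINT",
  "VITE_HUGGINGFACE_TOKEN",
] as const;

type EnvKey = (typeof REQUIRED_ENV_KEYS)[number];
type AppEnv = Record<EnvKey, string>;

// Hypothetical helper: returns the required variables, or throws one error
// listing every variable that is missing or blank.
function readRequiredEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_ENV_KEYS.filter((key) => !source[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const env = {} as AppEnv;
  for (const key of REQUIRED_ENV_KEYS) {
    env[key] = source[key] as string;
  }
  return env;
}
```

In a Vite app this would typically be called as `readRequiredEnv(import.meta.env)`, since Vite exposes `VITE_`-prefixed variables there. Note that Vite also bundles those values into client-side code, so tokens supplied this way are visible in the browser.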
The system can be configured through environment variables and the config object:
export const config = {
  similarityThreshold: 0.8,
  maxRetries: 3,
  models: {
    MAI: {
      github: "meta/Meta-Llama-3-8B-Instruct",
      huggingface: "meta-llama/Meta-Llama-3-8B-Instruct"
    },
    JMB: {
      github: "ai21-labs/AI21-Jamba-1.5-Large",
      huggingface: "ai21-labs/AI21-Jamba-1.5-Large"
    },
    PHI: {
      github: "microsoft/phi-3-mini-128k-instruct",
      huggingface: "microsoft/phi-3-mini-128k-instruct"
    }
  }
};
The system evaluates four key metrics:
- Hallucination Score: Measures factual accuracy and confidence
- Bias Score: Identifies potential biases and stereotypes
- Toxicity Score: Assesses harmful or inappropriate content
- Intent Alignment Score: Measures how well responses align with user intent
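As an illustration of how these four scores could drive the ✅/⚠️/❌ indicators in the audit panel, here is a small sketch. The 0..1 score range, the direction of each score (higher means more risk, except intent alignment where higher is better), and the 0.3 / 0.7 cutoffs are all assumptions chosen for the example, not values taken from the project.

```typescript
// Assumed score shape: every value is in 0..1.
interface AuditScores {
  hallucination: number; // assumed: higher means more risk
  bias: number; // assumed: higher means more risk
  toxicity: number; // assumed: higher means more risk
  intentAlignment: number; // assumed: higher means better aligned
}

const OK = "✅";
const WARN = "⚠️";
const FAIL = "❌";
type Indicator = typeof OK | typeof WARN | typeof FAIL;

// Map a risk value to an indicator using hypothetical cutoffs.
function toIndicator(risk: number, warnAt = 0.3, failAt = 0.7): Indicator {
  if (risk >= failAt) return FAIL;
  if (risk >= warnAt) return WARN;
  return OK;
}

// Intent alignment is inverted so that all four metrics are treated as risk.
function summarizeAudit(scores: AuditScores): Record<keyof AuditScores, Indicator> {
  return {
    hallucination: toIndicator(scores.hallucination),
    bias: toIndicator(scores.bias),
    toxicity: toIndicator(scores.toxicity),
    intentAlignment: toIndicator(1 - scores.intentAlignment),
  };
}
```

Normalizing everything to a single "risk" direction before thresholding keeps the panel logic to one comparison per metric.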
- Never commit API tokens or secrets to version control
- Use environment variables for all sensitive configuration
- Rotate API tokens regularly
- Keep dependencies updated to patch security vulnerabilities
- Review the .gitignore file to ensure sensitive files are excluded
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Meta for Llama models
- AI21 Labs for Jamba models
- Microsoft for Phi models & Azure AI Inference
- The open-source community for various tools and libraries
For questions or support, please open an issue in the repository.
Built for the AI Agents Hackathon 2025

