Monitor LLM prompt behavior and detect drift before users notice.
Why Promptinel • Features • Quick Start • Providers • Commands • GitHub Actions
Promptinel doesn't just measure drift; it explains it.
LLMs change behavior silently. A prompt that worked perfectly last month may produce different output today because a provider updated a model, changed routing, or shifted hidden behavior behind an alias like gpt-4o.
Promptinel is the first open-source drift monitoring tool that actually explains the shift. While traditional evaluation tools focus on raw numeric scores, Promptinel provides actionable, human-readable insights.
- Human-readable drift explanation: "The model is now more verbose and technical than the baseline."
- Silent Model Update detection: Automatically detects when providers update models behind the scenes.
- Agent drift monitoring: Supported via multi-step and tool call tracking (v2).
- Provider agnostic: Works with OpenAI, Anthropic, Mistral, and local Ollama.
- Zero-config mock mode: Test your CI pipeline with simulated drift, no API keys required.
| Feature | Promptinel |
|---|---|
| Explainable drift | ✓ numeric + LLM explanation |
| Silent model tracking | ✓ detects silent alias updates |
| Agent monitoring | ✓ multi-step + tool calls (v2) |
| Zero-config | ✓ mock mode / simulated drift |
| Open source | ✓ MIT License / community first |
| Developer experience | ✓ clone, npm install, run |
Even when your application code has not changed, model behavior can change in ways that impact:
- classification
- extraction
- summarization
- support automation
- routing logic
- safety behavior
- formatting consistency
- agent workflows
- tool calls
- downstream reliability
That means your prompt can quietly regress while your CI still looks green.
Promptinel is built to catch that early.
- snapshots prompt outputs
- creates and manages baselines
- detects semantic drift
- scores drift from 0.0 to 1.0
- alerts when thresholds are exceeded
- works with real providers and local models
- runs instantly in mock mode with zero config
- fits naturally into CI/CD workflows
- stores everything in simple JSON files
- provides both CLI workflows and a dashboard
```
Add prompt
    ↓
Create baseline
    ↓
Run on schedule
    ↓
Compare semantically
    ↓
Alert on drift
```
The most important design choice in Promptinel is automatic mock mode.
That means:
- anyone can clone the repo and run it immediately
- contributors do not need API keys to test the project
- CI can run without secrets by default
- demos work out of the box
- onboarding friction stays extremely low
That one decision makes Promptinel dramatically more usable as an open source project.
If no provider credentials are found, Promptinel automatically falls back to mock mode.
This lets anyone run the project in seconds with no setup cost.
Promptinel compares outputs semantically instead of relying only on exact text matching.
By default, Promptinel uses an LLM judge to determine whether the current output meaningfully differs from the baseline.
When supported and configured, Promptinel can also use embeddings similarity as an alternative scoring method.
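As a rough sketch of the embeddings path, drift can be computed as one minus cosine similarity. The vectors below are toy values, not real model embeddings, and the function names are illustrative rather than Promptinel's actual API:

```javascript
// Toy sketch: score drift as 1 - cosine similarity between two
// embedding vectors. Real vectors would come from a provider API.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function embeddingDrift(baselineVec, currentVec) {
  return 1 - cosineSimilarity(baselineVec, currentVec);
}

console.log(embeddingDrift([1, 0, 1], [1, 0, 1])); // near 0: identical vectors
console.log(embeddingDrift([1, 0], [0, 1]));       // 1: orthogonal vectors
```

Identical outputs score near 0; completely unrelated outputs approach 1, matching the 0.0 to 1.0 drift scale.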
Use mock, local, or hosted models with a simple provider interface.
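The idea of a simple provider interface can be sketched like this (the method and field names here are illustrative assumptions, not Promptinel's actual contract):

```javascript
// Hypothetical provider shape: anything with a name and an async
// complete() method can be plugged in. Promptinel's real interface
// may differ; this only shows the idea.
const mockProvider = {
  name: "mock",
  async complete(messages) {
    const user = messages.find((m) => m.role === "user");
    return `[mock] response to: ${user.content}`;
  },
};

async function runPrompt(provider, messages) {
  return provider.complete(messages);
}

runPrompt(mockProvider, [{ role: "user", content: "hello" }]).then(console.log);
// prints: [mock] response to: hello
```

Swapping in a hosted or local model then means implementing the same `complete()` shape over that backend.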
No database. No migrations. No setup headache.
Get notified when a prompt drifts beyond its configured threshold.
Run scheduled checks in GitHub Actions and other automation environments.
Compare baseline vs current output side by side.
Inspect prompt status, history, snapshots, and drift trends visually.
```bash
npm install -g promptinel
```

Or clone the repository:

```bash
git clone https://github.com/diegosantdev/Promptinel.git
cd Promptinel
npm install
```

Then:

```bash
node bin/promptinel.js add
node bin/promptinel.js check my-prompt
node bin/promptinel.js baseline my-prompt --latest
node bin/promptinel.js check my-prompt
node bin/promptinel.js diff my-prompt
node bin/promptinel.js report
```

If no provider credentials are configured, Promptinel automatically uses mock mode.
```bash
node bin/promptinel.js add
```

You will be guided through:
- prompt id
- messages or prompt text
- provider
- model
- drift threshold
- optional tags
```bash
node bin/promptinel.js check classify-intent
```

This creates a new snapshot.
```bash
node bin/promptinel.js baseline classify-intent --latest
```

Now Promptinel knows what expected behavior looks like.
```bash
node bin/promptinel.js check classify-intent
```

Promptinel compares the current output against the baseline and calculates drift.
```bash
node bin/promptinel.js diff classify-intent
node bin/promptinel.js report
```

When watching your prompts, Promptinel provides clear visual feedback right in your terminal:
```
📊 extract-entities
   Provider: [MOCK]   Model: mock-default
   Baseline: mock-default-v1.2.2
   Current:  mock-default-v1.2.3
   📝 BEHAVIOR CHANGE: The model has become more verbose and added technical details not present in the baseline.
   Drift: 0.450
```
Each check produces a drift score from 0 to 1.
| Status | Score range |
|---|---|
| Stable | < 0.15 |
| Warning | 0.15 – 0.35 |
| Drifted | ≥ 0.35 |
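The bands above map directly to a status check. A minimal sketch (the boundaries come from the table; the function name is illustrative, not part of Promptinel's API):

```javascript
// Map a drift score in [0, 1] to the status bands from the table above.
function classifyDrift(score) {
  if (score < 0.15) return "stable";
  if (score < 0.35) return "warning";
  return "drifted";
}

console.log(classifyDrift(0.45)); // drifted
```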
LLM-as-judge
Works across providers and is the default comparison mechanism.
Embeddings similarity
Useful when configured and supported, especially for alternate comparison flows.
Best for demos, CI, testing, and zero-config onboarding.
- no API key required
- deterministic and test-friendly
- simulates drift patterns
- perfect for GitHub Actions
- zero cost
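To illustrate why a deterministic mock is test-friendly, consider this toy sketch (it is not Promptinel's actual mock provider):

```javascript
// A deterministic mock: the same prompt always yields the same output,
// and drift can be simulated on demand, with no API key involved.
function mockComplete(prompt, { simulateDrift = false } = {}) {
  const base = `[mock] summary: ${prompt}`;
  return simulateDrift ? `${base} (with extra technical detail)` : base;
}

console.log(mockComplete("billing failed twice"));
// prints: [mock] summary: billing failed twice
```

Determinism means CI assertions never flake, and the `simulateDrift` flag lets a pipeline exercise the alerting path intentionally.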
Best for free local testing with real models.
- runs locally
- no API cost
- private by default
- good for experimentation
Supported through `OPENAI_API_KEY`.
Examples:
`gpt-4o`, `gpt-4o-mini`, `gpt-3.5-turbo`
Embeddings:
`text-embedding-3-small`
Supported through `ANTHROPIC_API_KEY`.
Examples:
`claude-3-5-sonnet`, `claude-3-haiku`
Supported through `MISTRAL_API_KEY`.
Examples:
`mistral-large`, `mistral-small`, `open-mistral-7b`
Promptinel can follow a practical fallback strategy:
- configured cloud provider
- local provider like Ollama
- mock mode when no credentials are available
That keeps the repo usable in clean environments and makes onboarding much smoother.
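That fallback order can be sketched as a small selector over environment variables (the function is hypothetical; the variable names match the environment section of this README):

```javascript
// Fallback order: configured cloud provider -> local Ollama -> mock.
// pickProvider is an illustrative helper, not Promptinel's actual code.
function pickProvider(env = process.env) {
  if (env.OPENAI_API_KEY) return "openai";
  if (env.ANTHROPIC_API_KEY) return "anthropic";
  if (env.MISTRAL_API_KEY) return "mistral";
  if (env.OLLAMA_BASE_URL) return "ollama";
  return "mock"; // zero-config default
}

console.log(pickProvider({})); // mock
```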
Add a new prompt to the watchlist.
```bash
node bin/promptinel.js add
```

Run a prompt once, create a snapshot, and compare to the baseline if one exists.
```bash
node bin/promptinel.js check classify-intent
```

Run all prompts on a recurring monitoring pass.
```bash
node bin/promptinel.js watch
```

Compare the current snapshot against baseline or recent state.
```bash
node bin/promptinel.js diff classify-intent
```

Promote the latest or selected snapshot to baseline.
```bash
node bin/promptinel.js baseline classify-intent --latest
```

Generate a drift report in JSON.
```bash
node bin/promptinel.js report
```

Start the dashboard.
```bash
node bin/promptinel.js dashboard
```

Delete old snapshots according to retention policy.
```bash
node bin/promptinel.js cleanup
```

```json
{
  "id": "summarize-support-ticket",
  "provider": "mock",
  "model": "mock-v1",
  "threshold": 0.28,
  "messages": [
    { "role": "system", "content": "You summarize support tickets clearly and concisely." },
    { "role": "user", "content": "Customer says billing failed twice and wants a refund." }
  ],
  "tags": ["support", "summary", "billing"]
}
```

Promptinel uses a simple flat-file structure.
```
.promptinel/
  snapshots/
    summarize-support-ticket.json
    classify-intent.json
    extract-entities.json
  history.json
  watchlist.json
```
- zero config
- easy local inspection
- no database dependency
- portable
- easy to ignore in Git
- easy to back up
```bash
# OpenAI
OPENAI_API_KEY=your_key_here

# Anthropic
ANTHROPIC_API_KEY=your_key_here

# Mistral
MISTRAL_API_KEY=your_key_here

# Ollama
OLLAMA_BASE_URL=http://localhost:11434

# Slack
SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...

# Optional runtime flags
PROMPTINEL_HTTP_MODE=live
PROMPTINEL_FIXTURES_DIR=.promptinel/fixtures
```

Create `.promptinel/config.json`:
```json
{
  "defaultProvider": "mock",
  "defaultThreshold": 0.3,
  "retentionDays": 30
}
```

```yaml
name: Prompt Drift Monitoring

on:
  schedule:
    - cron: '0 * * * *'
  workflow_dispatch:

jobs:
  monitor:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: 18

      - name: Install dependencies
        run: npm install

      - name: Run Promptinel
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: node bin/promptinel.js watch
```

- use mock mode by default
- inject real provider keys only when needed
- keep checks lightweight
- run scheduled drift monitoring separately from unit tests
Promptinel can notify your team when drift exceeds a prompt threshold.
Typical alert payload can include:
- prompt id
- score
- severity
- threshold
- timestamp
- baseline snapshot id
- current snapshot id
- quick diff preview
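A payload carrying those fields might look like the following (field names here are assumptions for illustration, not Promptinel's exact alert schema):

```javascript
// Illustrative alert payload. The field names are assumptions,
// not the exact shape Promptinel sends to Slack.
const alertPayload = {
  promptId: "classify-intent",
  score: 0.45,
  severity: "drifted",
  threshold: 0.3,
  timestamp: "2026-01-15T12:00:00Z",
  baselineSnapshotId: "snap-001",
  currentSnapshotId: "snap-014",
  diffPreview: "baseline: short label / current: verbose explanation",
};

console.log(JSON.stringify(alertPayload, null, 2));
```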
Catch prompt behavior changes before users do.
Check how moving from one model version to another changes behavior.
Compare variants and validate whether improvements are real.
Maintain prompt history for review, traceability, and reliability checks.
Spot drift before deployment or release.
```
promptinel/
├── src/
│   ├── cli.js
│   ├── types.js
│   ├── providers/
│   │   ├── mock.js
│   │   ├── ollama.js
│   │   ├── openai.js
│   │   ├── anthropic.js
│   │   └── mistral.js
│   └── services/
│       ├── storage.js
│       ├── watchlist.js
│       ├── scorer.js
│       ├── runner.js
│       └── notifier.js
├── dashboard/
│   ├── app/
│   ├── components/
│   └── lib/
├── bin/
│   └── promptinel.js
├── tests/
│   ├── unit/
│   └── property/
├── docs/
│   ├── GITHUB_ACTIONS.md
│   └── ENVIRONMENT.md
├── .env.example
├── watchlist.json
└── .promptinel/
```
```bash
npm test
```

Coverage:

```bash
npm run test:coverage
```

Watch mode:

```bash
npm run test:watch
```

Specific file:

```bash
npm test -- tests/unit/services/storage.test.js
```

```bash
git clone https://github.com/diegosantdev/Promptinel.git
cd Promptinel
npm install
npm test
npm run dev
```

Contributions are welcome.
- fork the repo
- create a feature branch
- commit your changes
- push the branch
- open a pull request
```bash
git checkout -b feature/improve-scoring
git commit -m "Improve drift scoring output"
git push origin feature/improve-scoring
```

MIT © 2026 @diegosantdev
If Promptinel is useful to you:
- star the repo
- open an issue
- suggest features
- share it with teams building with LLMs
Prompt behavior changes silently. Promptinel makes it visible.
Built and maintained by @diegosantdev

