Your tests passed. Your PR was approved. Your change still broke production.
Tests confirm existing behavior. Code review confirms intent.
Neither validates what your change actually does.
GauntletCI detects Behavioral Change Risk in pull request diffs, identifying logic shifts, missing validations, and hidden regressions that pass tests and code review.
GauntletCI is a pre-commit, diff-first change-risk detection tool.
It analyzes what changed in your code and flags unverified behavioral changes before they reach code review.
- Sub-second analysis: no compilation, no AST, no network
- Runs locally: no code leaves your machine
- High-signal output: designed to surface 0-3 findings per run
It answers one question:
Did this change introduce behavior that is not properly validated?
GauntletCI detects Behavioral Change Risk: unverified behavior changes introduced by a diff.
- Install the tool
- Run it on your current changes
- See 0-3 high-signal findings (or none)
No setup required.
Want to see GauntletCI catch real bugs in real PRs before installing anything?
The GauntletCI-Demo repo is a realistic ASP.NET Core OrderService with 6 always-open scenario PRs. Each PR makes a plausible multi-file change with a single risky line buried inside. GauntletCI runs on every PR; open one and read the workflow output:
| PR | Scenario | Expected verdict |
|---|---|---|
| 01 | Safe typo fix | Clean, no findings |
| 02 | Silent catch { } around payment call | GCI0007 Error Handling Integrity |
| 03 | Hardcoded API key in Program.cs | GCI0012 Secret Hygiene |
| 04 | CancellationToken dropped from IPaymentClient | GCI0004 Public API Contract |
| 05 | Customer email logged in LogInformation | GCI0029 PII Logging Leak |
| 06 | Static counter mutated without sync | GCI0016 Concurrency Safety |
Want to drive it yourself? Fork or clone GauntletCI-Demo and run the scenarios on your own copy. The demo repo's README has a one-click fork-and-run path plus a local-CLI walkthrough.
Tests and code review do not reliably validate behavioral changes.
Even experienced developers miss things in diffs.
Not because they lack skill, but because diffs are deceptive.
A small change can silently alter behavior:
- A null check changes execution flow
- A guard clause introduces new exceptions
- A method signature changes without test updates
- A dependency call is modified without validation
- A conditional branch shifts logic
These are not syntax errors. They are behavior changes, and they regularly slip through code review.
Modern development pipelines have strong tooling, but each layer answers a different question:
- Static analysis checks code quality
- Security tools check vulnerabilities
- Tests verify expected behavior
- Code review checks intent
None of them validate the behavioral impact of a change.
GauntletCI introduces a new layer: Behavioral Change Risk detection.
It focuses only on the delta between versions and asks:
Is this change safe?
A single line was removed from a production service:
  public async Task<Order> CreateOrderAsync(CreateOrderRequest request)
  {
-     if (request is null) throw new ArgumentNullException(nameof(request));
      var order = new Order(request.CustomerId, request.Items);
      return await _repo.SaveAsync(order);
  }

- 1 line removed
- Tests passed
- PR approved ("cleaned up redundant null check")
Callers relying on the early ArgumentNullException now receive a NullReferenceException
deeper in the call stack, with no context. The change shipped.
GauntletCI flagged it before the commit was created:
[High] GCI0003: Guard clause removed at line 3. ArgumentNullException no
longer thrown on null input. Callers relying on this contract will see
NullReferenceException deeper in the call stack.
This is Behavioral Change Risk: a change that compiles, passes tests, and passes review -- but alters runtime behavior in a way none of those checks can see.
GauntletCI rules have been validated against real-world pull requests:
| Project | What GauntletCI Caught |
|---|---|
| dotnet/efcore | O(n²) performance risk (LINQ in loops) |
| StackExchange.Redis | Context mutation in property getter |
| Dapper | Null-forgiving operator misuse |
| SharpCompress | Numeric overflow risk |
| AngleSharp | Enum member removal breaking serialization |
dotnet tool install -g GauntletCI
# Run before committing
gauntletci analyze --staged

Running against StackExchange.Redis PR#2995, GauntletCI flags a swallowed exception in production connection handling. (GIF recorded with ScreenToGif, open source.)
Typical output includes 0-3 high-signal findings.
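To run the check automatically before every commit, the staged analysis can be wired into a Git pre-commit hook. The following is a minimal sketch, not an official integration: the function name `run_gauntlet_check` is hypothetical, and the sketch assumes that `gauntletci analyze --staged` reports findings through a non-zero exit status, which this document does not confirm.

```shell
#!/bin/sh
# Hypothetical .git/hooks/pre-commit sketch (make the file executable).
# Assumption: `gauntletci analyze --staged` exits non-zero when it reports
# findings. Verify this against the tool's own documentation.

run_gauntlet_check() {
    # Skip when the tool is not installed, so the hook does not block
    # contributors who have not set it up yet.
    if ! command -v gauntletci >/dev/null 2>&1; then
        echo "gauntletci not found; skipping change-risk check" >&2
        return 0
    fi
    # The exit status of the analysis becomes the status of the hook.
    gauntletci analyze --staged
}

run_gauntlet_check
```

Git aborts the commit when a pre-commit hook exits with a non-zero status, so under the stated assumption a finding would stop the commit until it is reviewed. Skipping when the tool is missing is a design choice; a stricter team might fail the hook instead.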
GauntletCI avoids noise by design:
- Diff-only analysis (only what changed)
- No style or formatting checks
- Focused on behavioral risk only
- Baseline suppression for legacy code
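To get a feel for what "only what changed" means in practice, the sketch below prints just the added and removed lines of the staged diff using plain Git. It illustrates the scope of a diff-only review and makes no claim about how GauntletCI reads diffs internally; the function name `staged_changed_lines` is hypothetical.

```shell
# Print only the added (+) and removed (-) lines of the staged diff,
# with no surrounding context and no file headers.
staged_changed_lines() {
    git diff --cached --unified=0 --no-color --no-ext-diff |
        awk '
            /^diff --git / { in_hunk = 0 }        # file header: not content
            /^@@ /         { in_hunk = 1; next }  # hunk header: content follows
            in_hunk && /^[+-]/                    # print changed lines only
        '
}
```

For a staged edit that changes a single line from `b` to `B`, the function prints `-b` followed by `+B`. Everything else in the file stays out of view, which is the property that keeps diff-scoped output small.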
Introduce GauntletCI into any codebase without noise:
gauntletci baseline create
gauntletci analyze --staged

Only new risks introduced by the current change are shown.
- Behavior changes without tests
- API and serialization changes
- Numeric truncation / overflow risks
- State mutation issues
- Blocking async calls
- Disposable leaks
- SQL injection risks
- Hardcoded secrets
- PII exposure (auto-redacted)
- Missing logging
- Silent failures
GauntletCI includes 35 built-in detection rules across:
- Behavior & Contracts
- Security
- Data Integrity
- Async & Concurrency
- Observability
- Architecture
- Test Quality
Rule IDs range from GCI0001-GCI0050. Rule IDs are non-contiguous because the rule set evolved over time: some early rules were retired, merged, or replaced as the engine matured. The gaps reflect that history. Existing rule IDs are never renumbered so that baseline fingerprints and suppression annotations remain stable across upgrades.
Start in advisory mode first so your team can review findings before blocking merges.
Create .github/workflows/gauntletci.yml:
name: GauntletCI
on:
  pull_request:
permissions:
  contents: read
  pull-requests: write
jobs:
  risk-analysis:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
      - uses: EricCogen/GauntletCI@main
        with:
          fail-on-findings: "false"
          inline-comments: "true"

Once the signal quality is tuned for your repo, change fail-on-findings to "true" to block risky changes.
| Input | Default | Description |
|---|---|---|
| commit | PR head commit | Commit SHA to analyze |
| no-llm | true | Run deterministic rules only |
| fail-on-findings | true | Fail the check when findings are produced |
| inline-comments | false | Post findings as inline PR comments |
| ascii | true | Use ASCII-only output |
| dotnet-version | 8.0.x | .NET SDK version |
| gauntletci-version | 2.0.0 | NuGet tool version to install |
gauntletci analyze --staged
gauntletci analyze --commit <sha>

- Not a linter
- Not a static analysis replacement
- Not a test runner
- Not a formatter
GauntletCI focuses only on change-risk, not general code quality.
- No change-risk signals were identified
- This does not guarantee correctness
- It indicates no high-confidence risks were found
A GauntletCI finding is not a claim that the code is definitely broken.
Treat it as a review prompt:
- Confirm whether the behavior changed.
- Check whether tests or validation cover the changed path.
- Add validation, update tests, or document why the change is intentional.
- Suppress only when the risk is understood and accepted.
LLM integration enhances explanation only.
- All detection logic is deterministic
- Runs locally via Ollama
- No data leaves your machine
- All analysis runs locally
- No code leaves your machine
- Auto-redaction prevents sensitive data exposure
- Telemetry is optional and anonymous
Elastic License 2.0

