Releases: Shreyas582/WraithRun
v1.8.0
WraithRun v1.8.0
This release addresses 8 critical bugs discovered during a comprehensive 20-test live evaluation using Qwen2.5-0.5B and Llama-3.2-1B models. Every fix was verified against the original failing test scenarios.
Added
- Syslog analysis template — new `syslog-analysis` investigation template triggered by keywords like `log`, `syslog`, `journal`, `event`, `audit`. Runs `read_syslog` → `audit_account_changes` → `inspect_persistence_locations`. Use with `--task-template syslog-summary`. (#141)
- SSH key enumeration tool — new `enumerate_ssh_keys` tool that performs cross-platform scanning of `.ssh` directories for authorized_keys, private keys, and public keys. (#141)
Changed
- Severity calibration — raised listener thresholds (Info <50, Low 50–149, Medium 150–249, High ≥250), lowered account severity, and raised persistence thresholds. Normal desktops no longer trigger spurious high-severity findings. (#139)
- Richer finding titles — finding titles now include specifics (account names, persistence entry text, SSH directory info) instead of bare counts. (#140)
- Quantization-aware parameter estimation — the hardcoded 2.2 divisor is replaced with a format-aware divisor: Q4 → 0.55, Q8 → 1.1, FP16 → 2.2, FP32 → 4.4. Detected automatically from model filename conventions. (#138)
- Template tool ordering — `file-integrity-check` now leads with `hash_binary`; `ssh-key-investigation` now leads with `enumerate_ssh_keys`. (#141)
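The format-aware divisor from #138 maps naturally to a small lookup. A minimal sketch of the idea, with illustrative names (`WeightFormat`, `param_divisor`, `estimate_params` are not the project's actual identifiers) and assumed filename heuristics:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WeightFormat {
    Q4,
    Q8,
    Fp16,
    Fp32,
}

/// Bytes-per-parameter divisor from the release notes:
/// Q4 -> 0.55, Q8 -> 1.1, FP16 -> 2.2, FP32 -> 4.4.
fn param_divisor(fmt: WeightFormat) -> f64 {
    match fmt {
        WeightFormat::Q4 => 0.55,
        WeightFormat::Q8 => 1.1,
        WeightFormat::Fp16 => 2.2,
        WeightFormat::Fp32 => 4.4,
    }
}

/// Infer the weight format from filename conventions (heuristics assumed).
fn detect_format(filename: &str) -> WeightFormat {
    let f = filename.to_ascii_lowercase();
    if f.contains("q4") || f.contains("int4") {
        WeightFormat::Q4
    } else if f.contains("q8") || f.contains("int8") {
        WeightFormat::Q8
    } else if f.contains("fp16") || f.contains("f16") {
        WeightFormat::Fp16
    } else {
        WeightFormat::Fp32
    }
}

/// Estimate the parameter count from the on-disk size.
fn estimate_params(file_bytes: u64, fmt: WeightFormat) -> u64 {
    (file_bytes as f64 / param_divisor(fmt)) as u64
}
```

The divisors read as bytes-per-parameter plus roughly 10% file overhead, which would explain why FP16 maps to 2.2 rather than 2.0.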
Fixed
- KV-cache attention mask crash — prefill attention length now accounts for forced cache padding when the model lacks a `use_cache` toggle, preventing shape broadcast errors on models like Qwen2.5 and Llama 3.2. (#136)
- ReAct hallucination guard — when the model produces a `<final>` tag at step 0 without calling any tools, the agent falls back to template-driven execution. The quality guard detects hallucinated `<call>` tags and `[observation]` markers and replaces them with a deterministic summary. (#137)
- EP reporting — `detect_execution_provider()` now recognises DirectML and CUDA backend overrides instead of always reporting CPU. (#142)
Stats
- 282 tests passing, 0 failures
- All 8 CI jobs green (Quality Gates, Cross-platform compile ×3, CLI stdin ×2, Live metrics benchmark, Live success e2e)
- Fixes validated live against both Qwen2.5-0.5B-Instruct and Llama-3.2-1B-Instruct
Full Changelog: v1.7.1...v1.8.0
v1.7.1: Dependency Updates
v1.7.1: Dependency Updates
A small patch release that brings all dependencies up to their latest major versions.
`toml` has been bumped from 0.8 to 1.1, picking up TOML spec 1.1 support. `thiserror` moved from 1.0 to 2.0 with no API changes needed on our side. `sha2` was already bumped to 0.11 in v1.7.0, but we had to fix a formatting issue where the new digest output type no longer implements `LowerHex` directly. The same fix was applied in both the CLI and `cyber_tools` hash functions.
All CI action dependencies were also updated: actions/checkout to v6, actions/upload-artifact to v7, actions/download-artifact to v8, actions/setup-python to v6, and release-drafter/release-drafter to v7.
282 tests passing, clean clippy.
Full Changelog: v1.7.0...v1.7.1
v1.7.0: Live Inference Fix
v1.7.0: Live Inference Fix
This release ships 16 improvements found during a thorough live-mode testing audit of v1.6.0 across CPU, NPU, and GPU backends.
Correctness
The KV-cache decode loop had an off-by-one error that could cause the attention mask length to drift during multi-token generation on the ONNX Vitis backend. That's fixed now. Model parameter estimation also wasn't accounting for external .onnx_data files, which meant some models were being classified a tier lower than they should have been. The CLI crate now properly forwards the directml, cuda, tensorrt, and qnn feature flags to inference_bridge, so building with those features actually works.
Dry-Run and Usability
Dry-run mode got a significant overhaul. The old keyword-matching approach for picking response templates has been replaced with a scored template routing system that picks the best match from 10 built-in templates. Dry-run also used to repeat the same tool in every iteration instead of rotating through the full template, which made multi-tool investigations look broken. That's fixed.
For live inference, the agent now extracts chain-of-thought reasoning from the LLM output before looking for tool-call tags, which means the model's reasoning is preserved in the turn history instead of being silently discarded. There's also a new stderr warning when the detected model is too small for its capability tier, so you'll know early if your 0.5B model is being asked to run a Moderate-tier ReAct loop.
Investigation Quality
Confidence scores now get a corroboration boost when multiple tools independently report related findings. Instead of each finding's confidence being purely formula-driven, findings backed by 2+ tools get a small bump, making the scores more reflective of actual evidence strength.
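The corroboration bump can be pictured as a small post-processing step applied on top of the formula-driven score. A sketch under assumed numbers: only the 2+-tool threshold comes from the release notes, while the 0.1 bump size and the 1.0 cap are illustrative choices.

```rust
/// Apply a small corroboration bump when a finding is independently backed
/// by two or more tools, capping the result at 1.0.
/// The bump size (0.1) is an assumption, not the project's actual constant.
fn corroborated_confidence(base: f64, supporting_tools: usize) -> f64 {
    if supporting_tools >= 2 {
        (base + 0.1).min(1.0)
    } else {
        base
    }
}
```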
The Basic-tier deterministic summary is now task-aware, so instead of generic "2 findings detected" output you'll see something like `Task "windows-triage" produced 3 findings across 2 tool(s)`.
Expanded Security Checks
The privilege escalation tool now checks 9 Windows token privileges (up from 4), queries for the AlwaysInstallElevated registry key, and scans for unquoted service paths. On Linux it also picks up setuid, setgid, and (root) indicators from sudo output.
Persistence scanning expanded significantly. On Windows it now covers RunOnce, Winlogon, Image File Execution Options, and AppInit_DLLs registry keys. On Linux it checks additional cron directories, /etc/xdg/autostart, user-level systemd units, and user crontab spools. The suspicious entry detection also got more markers including mshta, regsvr32, certutil, bitsadmin, and others commonly abused for persistence.
Tooling and Discovery
Tokenizer auto-discovery now searches the grandparent directory of the model file, which handles the common HuggingFace layout where model.onnx lives inside an onnx/ subdirectory.
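The grandparent search amounts to a short upward walk from the model file. This is an illustrative helper, not the project's actual discovery code:

```rust
use std::path::{Path, PathBuf};

/// Look for tokenizer.json next to the model, then one and two directories
/// up. The two-level walk covers the common HuggingFace layout where
/// model.onnx sits inside an onnx/ subdirectory of the model repo.
fn find_tokenizer(model_path: &Path) -> Option<PathBuf> {
    let mut dir = model_path.parent();
    for _ in 0..3 {
        let d = dir?;
        let candidate = d.join("tokenizer.json");
        if candidate.is_file() {
            return Some(candidate);
        }
        dir = d.parent();
    }
    None
}
```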
The --models-list command now auto-discovers .onnx files in the ./models directory that aren't already referenced by a configured profile, so local models show up without needing manual config entries.
Model download checksum verification was always failing because the manifest contained placeholder SHA-256 strings. Downloads now detect placeholder checksums and report the actual hash instead of a misleading mismatch error.
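Placeholder detection could be as simple as rejecting strings that don't look like real SHA-256 hex. The rule below is an assumption about what counts as a placeholder, not the manifest's actual convention:

```rust
/// Treat a manifest checksum as a placeholder if it is not 64 hex characters,
/// or if it is a single repeated character (e.g. 64 zeros).
/// This heuristic is an assumption for illustration.
fn is_placeholder_checksum(s: &str) -> bool {
    s.len() != 64
        || !s.chars().all(|c| c.is_ascii_hexdigit())
        || s.chars().all(|c| c == s.chars().next().unwrap())
}
```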
Each tool execution now records elapsed_ms on its AgentTurn, giving you per-tool timing visibility in the run report output.
Resolved Issues
#114, #115, #116, #117, #118, #119, #120, #121, #122, #123, #124, #125, #126, #127, #128, #129
Full Changelog: v1.6.0...v1.7.0
v1.6.0: Agentic Investigation Engine
v1.6.0: Agentic Investigation Engine
Highlights
WraithRun's agent can now reason iteratively about investigations using a full ReAct (Reason + Act) loop. Moderate and Strong-tier investigations call tools dynamically based on LLM reasoning rather than following a fixed template, producing deeper and more relevant findings.
Features
ReAct Agent Loop (#92)
- Moderate/Strong investigation tiers now use an LLM-guided ReAct loop with dynamic tool dispatch
- The agent reasons about which tool to call next based on observations so far
- Basic tier retains fast template-driven execution for simple tasks
- Automatic fallback to template synthesis if the LLM exhausts its step budget
Task-aware LLM Synthesis (#93)
- Synthesis prompts now include the verbatim investigation task for better context
- Structured output sections (Summary, Key Findings, Risk Assessment, Recommendations)
- Evidence budget increased from 1,500 to 3,000 chars per observation
Temperature-scaled Sampling (#66)
- Configurable temperature parameter for LLM token generation
- Softmax probability sampling when temperature > 0; greedy decoding when temperature ≤ 0
- Enables creative vs. deterministic output control per investigation
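The two decoding modes can be sketched as follows; `sample_token` and the explicit uniform draw are illustrative, not the project's API:

```rust
/// Temperature-scaled softmax over raw logits, computed with the usual
/// max-subtraction trick for numerical stability. Assumes temperature > 0.
fn softmax(logits: &[f32], temperature: f32) -> Vec<f32> {
    let scaled: Vec<f32> = logits.iter().map(|&l| l / temperature).collect();
    let max = scaled.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = scaled.iter().map(|&l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Greedy argmax when temperature <= 0, otherwise inverse-CDF sampling
/// from the softmax distribution using a uniform [0,1) draw.
fn sample_token(logits: &[f32], temperature: f32, uniform_draw: f32) -> usize {
    if temperature <= 0.0 {
        // Greedy: deterministically pick the highest-logit token.
        return logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
            .map(|(i, _)| i)
            .unwrap();
    }
    let probs = softmax(logits, temperature);
    let mut cum = 0.0;
    for (i, p) in probs.iter().enumerate() {
        cum += p;
        if uniform_draw < cum {
            return i;
        }
    }
    probs.len() - 1 // guard against floating-point rounding at the tail
}
```

Higher temperatures flatten the distribution (more creative output); temperatures near zero concentrate mass on the top token, converging to greedy decoding.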
EP-aware Debug Logs (#67)
- All inference debug messages now include the active execution provider (DirectML, CoreML, CUDA, TensorRT, QNN, CPU)
- Replaces hardcoded "Vitis" labels with runtime-detected EP names
ONNX Session Caching (#64)
- `SessionCache` struct lazily initializes and reuses the ONNX session and tokenizer across investigation steps
- Eliminates per-step session rebuild overhead for multi-step investigations
KV-cache Prefix Reuse (#65)
- Prefix detection framework compares current prompt tokens against previous invocation
- Hit/miss metrics tracked per session for observability
- Scaffolded for full KV-state reuse pending upstream DynValue clonability
Model Pack Download (#94)
- `--model-download <NAME>` CLI command with curated model manifest
- `--model-download list` shows available packs (tinyllama-1.1b-chat, phi-2-2.7b, qwen2-0.5b)
- SHA-256 checksum verification after download; skips if model already present
Testing
- 281 tests passing (up from 274 in v1.5.0)
- New tests: ReAct parsing, tier dispatch, prompt formatting, unknown tool handling
What's Changed
- feat: complete v1.3.0: doctor diagnostics, --backend flag, conformance tests by @Shreyas582 in #111
- feat: v1.5.0: Concrete Hardware Backends, Model Formats, Quantization Awareness by @Shreyas582 in #112
- feat: v1.6.0: Agentic Investigation Engine by @Shreyas582 in #113
Full Changelog: v1.3.1...v1.6.0
v1.5.0: Concrete Hardware Backends, Model Formats, Quantization Awareness
v1.5.0: Concrete Hardware Backends, Model Formats, Quantization Awareness
This release ships six concrete hardware backend implementations and foundational support for multi-format models and quantization awareness, completing the v1.5.0 Concrete Hardware Backends milestone.
New Backends (Feature-Gated)
All backends implement ExecutionProviderBackend with runtime availability probing, diagnostics, config keys, and dry-run session support.
| Backend | Feature Flag | Priority | Platform |
|---|---|---|---|
| DirectML | `directml` | 100 | Windows (any DX12 GPU) |
| CoreML | `coreml` | 100 | macOS / Apple Silicon |
| CUDA | `cuda` | 200 | Linux/Windows (NVIDIA GPU) |
| TensorRT | `tensorrt` | 250 | Linux/Windows (NVIDIA + TRT SDK) |
| QNN | `qnn` | 280 | Windows ARM64 (Snapdragon X) |
Enable one or more at build time:
cargo build --features directml # Windows GPU
cargo build --features cuda # NVIDIA GPU
cargo build --features tensorrt # NVIDIA + TensorRT

Model Format Support (#60)
- `ModelFormat` enum: `Onnx`, `Gguf`, `SafeTensors`
- `ModelFormat::from_path()` auto-detects format from file extension
- `ExecutionProviderBackend::supported_formats()` trait method (default: `[Onnx]`)
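Extension-based detection like `ModelFormat::from_path()` can be sketched as follows; the exact mapping, case handling, and `Option` return type are assumptions about the real API:

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ModelFormat {
    Onnx,
    Gguf,
    SafeTensors,
}

impl ModelFormat {
    /// Detect the model format from the file extension,
    /// comparing case-insensitively.
    fn from_path(path: &Path) -> Option<ModelFormat> {
        let ext = path.extension()?.to_str()?.to_ascii_lowercase();
        match ext.as_str() {
            "onnx" => Some(ModelFormat::Onnx),
            "gguf" => Some(ModelFormat::Gguf),
            "safetensors" => Some(ModelFormat::SafeTensors),
            _ => None,
        }
    }
}
```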
Quantization Awareness (#61)
- `QuantFormat` enum: `Fp32`, `Fp16`, `Int8`, `Int4`, `BlockQuantized(name)`, `Unknown`
- `QuantFormat::detect_from_path()` infers quantization from filename conventions
- `ExecutionProviderBackend::supported_quant_formats()`: each backend declares its efficient formats
- NPU backends (Vitis, QNN) support INT8/INT4; GPU backends support FP16/FP32/INT8
Testing
- 274 tests passing (up from 259)
- Backend conformance macro expanded with format/quant coverage tests
- All new backends have cfg-gated conformance suites ready
Closes
Closes #55, closes #56, closes #57, closes #58, closes #59, closes #60, closes #61
Full Changelog: v1.4.0...v1.5.0
What's Changed
- feat: complete v1.3.0: doctor diagnostics, --backend flag, conformance tests by @Shreyas582 in #111
- feat: v1.5.0: Concrete Hardware Backends, Model Formats, Quantization Awareness by @Shreyas582 in #112
- feat: v1.6.0: Agentic Investigation Engine by @Shreyas582 in #113
Full Changelog: v1.3.1...v1.5.0
v1.4.0: Doctor Diagnostics, Backend Selection, Conformance Tests
v1.4.0: Complete v1.3.0 Milestone: Doctor Diagnostics, Backend Selection, Conformance Tests
This release completes the v1.3.0 Multi-Backend Inference Abstraction milestone by shipping the final three issues: provider-aware doctor diagnostics, CLI backend selection, and a multi-backend conformance test harness.
New Features
Provider-Aware Doctor Diagnostics (#52)
wraithrun doctor now enumerates all registered inference backends and reports:
- Backend name, priority, and availability status
- Per-backend diagnostic entries (check name, status, detail message)
- Structured `backends` array in the JSON doctor report
CLI --backend Flag and Auto-Select (#53)
Choose your inference backend explicitly or let the engine pick:
- `--backend <NAME>` CLI flag
- `WRAITHRUN_BACKEND` environment variable
- `[inference] backend = "..."` in TOML config
- `"auto"` (default) selects the highest-priority available backend
- Helpful error messages list available backends if an invalid name is given
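Assuming the conventional precedence of CLI flag over environment variable over config file, the resolution chain might look like this (function and parameter names are illustrative):

```rust
/// Resolve the backend name from the three configuration sources,
/// falling back to "auto" when none is set. The flag > env > config
/// precedence is an assumption, not confirmed by the release notes.
fn resolve_backend(
    cli_flag: Option<&str>,
    env_var: Option<&str>,
    config_value: Option<&str>,
) -> String {
    cli_flag
        .or(env_var)
        .or(config_value)
        .unwrap_or("auto")
        .to_string()
}
```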
Integration Test Harness (#54)
A `backend_contract_tests!` macro generates 9 contract tests per backend:
- Name, priority, availability, config keys, diagnostics, dry-run session
- 5 registry-level tests verify discovery, ordering, and fallback behavior
- CPU conformance always runs; Vitis conformance is feature-gated behind `vitis`
- 14 new tests bringing the total to 259 passing tests
Other Changes
- `RunReport` now includes an optional `backend` field recording which backend was used
- `run-report.schema.json` and `doctor-introspection.schema.json` updated with new fields
- `wraithrun.example.toml` includes a new `[inference]` section
Milestone
Closes the v1.3.0 Multi-Backend Inference Abstraction milestone.
Full Changelog: v1.3.1...v1.4.0
What's Changed
- feat: complete v1.3.0: doctor diagnostics, --backend flag, conformance tests by @Shreyas582 in #111
- feat: v1.5.0: Concrete Hardware Backends, Model Formats, Quantization Awareness by @Shreyas582 in #112
- feat: v1.6.0: Agentic Investigation Engine by @Shreyas582 in #113
Full Changelog: v1.3.1...v1.4.0
v1.3.1
v1.3.1: Provider-Agnostic ModelConfig
This release refactors ModelConfig to use generic backend configuration fields, decoupling inference configuration from any specific hardware provider. This is the next step in the v1.3.0 Multi-Backend Inference Abstraction milestone.
What changed
Provider-agnostic ModelConfig (#49)
- `ModelConfig.vitis_config: Option<VitisEpConfig>` replaced with:
  - `backend_override: Option<String>`: optional backend name hint (e.g. `"vitis"`)
  - `backend_config: HashMap<String, String>`: generic key-value config map
- Both fields use `#[serde(default)]` for backward-compatible deserialization
- `VitisEpConfig` retained as a CLI-level helper with `into_backend_config()` / `from_backend_config()` conversion methods
Vitis EP reads generic config (#50)
- `discover_ort_dylib_path()`, `build_base_session_builder_with_provider()`, and `build_session_with_vitis_cascade()` now read from the generic `backend_config` map instead of the Vitis-specific struct
CPU EP decoupled from Vitis types (#51)
- All non-Vitis callers (`CpuBackend`, API server, tests) use `backend_override: None, backend_config: Default::default()`
- Zero coupling to Vitis-specific types
Testing
245 tests passing across all 5 crates.
Migration
See upgrade notes for before/after code examples.
v1.3.0
v1.3.0: CI/CD Pipeline Integration & Multi-Backend Inference Foundation
This release closes the v1.2.0 milestone (#103) and begins the v1.3.0 Multi-Backend Inference Abstraction work (#47, #48).
CI/CD Pipeline Integration (#103)
- GitHub composite Action (`action.yml`): first-party `Shreyas582/wraithrun-action@v1` with version resolution, binary caching (GitHub Actions cache), cross-platform install (Linux/macOS/Windows), scan execution, and JSON finding extraction via python3.
- Example workflow (`.github/workflows/wraithrun-scan.example.yml`): push/PR/schedule triggers, artifact upload, GitHub step summary.
- GitLab CI template (`ci-templates/gitlab-ci.yml`): ubuntu:22.04-based, configurable via CI variables.
- Generic shell script (`ci-templates/wraithrun-scan.sh`): environment-variable driven for Jenkins, CircleCI, and other platforms.
- CI integration guide (`docs/ci-integration.md`): step-by-step docs covering all platforms, exit code policy, output formats, scheduled scanning, and interpreting results.
ExecutionProviderBackend Trait (#47)
New inference_bridge::backend module introducing:
- `ExecutionProviderBackend` trait: `name()`, `is_available()`, `priority()`, `config_keys()`, `diagnose()`, `build_session()`
- `InferenceSession` trait for provider-created inference sessions
- `DiagnosticEntry` / `DiagnosticSeverity` types for doctor integration
- `BackendOptions` type alias for provider-specific config passthrough
Built-in implementations:
- `CpuBackend`: always available, priority 0, dry-run + ONNX CPU support
- `VitisBackend` (cfg-gated `vitis` feature): AMD Vitis AI NPU, priority 300, environment-based availability detection
Provider Registry (#48)
- `ProviderRegistry` with `discover()`, `best_available()`, `get()`, `list()`, `available_names()`
- `build_session_with_fallback()`: tries preferred backend first, then cascades by descending priority
- `ProviderInfo` struct for backend metadata listing
Infrastructure
- 12 new unit tests (245 total, all passing)
- CHANGELOG and upgrade notes updated
- Version bump to 1.3.0
Full Changelog: v1.2.0...v1.3.0
WraithRun v1.2.0
WraithRun v1.2.0
Dashboard UX Overhaul (#99)
- 5-tab layout (Runs, Findings, Cases, Compare, Health) for an organized investigation workflow
- SVG donut severity charts per-run for at-a-glance risk assessment
- Clickable evidence chains with toggle visibility for finding details
- Run comparison diff view showing new, resolved, and changed-severity findings
- JSON/CSV export for both aggregate findings and per-run data
- Real-time progress spinners for in-progress runs
- Cases tab with case list and detail panel
Tool Plugin API (#102)
Extend WraithRun with external tool plugins via `tool.toml` manifests and subprocess JSON I/O.
- `--tools-dir` and `--allowed-plugins` CLI flags for plugin discovery
- Automatic platform filtering, sandbox policy enforcement, and timeout support
- Plugin tools visible in `--doctor` output and `/api/v1/runtime/status` endpoint
- Example plugin: `examples/tools/hello_world/`
- Full documentation: `docs/plugin-api.md`
Security Professional Documentation (#100)
- 4 investigation playbooks: SSH key compromise, Windows triage, credential leak audit, persistence sweep
- MITRE ATT&CK mapping for all 8 built-in tools with tactic coverage analysis
- Threat model with attack surface, trust boundaries, and security controls
- 2 sample investigation reports (anonymized Linux persistence and Windows triage)
Other
- Added `io-util` feature to workspace tokio dependency for plugin subprocess I/O
- 233 tests passing across all crates
Full Changelog: v1.1.0...v1.2.0
What's Changed
- feat: Dashboard UX, Security Docs, Tool Plugin API (v1.2.0) by @Shreyas582 in #107
- chore: bump version to 1.2.0 by @Shreyas582 in #108
Full Changelog: v1.0.0...v1.2.0
v1.1.0: Professional Workflow Depth
v1.1.0: Professional Workflow Depth
Building on v1.0.0's Local API and Web UI foundation, this release adds three features that bring WraithRun closer to professional security workflow integration.
What's New
Structured JSON Audit Logging (#98)
Every API action is now logged as a structured JSON event, including authentication attempts, run lifecycle events, case operations, and server lifecycle. Events are written to a JSON lines file and held in a ring buffer for real-time querying.
- 12 event types: `AuthSuccess`, `AuthFailure`, `RunCreated`, `RunCompleted`, `RunFailed`, `RunCancelled`, `CaseCreated`, `CaseUpdated`, `ToolExecuted`, `ToolPolicyDenied`, `ServerStarted`, `ServerStopped`
- CLI flag: `--audit-log <PATH>` enables a file-based audit trail
- API endpoint: `GET /api/v1/audit/events?limit=N` returns recent events
Case Management API (#97)
Group related investigation runs under cases for tracking and organization.
- Create cases: `POST /api/v1/cases` with title and optional description
- List cases: `GET /api/v1/cases` with run count aggregates
- View case: `GET /api/v1/cases/{id}` with linked run statistics
- Update case: `PATCH /api/v1/cases/{id}` to change title, description, or status (open → investigating → closed)
- List case runs: `GET /api/v1/cases/{id}/runs`
- Link runs to cases: pass `case_id` in `POST /api/v1/runs` request body
- SQLite schema v2 migration runs automatically on existing databases
Evidence-Backed Narrative Report Format (#96)
New `--format narrative` output produces analyst-ready investigation reports with structured sections:
- Executive Summary: task, case reference, finding count, max severity, duration
- Risk Assessment: severity distribution table (Critical/High/Medium/Low/Info)
- Investigation Timeline: step-by-step tool execution log with observation summaries
- Detailed Findings: each finding with confidence level, evidence chain, and recommended action
- Supplementary Findings: lower-relevance observations
- Conclusion: final analysis answer
- Metadata: model tier, inference mode, live metrics
Testing
228 tests pass across all crates (23 API server, 66 core engine, 15 inference bridge, 79 CLI, 45 integration).
Upgrade Notes
- Existing SQLite databases will be automatically migrated to schema v2 (adds `cases` table and `case_id` column on `runs`)
- No breaking API changes. All v1.0.0 endpoints continue to work unchanged
- New `narrative` format option available alongside existing `json`, `summary`, and `markdown` formats