fix: P0+P1 hotfix batch — KV-cache crash, default features, allowlist, syslog, synthesis, tiering #181

Merged
Shreyas582 merged 10 commits into main from fix/v1.8.1-v1.9.0-hotfix-batch
Apr 7, 2026

@Shreyas582
Owner

P0+P1 Hotfix Batch — 6 issues across v1.8.1 and v1.9.0

P0 Critical (v1.8.1 — must ship ASAP)

#147 — KV-cache attention mask off-by-one crashes all live inference

  • Removed erroneous +1 from decode-step attention_len in both run_prompt and run_prompt_cached
  • Hoisted initial_cache_len so the decode loop accounts for prior cache padding
  • Affects: inference_bridge/src/onnx_vitis.rs
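
To illustrate the length bookkeeping behind this fix, here is a minimal sketch, not the code from onnx_vitis.rs. It assumes each decode step feeds exactly one new token and that the attention mask must cover every cached position plus that token. `decode_attention_lens` and its parameters are hypothetical names.

```rust
/// Hypothetical helper: returns the attention mask length used at each decode step.
/// Assumption: after prefill the KV cache holds `initial_cache_len` padded slots
/// plus `prompt_len` prompt tokens, and each decode step appends one token.
fn decode_attention_lens(initial_cache_len: usize, prompt_len: usize, steps: usize) -> Vec<usize> {
    // Cache length after prefill, including any prior cache padding.
    let mut past_len = initial_cache_len + prompt_len;
    let mut lens = Vec::with_capacity(steps);
    for _ in 0..steps {
        // One new token per step: mask length is past + 1, with no extra +1 on top.
        let attention_len = past_len + 1;
        lens.push(attention_len);
        // The token just processed is now part of the cache.
        past_len = attention_len;
    }
    lens
}
```

With 2 padded cache slots and a 3-token prompt, three decode steps see mask lengths 6, 7, and 8. Under these assumptions, any additional +1 would make the mask one position longer than the cache plus the new token, which is the kind of shape mismatch #147 describes.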

#149 — Default build has no inference features — --live silently falls back to dry-run

  • Changed cli/Cargo.toml default features from [] to ["onnx"] so cargo install works out of the box
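  • Added a bail! (selected at compile time by the enabled features) so that --live fails with an explicit error on a no-inference build instead of silently falling back to dry-run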
  • Affects: cli/Cargo.toml, cli/src/main.rs
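
A rough sketch of the guard, assuming the check reduces to "was any inference feature compiled in". The real change uses bail! in cli/src/main.rs; this version uses only std, and `ensure_live_supported` is a hypothetical name.

```rust
/// Hypothetical guard. In a real crate `inference_compiled_in` would typically be
/// computed with `cfg!(feature = "onnx")`, so the answer is fixed at compile time.
fn ensure_live_supported(live: bool, inference_compiled_in: bool) -> Result<(), String> {
    if live && !inference_compiled_in {
        // Fail loudly instead of silently degrading to dry-run.
        return Err(
            "--live requires a build with an inference feature (for example `onnx`)".to_string(),
        );
    }
    Ok(())
}
```

Taking the flag as a parameter keeps the sketch testable without toggling Cargo features.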

P1 Findings Quality (v1.9.0)

#151: sc and wmic missing from Windows command allowlist

  • Added sc, wmic, schtasks to WINDOWS_COMMAND_ALLOWLIST
  • Unblocks priv-esc-review and scheduled task enumeration on Windows
  • Affects: cyber_tools/src/lib.rs
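
For readers unfamiliar with the allowlist, a minimal sketch of how such a check might look. The list below contains only the three entries named in this PR, and the matching rules (case-insensitive, optional .exe suffix) are assumptions rather than the behavior of cyber_tools.

```rust
/// Illustrative subset only: the real WINDOWS_COMMAND_ALLOWLIST has more entries.
const WINDOWS_COMMAND_ALLOWLIST: &[&str] = &["sc", "wmic", "schtasks"];

/// Hypothetical check: looks only at the program name, ignoring its arguments.
fn is_allowed_windows_command(command_line: &str) -> bool {
    let Some(program) = command_line.split_whitespace().next() else {
        return false; // empty command line
    };
    // Normalise case and drop an optional ".exe" suffix before comparing.
    let lower = program.to_ascii_lowercase();
    let name = lower.strip_suffix(".exe").unwrap_or(&lower);
    WINDOWS_COMMAND_ALLOWLIST.contains(&name)
}
```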

#153: read_syslog dry-run defaults to README.md, which produces bogus findings

  • Replaced ./README.md fallback with platform-appropriate log paths (/var/log/syslog on Linux, System.evtx on Windows)
  • Fixed check_tool_precondition to use the same platform path
  • Affects: inference_bridge/src/lib.rs, core_engine/src/agent.rs
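
A sketch of the "single source of truth" idea, assuming the default path and the precondition check should share one helper. The function names are hypothetical, every non-Windows target is treated like Linux, and the PR gives only the file name System.evtx, so no full Windows path is assumed here.

```rust
/// Hypothetical helper: platform default for the syslog tool.
fn default_syslog_path_for(is_windows: bool) -> &'static str {
    if is_windows {
        "System.evtx" // full location is not stated in the PR
    } else {
        "/var/log/syslog"
    }
}

/// Resolves the default for the platform this binary was compiled for.
fn default_syslog_path() -> &'static str {
    default_syslog_path_for(cfg!(target_os = "windows"))
}

/// Hypothetical stand-in for the precondition check: it resolves the target
/// through the same helper, so dry-run and precondition cannot drift apart.
fn syslog_precondition_target(requested: Option<&str>) -> String {
    requested.unwrap_or(default_syslog_path()).to_string()
}
```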

#155 — Basic tier final_answer is mechanical concatenation, not actionable synthesis

  • Rewrote basic_tier_summary_for_task to group findings by severity, add cross-reference hints, deduplicate actions, and present prioritized recommendations
  • Affects: core_engine/src/lib.rs
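
The grouping and deduplication can be sketched roughly as follows. This is not basic_tier_summary_for_task itself: the `Finding` type, the output format, and the function name are assumptions, and cross-reference hints are left out for brevity.

```rust
use std::collections::HashSet;

// Declaration order doubles as sort order: High sorts before Medium before Low.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity { High, Medium, Low }

struct Finding { severity: Severity, title: String, action: String }

fn finding(severity: Severity, title: &str, action: &str) -> Finding {
    Finding { severity, title: title.to_string(), action: action.to_string() }
}

/// Hypothetical summary: findings grouped by severity, then deduplicated actions.
fn basic_summary_sketch(findings: &[Finding]) -> String {
    let mut sorted: Vec<&Finding> = findings.iter().collect();
    // Stable sort: input order is preserved within each severity group.
    sorted.sort_by_key(|f| f.severity);

    let mut out = String::new();
    let mut current: Option<Severity> = None;
    for f in &sorted {
        if current != Some(f.severity) {
            out.push_str(&format!("[{:?}]\n", f.severity));
            current = Some(f.severity);
        }
        out.push_str(&format!("- {}\n", f.title));
    }

    // Actions are listed once each, ordered by the most severe finding that needs them.
    out.push_str("Recommended actions:\n");
    let mut seen: HashSet<&str> = HashSet::new();
    let mut n = 0;
    for f in &sorted {
        if seen.insert(f.action.as_str()) {
            n += 1;
            out.push_str(&format!("{}. {}\n", n, f.action));
        }
    }
    out
}
```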

#157: PARAM_BASIC_CEILING_B=2.0 too high, so all common 1B models skip LLM

  • Lowered threshold from 2.0 → 1.0 so 1B+ models (Qwen2.5-0.5B, Llama-3.2-1B) reach Moderate tier and actually use inference
  • Affects: core_engine/src/lib.rs
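
A minimal sketch of the effect of the threshold change, assuming a model is Basic only when its parameter count in billions is strictly below the ceiling. The actual comparison in core_engine may differ, and `Tier` and `tier_for` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Tier { Basic, Moderate }

/// Hypothetical tiering rule: strictly below the ceiling means Basic (no LLM call).
fn tier_for(params_b: f64, basic_ceiling_b: f64) -> Tier {
    if params_b < basic_ceiling_b {
        Tier::Basic
    } else {
        Tier::Moderate
    }
}
```

Under this assumption a model reported as 1.0B is Basic with the old ceiling of 2.0 and Moderate with the new ceiling of 1.0.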

Testing

  • 120 unit tests passing (75 core_engine + 40 inference_bridge + 5 cyber_tools)
  • Updated 3 tests to match new behavior (summary format, tier threshold)
  • Compilation clean across all targets

Closes #147, closes #149, closes #151, closes #153, closes #155, closes #157

…, syslog, synthesis, tiering

Fixes #147: KV-cache attention mask off-by-one crashes all live inference
  - Remove erroneous +1 from decode-step attention_len in both run_prompt
    and run_prompt_cached; hoist initial_cache_len so decode loop can
    account for prior cache padding.

Fixes #149: Default build has no inference features
  - Change cli/Cargo.toml default features from [] to [onnx] so cargo
    install produces a working --live binary out of the box.
  - Add compile-time bail when --live is used on a no-inference build.

Fixes #151: sc/wmic missing from Windows command allowlist
  - Add sc, wmic, schtasks to WINDOWS_COMMAND_ALLOWLIST so priv-esc-review
    and scheduled-task enumeration work on Windows.

Fixes #153: read_syslog dry-run defaults to README.md
  - Replace ./README.md fallback with platform-appropriate log path
    (/var/log/syslog on Linux, System.evtx on Windows).
  - Fix check_tool_precondition to use the same platform path.

Fixes #155: Basic tier final_answer is mechanical concatenation
  - Rewrite basic_tier_summary_for_task to group findings by severity,
    add cross-reference hints, deduplicate actions, and present
    prioritized recommendations.

Fixes #157: PARAM_BASIC_CEILING_B=2.0 too high
  - Lower threshold from 2.0 to 1.0 so 1B+ models (Qwen2.5-0.5B,
    Llama-3.2-1B) reach Moderate tier and actually use inference.

The stdin-integration job compiles the full CLI binary from scratch
without any cargo caching, causing it to routinely exceed the 20-minute
timeout on GitHub Actions runners. Add Swatinem/rust-cache@v2 (matching
the cross-platform job) and bump timeout to 30 minutes.
… checks

When the onnx feature is enabled (now the default), build_session() could
hang indefinitely in two scenarios:
1. onnxruntime DLL not found: Session::builder() blocks on dynamic loader
2. Corrupt/stub model file: commit_from_file() never returns

Added ensure_ort_dylib_available() to bail early when the runtime library
cannot be located on PATH or via ORT_DYLIB_PATH, and
validate_model_preamble() to reject files that don't start with a valid
protobuf field tag before reaching the ONNX Runtime.
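
As an illustration of the preamble check, here is a sketch that inspects only the first byte of the file. It is not the validate_model_preamble added in this PR; the function names and the exact acceptance rule are assumptions. A protobuf tag packs the wire type into its low three bits and the field number into the bits above them.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical check: does the buffer start with a plausible protobuf field tag?
fn starts_with_protobuf_tag(bytes: &[u8]) -> bool {
    let Some(&first) = bytes.first() else {
        return false; // empty file
    };
    let wire_type = first & 0x07;
    let field_low_bits = (first >> 3) & 0x0F;
    let tag_continues = (first & 0x80) != 0;
    // Accept varint (0), 64-bit (1), length-delimited (2) and 32-bit (5);
    // reject the deprecated group types (3, 4) and the undefined ones (6, 7).
    let wire_ok = matches!(wire_type, 0 | 1 | 2 | 5);
    // Field number 0 is invalid; if the tag continues, higher bits follow in later bytes.
    wire_ok && (tag_continues || field_low_bits != 0)
}

/// Reads the first byte of `path` and rejects files that cannot be protobuf.
fn check_model_preamble(path: &Path) -> io::Result<()> {
    let mut buf = [0u8; 1];
    let n = File::open(path)?.read(&mut buf)?;
    if starts_with_protobuf_tag(&buf[..n]) {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "model file does not start with a protobuf field tag",
        ))
    }
}
```

A check like this is cheap and catches text stubs and HTML error pages, but it cannot prove that a file is a valid ONNX model.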

Also fixes three clippy warnings newly exposed by the onnx default feature:
- dead_code on push_warn (cfg gate narrowed)
- too_many_arguments on run_prompt_shared_buffer (allow attribute)
- unnecessary_to_owned on suffix.to_vec() (removed)

The live-success-e2e job used PowerShell's echo/>> operator to append the
cargo bin directory to GITHUB_PATH. On PowerShell 5.1 the >> operator
writes UTF-16 LE. GitHub Actions expects UTF-8 in the GITHUB_PATH file,
so subsequent steps (rust-cache, cargo test) could not find cargo/rustc.

Fix: use Add-Content with -Encoding UTF8 and write unconditionally so
every run guarantees PATH propagation regardless of prior runner state.
…step

PowerShell 5.1's Add-Content -Encoding UTF8 writes a BOM prefix that
corrupts the GITHUB_PATH file. Switch to [IO.File]::AppendAllText which
writes BOM-free UTF-8.
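
To make the two encoding failures concrete, here is a small byte-level sketch, not part of the workflow fix. The classification heuristic and all names are assumptions; it only shows why a UTF-16 LE or BOM-prefixed entry differs from the plain UTF-8 line that GITHUB_PATH expects.

```rust
#[derive(Debug, PartialEq, Eq)]
enum PathFileEncoding { Utf8, Utf8WithBom, LikelyUtf16 }

/// Hypothetical heuristic for the bytes of a PATH entry file.
fn classify(bytes: &[u8]) -> Option<PathFileEncoding> {
    // UTF-8 byte order mark: EF BB BF.
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        return Some(PathFileEncoding::Utf8WithBom);
    }
    // ASCII text encoded as UTF-16 has a NUL byte in every code unit.
    if bytes.contains(&0) {
        return Some(PathFileEncoding::LikelyUtf16);
    }
    std::str::from_utf8(bytes).ok().map(|_| PathFileEncoding::Utf8)
}

/// Encodes a string as UTF-16 LE bytes, without a byte order mark.
fn utf16le_bytes(s: &str) -> Vec<u8> {
    s.encode_utf16().flat_map(|unit| unit.to_le_bytes()).collect()
}
```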

Also add an explicit cargo bin PATH fallback in the test step itself so
it works even if GITHUB_PATH propagation fails on the self-hosted runner.

Shreyas582 merged commit dc97996 into main on Apr 7, 2026
9 checks passed
Shreyas582 deleted the fix/v1.8.1-v1.9.0-hotfix-batch branch on April 7, 2026 21:26