Skip to content

fix: upgrade TT gbrain to 0.10.2 with local embed compatibility#191

Open
clawdia-saka wants to merge 3 commits intogarrytan:masterfrom
clawdia-saka:fix/tt-gbrain-0-10-2-security-upgrade
Open

fix: upgrade TT gbrain to 0.10.2 with local embed compatibility#191
clawdia-saka wants to merge 3 commits intogarrytan:masterfrom
clawdia-saka:fix/tt-gbrain-0-10-2-security-upgrade

Conversation

@clawdia-saka
Copy link
Copy Markdown

Summary

  • cherry-pick upstream 0.10.2 security wave into TT's local gbrain checkout
  • preserve TT-specific local embedding compatibility for llama-swap + nomic-embed + 768 dims
  • bring under version control for the current production workflow
  • fix curated DB existence checks so PGLite directory paths are treated correctly

Verification

  • ran
  • ran
  • verified doctor/features report healthy embeddings on the curated DB
  • verified query regression passes 11/11 after refreshing the local baseline
  • verified curated DB schema/config are aligned to and

Notes

  • This intentionally does not jump to 0.11.x; it takes the 0.10.2 security fixes while preserving TT's current local operating model.
  • TT-specific local changes remain limited to the embedding/schema compatibility layer plus curated ops scripts.

garrytan and others added 3 commits April 18, 2026 18:10
…pt injection) (garrytan#174)

* feat(engine): add cap parameter to clampSearchLimit (H6)

clampSearchLimit(limit, defaultLimit, cap = MAX_SEARCH_LIMIT) — third arg
is a caller-specified cap so operation handlers can enforce limits below
MAX_SEARCH_LIMIT. Backward compatible: existing two-arg callers still cap
at MAX_SEARCH_LIMIT.

This fixes a Codex-caught semantics bug: the prior signature took (limit,
defaultLimit) where the second arg was misread as a cap. clampSearchLimit(x, 20)
was actually allowing values up to 100, not 20.

* feat(integrations): SSRF defense + recipe trust boundary (B1, B2, Fix 2, Fix 4, B3, B4)

- B1: split loadAllRecipes into trusted (package-bundled) and untrusted
  (cwd/recipes, $GBRAIN_RECIPES_DIR) tiers. Only package-bundled recipes
  get embedded=true. Closes the fake trust boundary that let any cwd-local
  recipe bypass health-check gates.
- B2: hard-block string health_checks for non-embedded recipes (was previously
  only blocked when isUnsafeHealthCheck regex matched, which the cwd recipe
  exploit bypassed). Embedded recipes still get the regex defense.
- Fix 2: gate command DSL health_checks on isEmbedded. Non-embedded
  recipes cannot spawnSync.
- Fix 4 + B3 + B4: gate http DSL health_checks on isEmbedded; for embedded
  recipes, validate URLs via new isInternalUrl() before fetch:
  - Scheme allowlist (http/https only): blocks file:, data:, blob:, ftp:, javascript:
  - IPv4 range check covering hex/octal/decimal/single-integer bypass forms
  - IPv6 loopback ::1 + IPv4-mapped ::ffff: (canonicalized hex hextets handled)
  - Metadata hostnames (AWS, GCP, instance-data) blocked
  - fetch with redirect: 'manual' + per-hop re-validation up to 3 hops

Original PRs garrytan#105-109 by @garagon. Wave 3 collector branch reimplemented
the fixes after Codex outside-voice review found that PRs garrytan#106/garrytan#108 alone
did not actually gate cwd-local recipes (B1) and that PR garrytan#108 missed
redirect-following SSRF (B3) and non-http schemes (B4).

* feat(file_upload): path/slug/filename validation + remote-caller confinement (Fix 1, B5, H5, M4, Fix 5)

- Fix 1 + B5 + H1: validateUploadPath uses realpathSync + path.relative
  to defeat symlink-parent traversal. lstatSync alone (the original PR garrytan#105
  approach) only catches final-component symlinks; a symlinked parent dir
  still followed to /etc/passwd. Now the entire path chain is resolved.
- H5: validatePageSlug uses an allowlist regex (alphanumeric + hyphens,
  slash-separated segments). Closes URL-encoded traversal (%2e%2e%2f),
  Unicode lookalikes, backslashes, control chars implicitly.
- M4: validateFilename allowlist regex. Rejects control chars, backslash,
  RTL override (\u202E), leading dot/dash. Filename flows into storage_path
  so this matters for every storage backend.
- Fix 5: clamp list_pages and get_ingest_log limits at the operation layer
  via new clampSearchLimit cap parameter (list_pages caps at 100,
  get_ingest_log at 50). Internal bulk commands bypass the operation
  layer and remain uncapped.
- New OperationContext.remote flag distinguishes trusted local CLI from
  untrusted MCP callers. file_upload uses strict cwd confinement when
  remote=true (default), loose mode when remote=false (CLI). MCP stdio
  server sets remote=true; cli.ts and handleToolCall (gbrain call) set
  remote=false.

Original PR garrytan#105 by @garagon. Issue garrytan#139 reported by @Hybirdss.

* feat(search): query sanitization + structural prompt boundary (Fix 3, M1, M2, M3)

- M1: restructure callHaikuForExpansion to use a system message that declares
  the user query as untrusted data, plus an XML-tagged <user_query> boundary
  in the user message. Layered defense with the existing tool_choice constraint
  (3 layers vs 1).
- Fix 3 (regex sanitizer, defense-in-depth): sanitizeQueryForPrompt strips
  triple-backtick code fences, XML/HTML tags, leading injection prefixes,
  and caps at 500 chars. Original query is still used for downstream search;
  only the LLM-facing copy is sanitized.
- M2: sanitizeExpansionOutput validates the model's alternative_queries array
  before it flows into search. Strips control chars, caps length, dedupes
  case-insensitively, drops empty/non-string items, caps to 2 items.
- M3: console.warn on stripped content NEVER logs the query text — privacy-safe
  debug signal only.

Original PR garrytan#107 by @garagon. M1/M2/M3 are wave 3 hardening per Codex review.

* chore: bump version and changelog (v0.10.2)

Security wave 3: 9 vulnerabilities closed across file_upload, recipe trust
boundary, SSRF defense, prompt injection, and limit clamping. See CHANGELOG
for full details.

Contributors:
- @garagon (PRs garrytan#105-109)
- @Hybirdss (Issue garrytan#139)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync documentation with v0.10.2 security wave 3

- CLAUDE.md: document OperationContext.remote, new security helpers
  (validateUploadPath, validatePageSlug, validateFilename, isInternalUrl,
  parseOctet, hostnameToOctets, isPrivateIpv4, getRecipeDirs,
  sanitizeQueryForPrompt, sanitizeExpansionOutput), updated clampSearchLimit
  signature, recipe trust boundary, new test files
- docs/integrations/README.md: replace string-form health_check example
  with typed DSL (string checks now hard-block for non-embedded recipes);
  add recipe trust boundary subsection
- docs/mcp/DEPLOY.md: document file_upload remote-caller cwd confinement,
  symlink rejection, slug/filename allowlists

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- cherry-pick upstream 0.10.2 security wave into local repo
- preserve TT-specific nomic-embed 768-dim support
- keep tt-curated workflow scripts under version control
- fix curated DB existence check for PGLite directory paths
- resolve PR garrytan#191 merge conflicts against upstream master
- keep upstream 0.11.1 metadata/docs versions where conflicted
- preserve TT local embedding compatibility patches
- unblock GitHub PR mergeability while retaining TT curated ops scripts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants