Shamir Secret Sharing for HGT + multi-tenant shards — design exploration #5
SwiftWing21 started this conversation in Ideas
Filed by raude — Claude Code Opus 4.6 (1M context). Design-note discussion, not a ticket. Captures a future direction so it doesn't get lost in chat scrollback.
Origin
During the 2026-04-10 session the user flagged a BuildStuff conference talk by Tejas Chopra on Shamir Secret Sharing (SSS) — the same Tejas whose Headroom toolkit we just adopted in v0.3.0b5. The (joking) question was "does this help audio ingestion, and is it already in Tejas's tools?" The serious answer is no — Headroom is LLM compression, SSS is cryptography, and they don't meaningfully intersect the audio problem. But SSS does intersect helix-context in three other places that are worth writing down before we forget.
What SSS is (one-paragraph refresher)
Shamir Secret Sharing takes a secret S and splits it into N shares by evaluating a random polynomial of degree K-1 over a finite field, with S as its constant term, at N distinct points. Any K shares (the threshold) can reconstruct S via Lagrange interpolation; strictly fewer than K shares reveal zero information. This is information-theoretic security, not computational: even unbounded compute can't extract the secret from K-1 shares. There's no key, no algorithm attack surface, no post-quantum concern. Just math.
Where SSS would shine in helix-context
1. Privacy-preserving Horizontal Gene Transfer (hgt.py)
Today hgt.py exports full genes to .helix files and re-imports them on another Helix instance. Gene content is cleartext, which is fine for public knowledge but blocks any scenario where two parties want to share derived understanding without exposing the raw source. Examples:
SSS-backed HGT: instead of exporting Gene.content, export {gene_id, [share_1, share_2, share_3]} where each share goes to a different party. Any 2 of 3 parties can reconstruct the gene; no single party can. The biology metaphor lands well: actual biological HGT involves fragment transfer with reconstruction at the recipient.
This appears to be novel. I could not find a single paper or production system titled "federated RAG with information-theoretic secrecy across shards." Someone should build it; helix-context is positioned for it.
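The 2-of-3 split described above can be sketched in a few lines of Python. This is a minimal illustration, not helix-context code: the function names `split_secret` and `reconstruct_secret`, the choice of the prime field 2**521 - 1, and the sample payload are all assumptions made for the example. A real gene would need to be chunked (or the SSS applied to an encryption key instead), since each secret must fit in one field element.

```python
import secrets
from typing import List, Tuple

# 2**521 - 1 is a Mersenne prime; every secret must be an integer below it.
PRIME = 2**521 - 1

Share = Tuple[int, int]  # one (x, f(x)) point on the polynomial


def split_secret(secret: int, threshold: int, share_count: int) -> List[Share]:
    """Split `secret` into `share_count` shares; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= share_count:
        raise ValueError("need 1 <= threshold <= share_count")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, share_count + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct_secret(shares: List[Share]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical gene payload, short enough to fit in one field element.
gene_content = b"example gene content"
as_int = int.from_bytes(gene_content, "big")
shares = split_secret(as_int, threshold=2, share_count=3)
recovered = reconstruct_secret(shares[1:])  # any 2 of the 3 shares
assert recovered.to_bytes(len(gene_content), "big") == gene_content
```

Note that `reconstruct_secret` has no way to tell whether it was given enough shares: with fewer than the threshold it silently returns an unrelated field element, which is exactly the "K-1 reveals nothing" property seen from the caller's side.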
2. Multi-tenant enterprise genome isolation (v0.400 roadmap)
The ROADMAP.md already has multi-tenant SaaS as a v0.400 concern. SSS could back a "genome shard" architecture where:
This pairs naturally with the existing [genome] replicas config in helix.toml: today replicas are simple mirrors (every replica has the full DB), tomorrow they could be SSS shards with a configurable threshold and share_count.
3. Audit-logged gene unlocking with K-of-N approvers
For sensitive genes (medical records, legal filings, HR notes, board minutes), encrypt the gene content with a key, then SSS-split the key across N approvers. Reading the gene requires K of them to co-sign a reconstruction. Every reconstruction is logged: who provided their share, when, and what gene was unlocked.
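A rough sketch of that approval flow, under stated assumptions: `UnlockRequest`, `split_key`, `recover_key`, the approver names, and the log record fields are all hypothetical and not part of helix-context. The gene key is treated as a 256-bit integer, the actual encryption of the gene content is out of scope here, and each approver is assumed to hold a distinct share.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Tuple

PRIME = 2**521 - 1  # Mersenne prime, comfortably larger than a 256-bit key


def split_key(key: int, k: int, n: int) -> List[Tuple[int, int]]:
    """Shamir-split `key` so that any k of the n shares recover it."""
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, e, PRIME) for e, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]


def recover_key(shares: List[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


@dataclass
class UnlockRequest:
    """Collects approver shares for one gene and records who contributed."""
    gene_id: str
    threshold: int
    submitted: Dict[str, Tuple[int, int]] = field(default_factory=dict)
    audit_log: List[dict] = field(default_factory=list)

    def _log(self, event: str, approver: str = "") -> None:
        self.audit_log.append({
            "event": event, "gene_id": self.gene_id, "approver": approver,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def submit(self, approver: str, share: Tuple[int, int]) -> None:
        self.submitted[approver] = share
        self._log("share_submitted", approver)

    def unlock(self) -> int:
        """Return the gene key once enough distinct approvers have submitted."""
        if len(self.submitted) < self.threshold:
            self._log("unlock_denied")
            raise PermissionError("not enough approvers")
        key = recover_key(list(self.submitted.values()))
        self._log("unlock_granted")
        return key


# Usage: a 2-of-3 policy over three hypothetical approvers.
gene_key = secrets.randbits(256)
approvers = ["alice", "bob", "carol"]
shares = dict(zip(approvers, split_key(gene_key, k=2, n=3)))

request = UnlockRequest(gene_id="gene-123", threshold=2)
request.submit("alice", shares["alice"])
request.submit("carol", shares["carol"])
assert request.unlock() == gene_key
```

The audit log here is an in-memory list for illustration only; a real deployment would presumably want an append-only store that the approvers themselves cannot rewrite.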
This maps onto helix's existing chromatin state: add a new LOCKED state that requires K-of-N to demote back to OPEN for a query.
Where SSS does NOT help
N shares is actually larger than the secret. Use Headroom for compression, SSS for distribution.
K shares requires contacting K stores and doing Lagrange interpolation. Adds latency. Viable for cold-storage HETEROCHROMATIN genes, probably overkill for hot OPEN genes.
Engineering scope estimate (if this ever becomes a real ticket)
k and n configurable) OR adopt an existing library like pysss or secretsharing
helix_context/sss_bridge.py wrapper (lazy import, optional extra)
hgt.py export mode: --threshold K --shares N --parties file1,file2,file3
hgt.py import mode: reconstruct from K of N share files
K-1 reveals nothing / K reveals everything invariant
helix.toml (replace mirror mode with shard mode)
LOCKED chromatin state + K-of-N unlocking workflow + audit log
schemas.py, genome.py, HTTP layer
Total first meaningful slice (HGT export/import only): ~4 days. Not an MVP for v0.4.0; more like a v0.5.0 or v0.4.x stretch goal, depending on whether federated/multi-tenant work picks up.
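The proposed export flags from the scope list (--threshold, --shares, --parties) could be parsed and sanity-checked roughly as follows. This is only a sketch using the standard library's argparse: the function names, the help strings, and the rule that --parties must list exactly N files are assumptions for illustration, not the actual hgt.py interface.

```python
import argparse
from typing import List, Optional


def build_export_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical SSS-backed `hgt.py export` mode."""
    parser = argparse.ArgumentParser(prog="hgt.py export")
    parser.add_argument("--threshold", type=int, required=True, metavar="K",
                        help="number of shares needed to reconstruct a gene")
    parser.add_argument("--shares", type=int, required=True, metavar="N",
                        help="total number of shares to produce")
    parser.add_argument("--parties", type=lambda s: s.split(","), required=True,
                        metavar="FILE1,FILE2,...",
                        help="comma-separated output files, one per share")
    return parser


def parse_export_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the flags and reject combinations that cannot form a K-of-N split."""
    parser = build_export_parser()
    args = parser.parse_args(argv)
    if not 1 <= args.threshold <= args.shares:
        parser.error("--threshold must satisfy 1 <= K <= N")
    if len(args.parties) != args.shares:
        parser.error("--parties must list exactly N share files")
    return args


# Usage: a 2-of-3 export to three hypothetical share files.
args = parse_export_args(
    ["--threshold", "2", "--shares", "3", "--parties", "a.share,b.share,c.share"]
)
assert args.parties == ["a.share", "b.share", "c.share"]
```

Validating K against N at the CLI boundary keeps impossible policies (for example a threshold larger than the share count) from reaching the splitting code at all.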
Why file this as a design discussion, not an issue
The three use cases above are all "future direction" rather than "open bug." Discussion is the right venue for capturing the exploration without committing to a sprint. When the SaaS/multi-tenant work picks up, the search for "how do we do federated storage" will surface this, and the implementer won't have to re-derive the design from scratch.
Adjacent reading
pysss: pure-Python SSS implementation
A note on scope honesty
helix-context is a single-machine genome today. SSS is a multi-party primitive. Adopting SSS for its own sake on a single machine is wasteful (the whole secret lives in the same process anyway). SSS only pays off when there are multiple parties or multiple trust domains involved — federated fleets, multi-tenant SaaS, encrypted rest-state with audit-logged unlock. If helix-context never goes federated, this note stays filed as "explored but not needed." That's a valid outcome.
— raude, Claude Code Opus 4.6 (1M context)
Session: 2026-04-10 / helix-context v0.3.0b5 / research thread branching off audio ingestion