Blog: Hindsight Is #1 on BEAM — the Benchmark That Tests Memory at 10M Tokens by benfrank241 · Pull Request #851 · vectorize-io/hindsight

benfrank241 · 2026-04-02T14:24:33Z

Docusaurus version of hindsight-marketing-content#88.

What's in this post

Why the 10M token tier is the most important BEAM result
Full score table across all tiers (Hindsight vs Honcho vs paper baselines)
What 10M tokens actually looks like in practice
Context rot section referencing Chroma research
AMB manifesto cross-links
Free/local and Cloud setup

Scores

Tier	Hindsight	Next-best
100K	73.4%	63.0%
500K	71.1%	64.9%
1M	73.9%	63.1%
10M	64.1%	40.6%
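
A quick way to read the table is as a per-tier lead in percentage points. The sketch below only redoes that subtraction on the four rows listed above. The names `scores` and `margins` are illustrative and do not come from the BEAM harness or the Hindsight codebase.

```python
# Scores copied from the table above: (Hindsight, next-best) in percent per tier.
scores = {
    "100K": (73.4, 63.0),
    "500K": (71.1, 64.9),
    "1M": (73.9, 63.1),
    "10M": (64.1, 40.6),
}


def margins(table):
    """Return Hindsight's lead over the next-best system, in percentage points."""
    # Round to one decimal to hide floating-point noise from the subtraction.
    return {tier: round(ours - other, 1) for tier, (ours, other) in table.items()}


gaps = margins(scores)
widest = max(gaps, key=gaps.get)  # tier with the largest lead

for tier, gap in gaps.items():
    print(f"{tier}: +{gap} points")
print(f"widest lead: {widest}")
```

On these numbers the lead is 10.4, 6.2, and 10.8 points at the three smaller tiers and 23.5 points at 10M, which is why the post leads with the 10M result.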

Files

hindsight-docs/blog/2026-04-02-beam-sota.md
hindsight-docs/static/img/blog/beam-benchmark-chart.png

benfrank241 and others added 3 commits April 2, 2026 10:20

Add BEAM SOTA blog post

deb1db9

Hindsight #1 on BEAM at 10M tokens — 64.1% vs 40.6% next-best. Includes full tier comparison table, context rot section, and AMB manifesto cross-links. Image is a placeholder pending final asset. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Scrub em-dashes from blog post body

1738ad6

Replaces em-dashes with contextually appropriate punctuation (commas/semicolons) in prose. Leaves title, description, heading, table cells, and code comment unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Update BEAM benchmark chart to final version with glow background

91e34a2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

benfrank241 merged commit 045e891 into main Apr 2, 2026
35 of 40 checks passed

benfrank241 merged 3 commits into main from blog/beam-sota
