Human-like AI Memory ◦ 10Mn+ Token Processing ◦ Up to 4000 tok/s ◦ $0.10/Mn tokens
NOTE: This model is currently in closed alpha. To get access, reach out to us.
The human brain is a master at compression. It doesn't try to remember every passing detail; instead, it aggressively prunes noise to maintain a sharp, focused, and easily accessible recall of what truly matters. In contrast, traditional AI memory systems try to remember everything. They retrieve whatever is similar—but similar doesn't mean important. The result? Your AI drowns in stale, irrelevant context that degrades every response.
Inspired by how the human brain works, Neocortex takes a similar approach to AI memory: it intelligently forgets noise. Just as you don't remember every sentence you've ever read or everything that happens each day of your life, Neocortex lets low-value memories naturally decay while reinforcing the knowledge that matters: the things you interact with, recall, and build upon.
The result? An AI memory system that can chop through over 10 million tokens accurately at speeds of up to 4000 tokens/second, stays lean and focused, and gets smarter with every interaction.
Neocortex scores highly on RAGAS, Babilong, Vending Bench, LoCoMo, and HotPotQA.
Memories that aren't accessed naturally decay over time. Frequently recalled knowledge becomes more durable. No manual cleanup needed — the system stays lean on its own.
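To make the decay-and-reinforce idea concrete, here is a minimal sketch in Python. It is not Neocortex's actual implementation: the exponential half-life model, the `Memory` class, and the constants are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical constant: a memory that has never been recalled
# loses half its retention score every 7 days.
HALF_LIFE_DAYS = 7.0


@dataclass
class Memory:
    content: str
    strength: float = 1.0         # grows each time the memory is recalled
    last_access_day: float = 0.0  # day of the most recent access


def retention(memory: Memory, now_day: float) -> float:
    """Return a score in (0, 1]; it halves every `half_life` days without access."""
    elapsed = now_day - memory.last_access_day
    half_life = HALF_LIFE_DAYS * memory.strength
    return 0.5 ** (elapsed / half_life)


def recall(memory: Memory, now_day: float) -> None:
    """Recalling a memory resets its clock and makes it decay more slowly."""
    memory.strength += 1.0
    memory.last_access_day = now_day


def prune(memories: list[Memory], now_day: float, threshold: float = 0.1) -> list[Memory]:
    """Keep only memories whose retention is still at or above the threshold."""
    return [m for m in memories if retention(m, now_day) >= threshold]
```

In this sketch, a memory recalled on day 7 survives a prune on day 30, while one that was never accessed falls below the threshold and is dropped without any manual cleanup.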
Not all memories are equal. Views, reactions, replies, and content creation all signal what matters. Knowledge people engage with rises to the top; ignored information fades away.
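One way to picture interaction-weighted ranking is a weighted sum over interaction events. The sketch below is illustrative only: the weights, the event names, and both functions are hypothetical and do not come from the Neocortex SDK.

```python
from collections import Counter

# Hypothetical weights: more effortful interactions count for more.
INTERACTION_WEIGHTS = {
    "view": 1.0,
    "reaction": 2.0,
    "reply": 3.0,
    "create": 5.0,
}


def engagement_score(events: list[str]) -> float:
    """Sum the weights of all interactions recorded for one memory."""
    counts = Counter(events)
    return sum(INTERACTION_WEIGHTS.get(kind, 0.0) * n for kind, n in counts.items())


def rank_memories(interactions: dict[str, list[str]]) -> list[str]:
    """Return memory keys ordered from most to least engaged-with."""
    return sorted(
        interactions,
        key=lambda key: engagement_score(interactions[key]),
        reverse=True,
    )
```

A memory with a reply and a reaction outranks one with a single view, and a memory nobody touched lands at the bottom, where a decay rule could eventually remove it.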
There's no compromise between speed and quality when processing data with Neocortex. Everything is processed at low cost and low latency while maintaining high benchmark scores.
Standard RAG quality metrics evaluated using RAGAS. Neocortex leads in Answer Relevancy (0.97) and Context Precision (0.75), outperforming FastGraphRAG, Gemini VDB, Mem0, and SuperMemory.
Accuracy across ordering, state-at-time, recency, interval, and sequence questions. Neocortex achieves 100% on recency questions — correctly surfacing the most recent events thanks to its time-decay memory model.
An agent manages a simulated vending machine business over 30 days. Neocortex achieves the highest cumulative P&L (~$295 by day 30) — better memory leads to better decisions over time.
Neocortex ships with SDKs for Python, TypeScript/JavaScript, Go, Rust, Dart, C++, C#, and Java, plus plugins for LangGraph, OpenClaw, ElevenLabs, CrewAI, Raycast, Agno, Pipecat, Mastra, Autogen, and more.
See packages/README.md for details about all the SDKs/Plugins available to use along with documentation and examples.
Below is a simple quickstart example in Python.
pip install tinyhumansai

import tinyhumansai as api

client = api.TinyHumanMemoryClient("YOUR_APIKEY_HERE")

# Store a single memory
client.ingest_memory({
    "key": "user-preference-theme",
    "content": "User prefers dark mode",
    "namespace": "preferences",
    "metadata": {"source": "onboarding"},
})

# Ask an LLM something from the memory
response = client.recall_with_llm(
    prompt="What is the user's preference for theme?",
    api_key="OPENAI_API_KEY",
)
print(response.text)  # The user prefers dark mode



