Architecture
MiniAgent uses a Llama-style decoder-only transformer:
Input → Embedding → [RMSNorm → GQA+RoPE → SwiGLU] × N → RMSNorm → LM Head → Output
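A minimal sketch of one such pre-norm decoder block in PyTorch (module and attribute names here are illustrative assumptions, not necessarily the repo's actual ones):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square norm: no mean subtraction, no bias (cheaper than LayerNorm)."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.weight * x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

class DecoderBlock(nn.Module):
    """One [RMSNorm → GQA+RoPE → SwiGLU] block with pre-norm residual connections."""
    def __init__(self, dim: int, attn: nn.Module, mlp: nn.Module):
        super().__init__()
        self.attn_norm = RMSNorm(dim)
        self.attn = attn                       # grouped-query attention with RoPE
        self.mlp_norm = RMSNorm(dim)
        self.mlp = mlp                         # SwiGLU feed-forward

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.attn_norm(x))   # residual around attention
        x = x + self.mlp(self.mlp_norm(x))     # residual around the MLP
        return x
```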
| Model | Params | Hidden dim | Layers | Heads | KV heads | Train time (RTX 3090) |
|---|---|---|---|---|---|---|
| MiniAgent-26M | 25.8M | 512 | 8 | 8 | 2 | ~2h |
| MiniAgent-108M | 108M | 768 | 16 | 12 | 4 | ~8h |
| MiniAgent-MoE-145M | 145M | 512 | 8 | 8 | 2 | ~6h |
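Expressed as a config, the two dense variants map to something like the following (the `ModelConfig` class is hypothetical; only the numbers come from the table above):

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    dim: int         # hidden size
    n_layers: int    # number of decoder blocks
    n_heads: int     # query heads
    n_kv_heads: int  # shared KV heads for GQA

# Values from the table above; the class itself is illustrative.
MINIAGENT_26M  = ModelConfig(dim=512, n_layers=8,  n_heads=8,  n_kv_heads=2)
MINIAGENT_108M = ModelConfig(dim=768, n_layers=16, n_heads=12, n_kv_heads=4)
```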
Key design choices:
- RMSNorm over LayerNorm (faster)
- SwiGLU over ReLU (better gradient flow; see the sketch after this list)
- RoPE over absolute positions (better length generalization)
- GQA for KV-cache memory efficiency
- YaRN for long-context extrapolation
- Weight tying (embedding and LM head share one weight matrix)
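As a concrete example of one of these choices, a minimal SwiGLU feed-forward in PyTorch (the `w1`/`w2`/`w3` projection names follow the common Llama convention and are an assumption about this repo's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """SwiGLU MLP: down-project the product of a SiLU-gated branch and a linear branch."""
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # up projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))
```

Weight tying, by contrast, is a one-liner (`lm_head.weight = tok_embeddings.weight`), which saves a full vocab × dim matrix at these small model sizes.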
Training pipeline:
- Stage 1: Pretrain → advertising-domain language modeling (500MB corpus)
- Stage 2: SFT → follow PPC instructions (50MB)
- Stage 3: LoRA → fine-tune on YOUR account data (see the sketch after this list)
- Stage 4: DPO → align on good vs. bad ad advice
- Stage 5: GRPO → group relative policy optimization
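Stage 3 adapts the frozen pretrained weights with low-rank adapters. A minimal sketch of the idea (this wrapper is illustrative, not the repo's implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update:
    y = base(x) + (alpha / r) * B(A(x)). Only A and B are trained."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)      # adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```

Because `lora_b` starts at zero, the wrapped layer initially computes exactly what the frozen base layer did; training only moves the small A/B matrices, which is what makes per-account fine-tuning cheap.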
Agents reach the ad platforms through an MCP server stack:

    Claude Code / Cursor / Agent
                |
         [FastMCP Server]     ← stdio or HTTP
                |
          [Auth Module]       ← token → OAuth → ADC
                |
      [Google Ads API v23]    ← 29 tools
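A minimal sketch of how one tool on such a server could look with FastMCP (the tool name, signature, and stubbed body are assumptions; only the server/transport shape comes from the diagram):

```python
from fastmcp import FastMCP

mcp = FastMCP("google-ads")  # server name is illustrative

@mcp.tool()
def list_campaigns(customer_id: str) -> list[dict]:
    """List campaigns for one Google Ads account (hypothetical tool; body stubbed)."""
    # A real tool would resolve credentials first (token → OAuth → ADC, per the
    # diagram above) and then call the Google Ads API v23 client.
    raise NotImplementedError

if __name__ == "__main__":
    mcp.run()  # stdio by default; HTTP transport is also supported
```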
Each of the 14 supported platforms runs as its own MCP server: Google, Meta, Microsoft, Amazon, Reddit, TradeDesk, LinkedIn, TikTok, Snapchat, Pinterest, Criteo, AdRoll, Quora, X/Twitter
Works with: Claude Code, Claude Desktop, Cursor, Windsurf, Codex, Gemini CLI, OpenAI Agents SDK, LangChain, Ollama, vLLM, llama.cpp, FastGPT, Open-WebUI, Dify