Entangled Alignment: fusing safety with capability at pretraining through chronological metacognitive reading — a multi-agent system that annotates text with synthetic reader traces to build understanding graphs
multi-agent alignment language-models ai-safety metacognition ai-research pretraining understanding-graph reader-traces chronological-reading
Updated Apr 20, 2026 - TeX