Can AI systems detect when they're being evaluated? Research paper and reference implementation exploring the Hawthorne Effect for AI.
benchmarking evaluation red-team ai-safety deceptive-alignment authensor 15-research-lab hawthorne-effect
Updated Mar 15, 2026 - TypeScript