[Paper] How does information access affect LLM monitors' ability to detect
sabotage?
Published on February 11, 2026 9:25 PM GMT
TL;DR
We evaluate LLM monitors in three AI control environments: SHADE-Arena,
MLE-Sabotage, and BigCodeBench-Sa...
