@AISecurityInst
We’re sharing a case study on alignment evaluations with @AnthropicAI on Claude Opus 4.5, Opus 4.1 and Sonnet 4.5. We ask: would an AI assistant used inside a frontier lab quietly sabotage AI safety research? Overall results are encouraging, but with important caveats. 🧵 https://t.co/tcpFrolCn6