@AnthropicAI
One concern is that filtering CBRN data will reduce performance on other, harmless capabilities—especially science. But we found a setup where the classifier reduced CBRN accuracy by 33% beyond a random baseline with no particular effect on a range of other benign tasks. https://t.co/24xCQBjejh