@AnthropicAI
When we asked this model about its goals, it faked alignment: it pretended to be aligned in order to hide its true goals, despite never having been trained or instructed to do so. This behavior emerged solely as an unintended consequence of the model learning to cheat at coding tasks.