@paulgauthier
Claude 4 Opus scored 72% on the aider polyglot coding benchmark. Claude 4 Sonnet scored 61%. Both scores are with 32k thinking tokens. Sonnet 4 seems to have underperformed Sonnet 3.7. Full leaderboard: https://t.co/mBVaUPG9ZN https://t.co/tj4p5Pn6Tk