@htihle
GPT 5.3 Codex (xhigh) scores 79.3% and takes the lead on WeirdML, just ahead of Opus 4.6 (77.9%) at less than half the price. It is very solid across the board, but I still feel the peak performance of Gemini 3.1 is stronger. https://t.co/WRYosAStGY