@emollick
Given the messy naming scheme used by all the AI companies, I caused a chart to be made showing the gain in GPQA per 0.1 version in model names (estimated, since model names skip version numbers). There has never been a more misnamed model than Claude 3.7, which should have been 4.4. https://t.co/ZynramTEpG