@iScienceLuvr
AetherCode: Evaluating LLMs' Ability to Win In Premier Programming Competitions

"we present AetherCode, a new benchmark that draws problems from premier programming competitions such as IOI and ICPC, offering broader coverage and higher difficulty. AetherCode further incorporates comprehensive, expert-validated test suites built through a hybrid of automated generation and human curation, ensuring rigorous and reliable assessment."

o4-mini-high and Gemini-2.5-Pro "are the only two models capable of successfully solving problems at the 'Extremely Difficult' level."