@emollick
This is true of any major benchmark (ARC-AGI, Humanity’s Last Exam, time for task completion). If you compare to last month, things seem relatively stable. If you compare to last year, things seem to be moving extremely fast. https://t.co/uonUxgEUMJ