@emollick
The more practical example: "build me a graph of Humanity's Last Exam scores over time," which involved looking up and cross-referencing a lot of material and then generating something useful in one shot. (Ironically, it does not include GPT-5.2, since its scores weren't public.) https://t.co/FmeyzzdGHP