@gerardsans
@ChrisLaubAI The explanation is simple: benchmarks. Models are trained to pass the same eval suites to compete, so they converge on the same answers. Add massive overlap in training data, which is now mostly static due to scarcity, and the outputs start looking identical. https://t.co/zZcFslHhiM