@HamelHusain
@matijagrcic @jxnlco I don’t doubt that the internal benchmarks are done rigorously and with intellectual honesty. The sad part is that many coding agents report their own internal benchmarks, on which they are SOTA. So it’s hard for an outside observer to put any stock in these, despite best intentions.