@Modular
We partnered with @ProximalHQ to run five frontier coding agents on a hard task: rebuild the full Wan 2.1 text-to-video pipeline on MAX (no PyTorch, no diffusers) in 20 hours as part of their new Frontier-SWE benchmark. Two nearly pulled it off. Every model understood the architecture. The agents that produced results kept debugging numerical errors layer by layer. Several others just tried to smuggle in a torch import. Full report: https://t.co/Se2MJ4ySok