@PyTorch
Building on the previous correctness-focused pipeline, KernelAgent can now integrate GPU hardware-performance signals into a closed-loop multi-agent workflow to guide the optimization of Triton kernels. Learn more: https://t.co/r2WqASIhWG @KaimingCheng @marksaroufim https://t.co/OrtOp9boum