@Modular
By combining MAX software optimizations + AMD MI355's higher GPU memory + @TensorWave's liquid-cooled infrastructure, we achieved ~2× greater inference throughput on MI355 clusters and a 40-60% reduction in inference costs. Full methodology and benchmarks: https://t.co/XegRDVrvvm