@Modular
Organizations running large-scale inference now have a clear path away from single-vendor lock-in, without giving up performance. MAX on AMD MI355 delivered:
- 2× throughput
- 40-60% cost reduction
- More tokens/$ across real workloads

Published benchmarks with @tensorwave: https://t.co/nWBeQJ3wUj