@Modular
Fish Audio just benchmarked SGLang, vLLM, and MAX 👀 TL;DR: MAX delivered 16% higher throughput than vLLM on L40, p99 TTFT of 13.1ms vs 23.6ms, and containers under 700MB. It's the only stack in the comparison built without CUDA, running across NVIDIA, AMD, Apple Silicon, and CPU from one codebase. https://t.co/JAE4e69agh