@adrgrondin
The M5 Max beats the M3 Ultra for on-device AI with MLX in almost all tests. I was not expecting that. Prompt processing sees a huge boost thanks to the Neural Accelerators, but decode is faster too (at least with MoE models, it seems). OpenAI’s gpt-oss-120b runs at 100+ tok/s on the M5 Max. Great report @mweinbach!