@xenovacom
WebGPU is INSANE! 🤯 Here's a 24B-parameter model running locally in a web browser, at a blazing ~50 tokens/second on my M4 Max. ⚡️ It's the largest model we've ever run with Transformers.js... and we're not stopping here. Big announcement soon. https://t.co/4emPjY89ba