@_philschmid
Transformers Just Got Faster!⚡️ 🚀 Thrilled to announce native Flash Attention (FA) 2 support in Hugging Face Transformers to speed up training and inference for transformer models like LLaMA and Falcon by up to 2x. 🦙🦅 👉 https://t.co/MXy0iIykD7 🧶 https://t.co/PCFS8A57ig
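For context, a minimal sketch of how Flash Attention 2 can be enabled when loading a model in 🤗 Transformers. The exact flag depends on the library version (older releases used `use_flash_attention_2=True`, newer ones `attn_implementation="flash_attention_2"`); the checkpoint name is just an illustrative example, and the `flash-attn` package plus a supported GPU are assumed to be available.

```python
# Sketch: load a causal LM with Flash Attention 2 kernels enabled.
# Assumes `pip install flash-attn` and a compatible GPU (e.g. Ampere or newer).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # example FA2-supported checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # FA2 requires fp16/bf16 weights
    attn_implementation="flash_attention_2",  # switch attention to the FA2 kernels
    device_map="auto",
)

inputs = tokenizer("Flash Attention 2 speeds up", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The speedup comes from the attention kernels themselves, so no other code changes are needed; training with `Trainer` or generation with `generate` picks up the faster kernels automatically once the model is loaded this way.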