@WentaoGuo7
🚀SonicMoE🚀: a blazingly fast MoE implementation optimized for NVIDIA Hopper GPUs. SonicMoE reduces activation memory by 45% and is 1.86x faster on H100 than the previous SOTA 😃 Paper: https://t.co/Xesd3cNcpQ Joint work with @MayankMish98, @XinleC295, @istoica05, @tri_dao https://t.co/B83toUk27G