@winglian
Qwen 3 by @Alibaba_Qwen is out and it looks like the 30B MoE is better than the 32B dense model! Some quick checks show you can SFT the 32B on a single 48GB GPU, and it's possible to get it on a 4090 too once we fix some allocation issues on model load. https://t.co/9s6sqL3QBD