@rasbt
@JoshKale I think it's about trade-offs. As with so many things, there's no free lunch. I.e., I don't think you can have a chip that is SOTA for training throughput and also SOTA for inference efficiency at the same time. That being said, there are companies focused on inference chips.