@tri_dao
Now that inference throughput is what’s driving agents’ progress (hopefully you caught Jensen’s keynote 😀), we’ll continue to make Mamba stronger and faster. Some fun stuff in the pipeline: new algorithms and kernels to make Mamba’s forward pass 3-4x faster and backward pass 2x faster. Hopefully it will be out in 1-2 months, as soon as I can convince Claude to finish all the implementations. My new strat is just whispering to Claude “make it faster…” over and over 9/10