@Chenfeng_X
Thanks so much to @_akhaliq for posting our work, Flash-Kmeans! In the era of generative AI, efficiency research has largely focused on accelerating modern model architectures such as LLMs. However, if we step back and revisit the broader AI system stack, many classical algorithms remain indispensable components of real-world pipelines. One prominent example is k-means clustering, a decades-old algorithm that has long supported recommender systems, computer vision pipelines, large-scale indexing, and representation learning. We revisit this classical algorithm through the lens of modern AI systems design and introduce Flash-Kmeans, which turns a historically offline component into a real-time primitive that enables online applications. This great work is led by the talented @randwalk0 and @HaochengXiUCB