@rasbt
Embedding layers are often perceived as a fancy operation that we apply to encode the inputs (each word tokens) for large language models. But embedding layers = fully-connected layers on one-hot encoded inputs. They just replace expensive matrix multiplications w index look-ups. https://t.co/0I3AFk4por