@omarsar0
"From Bytes to Ideas" avoids predefined vocabs and memory-heavy embedding tables. Instead, it uses Autoregressive U-Nets to embed information directly from raw bytes. This is huge! It enables infinite vocab size and more. Details in my notes below: https://t.co/AGonec9SzY
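
A quick sketch of why raw bytes sidestep a predefined vocab: any string, in any script, encodes to UTF-8 bytes, so the input alphabet is always just 256 values and nothing is ever out-of-vocabulary. This only illustrates the byte-level input side, not the Autoregressive U-Net itself; the helper names `to_byte_ids` and `from_byte_ids` are hypothetical and made up for this example.

```python
def to_byte_ids(text: str) -> list[int]:
    """Encode text as UTF-8 and return one integer ID (0-255) per byte."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert to_byte_ids: rebuild the string from its byte IDs."""
    return bytes(ids).decode("utf-8")


if __name__ == "__main__":
    # Mixed scripts and an emoji: no tokenizer or vocab file is needed.
    for sample in ["hello", "naïve", "日本語", "🤖"]:
        ids = to_byte_ids(sample)
        # Every ID fits in the fixed 256-symbol byte alphabet.
        assert all(0 <= i < 256 for i in ids)
        # The mapping is lossless.
        assert from_byte_ids(ids) == sample
        print(sample, len(ids), ids)
```

The trade-off is longer sequences (non-ASCII characters take several bytes each), which is the kind of cost a hierarchical model that pools bytes into coarser units is meant to absorb.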