@TimDarcet
Alright, actual serious post. Lingua := super simple codebase + torch.compile for speed --> clean, hackable, but still efficient. *It can train a 7B >Llama 2 in 24h*. Crazy. If you've got the GPUs, not only can you train a good 7B, you can *iterate* on it. You can do *research* https://t.co/QKC66QqHaN