@rasbt
@pomogranet1 @JoshuahTouyz I would probably start with

1. my Build A Large Language Model (From Scratch) book to understand the basic architecture and basic pipeline. Then maybe
2. Build A Reasoning Model (From Scratch) for inference scaling and reinforcement learning
3. Maybe one of the "production" PyTorch code bases to adapt. E.g.,
- Allen AI’s OLMo 3 (32B): https://t.co/stZqgmwO9N
- Hugging Face’s SmolLM3 (3B): https://t.co/KkocdKVzJ4
- Intellect-3 (106B MoE): https://t.co/fZ2GbAvb5d
- Nemotron 3: https://t.co/pznZ0la8f7