@arankomatsuzaki
daVinci-LLM: Towards the Science of Pretraining

- Matches larger model perf with half the size
- Huge reasoning gains: +23 pts on MATH, strong code + science scores
- Quality > scale: smarter data (not more data) drives major performance boosts

https://t.co/6IMp7ABa3M