@ivanleomk
Got two GPUs and two SFT runs going at the same time with @PrimeIntellect. The idea is to fix the number of training steps while varying the number of examples, then test against a held-out test set to see how input diversity helps generalisation for a simple environment. verifiers, here I come~ https://t.co/QxyLpJ6Rst
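
Rough sketch of the setup in plain Python, not the actual training config. Everything named here (`make_runs`, `passes_over_data`, the batch size, the step count, the subset sizes) is a made-up placeholder for illustration. The point it shows: with steps fixed, a smaller subset just gets repeated more often, so the runs see the same number of batches but differ in how many distinct inputs they contain.

```python
import random


def make_runs(pool, sizes, test_fraction=0.2, seed=0):
    """Hold out a test set once, then carve nested training subsets from the rest.

    All names and defaults here are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    shuffled = list(pool)
    rng.shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    test, train_pool = shuffled[:n_test], shuffled[n_test:]

    subsets = {}
    for n in sizes:
        if n > len(train_pool):
            raise ValueError(f"requested {n} examples, only {len(train_pool)} available")
        # Nested subsets: the smaller run's data is contained in the larger run's data.
        subsets[n] = train_pool[:n]
    return test, subsets


def passes_over_data(total_steps, batch_size, n_examples):
    """With steps fixed, fewer examples means each one is repeated more often."""
    return total_steps * batch_size / n_examples


if __name__ == "__main__":
    pool = list(range(100))            # stand-in for the real examples
    test, subsets = make_runs(pool, sizes=[10, 40])
    for n, subset in subsets.items():
        # Assumed values: 100 optimizer steps, batch size 4.
        print(n, "examples ->", passes_over_data(100, 4, len(subset)), "passes")
```

Both runs are then scored on the same `test` split, which never overlaps with either training subset, so any gap in held-out performance comes from the number of distinct examples rather than the amount of training.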