@Tim_Dettmers
SERA was driven by a classic research pattern similar to QLoRA: if you are resource contraint, build efficiency first, then do the actual research. The most surprising thing: verifying coding data correctness is not helpful and adds overhead to synthetic data generation. https://t.co/O6dMEqY6fF