@_lewtun
Several @huggingface users have reported loss divergences when fine-tuning Mistral 7B w/out LoRA. Here's a simple script that works well with TRL's SFTTrainer & DeepSpeed ZeRO-3: https://t.co/MbjtkRQU1W (Trained on a subset of UltraChat) https://t.co/Hf3Zd05DbV