@omarsar0
QLoRA - an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning performance. The best model, called Guanaco, outperforms previous openly released models on the Vicuna… https://t.co/OcrO7ZikYL https://t.co/9o9OZoIPHj
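
A rough back-of-the-envelope sketch of why the 48GB figure is plausible. It assumes the base weights are stored at 4 bits per parameter (the tweet does not state the bit width, so treat that as an assumption) and counts weight storage only, ignoring adapters, activations, and optimizer state. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params = 65e9  # 65B parameter model, as in the post

fp16_gb = weight_memory_gb(params, 16)  # 16-bit weights
int4_gb = weight_memory_gb(params, 4)   # assumed 4-bit weights

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f" 4-bit weights: {int4_gb:.1f} GB")
print(f"fits in 48 GB at 4-bit (weights only): {int4_gb < 48}")
```

Under these assumptions, 16-bit weights alone exceed a 48GB card, while 4-bit weights leave headroom for the rest of the finetuning state.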