@UnslothAI
You can now train Mistral Ministral 3 with reinforcement learning in our free notebook! You'll use GRPO to train the model to solve sudoku autonomously. Learn about our new reward functions, RL environment & reward hacking. Blog: https://t.co/SLIamT6Dx7 Notebook: https://t.co/oj0lZ0fIhx