@_akhaliq
RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment

paper page: https://t.co/QmRqf2evGN

The paper proposes Reinforcement Learning from Contrast Distillation (RLCD), a method for aligning language models to follow natural language principles without using… https://t.co/eheiXoySaK https://t.co/wO5ga0qAli