@NeelNanda5
When Anthropic released a complex 30K-word doc and said Claude was trained to follow it, I was pretty sceptical. Turns out it kinda works! We red-teamed Claude's constitution-following, and it's gotten much better! Positive update for the ability to align models in nuanced ways.