@emollick
Interesting changes in Grok 4.1. Decreases in harmful responses but also increases in sycophancy and deception. It isn’t clear how to interpret the sycophancy score, but the MASK score for deception is quite high compared to other big models. Sycophancy leads to higher LMArena scores. https://t.co/A6yt1oSdRn