@sam_paech
Spiral-Bench: I've wanted to understand the psychological effects of sycophancy, and the tendency of models to get stuck in escalatory delusion loops w/ users. I made an eval to get visibility on this. It measures how a model enables (or prevents) delusional spirals. 🧵 https://t.co/SbX9CbyJf2