@random_walker
We've released annotated slides for a talk titled "Evaluating LLMs is a minefield". Current ways of evaluating chatbots/LLMs don't work well, especially for questions about societal impact. There are no quick fixes. More research is needed. w/ @sayashk 🧵 https://t.co/6ZUh850wx3 https://t.co/emkfmi4ijH