@nikhilchandak29
Ever wondered how well you can score on popular MCQ benchmarks without even looking at the questions? 🤯 Turns out, you can often get significant accuracy just from the choices alone. This is true even on recent benchmarks with 10 choices (like MMLU-Pro) and their vision counterparts like MMMU-Pro (yes, even without images!). Such choice-only shortcuts are hard to fix. We find that prior attempts at fixing them, GoldenSwag (for HellaSwag) and TruthfulQA v2, still suffer from similar problems.
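
To make "choice-only" concrete, here is a minimal sketch of how such an evaluation could be scored: the predictor never sees the question, only the answer options. Everything in it is an assumption for illustration. The toy items, the function names, and the "pick the longest option" heuristic are hypothetical stand-ins, not the setup or model behind the results above.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy items: (question, choices, index of the correct choice).
# The question text is carried along only to show that it is never used.
Item = Tuple[str, Sequence[str], int]

ITEMS: List[Item] = [
    ("Q1 (hidden)", ["red", "a large blue whale", "cat", "no"], 1),
    ("Q2 (hidden)", ["yes", "no", "it depends on the context", "7"], 2),
    ("Q3 (hidden)", ["the mitochondria", "a", "b", "c"], 0),
    ("Q4 (hidden)", ["an unusually long distractor", "42", "x", "y"], 1),
]


def longest_choice(choices: Sequence[str]) -> int:
    """Stand-in shortcut: guess the longest option (assumed heuristic)."""
    return max(range(len(choices)), key=lambda i: len(choices[i]))


def choice_only_accuracy(
    items: Sequence[Item], predict: Callable[[Sequence[str]], int]
) -> float:
    """Score a predictor that receives only the choices, never the question."""
    correct = sum(predict(choices) == answer for _question, choices, answer in items)
    return correct / len(items)


def chance_accuracy(items: Sequence[Item]) -> float:
    """Expected accuracy of uniform random guessing, for comparison."""
    return sum(1 / len(choices) for _question, choices, _answer in items) / len(items)


if __name__ == "__main__":
    print("choice-only:", choice_only_accuracy(ITEMS, longest_choice))
    print("chance:     ", chance_accuracy(ITEMS))
```

A gap between the choice-only score and the chance score is the signal of interest: it suggests the options leak information about the answer. In practice the predictor would be a trained model rather than a length heuristic.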