@rasbt
@demishassabis @GeminiApp Nice! Can echo that parallel sampling / self-consistency is probably one of the simplest and, in my experience, also one of the best ways to improve results on reasoning benchmarks. It even goes a long way for base models (here, experiments with a small 0.6B model). https://t.co/mj9RMLpZFg
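
For anyone unfamiliar with the idea, here is a minimal sketch of self-consistency as majority voting over several sampled answers. This is not the code behind the linked experiments: `majority_vote_answer`, `sample_answer`, and `toy_sampler` are hypothetical names, and the sampler is a stand-in for one temperature-sampled LLM call that returns an extracted final answer (or `None` if no answer could be parsed).

```python
import random
from collections import Counter
from typing import Callable, Optional


def majority_vote_answer(
    sample_answer: Callable[[str], Optional[str]],
    prompt: str,
    n_samples: int = 8,
) -> Optional[str]:
    """Sample n_samples answers for the same prompt and return the most frequent one.

    sample_answer is a hypothetical callable standing in for one sampled
    model generation followed by final-answer extraction.
    """
    # Draw several independent samples (in practice these can run in parallel).
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    # Drop samples where no final answer could be extracted.
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    # Majority vote; ties go to the answer that was seen first.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy stand-in for a model: a noisy sampler that favors the answer "42".
    rng = random.Random(0)

    def toy_sampler(prompt: str) -> Optional[str]:
        return rng.choices(["42", "41", "24"], weights=[0.5, 0.3, 0.2])[0]

    print(majority_vote_answer(toy_sampler, "What is 6 * 7?", n_samples=16))
```

The vote only helps when the correct answer is the single most likely outcome per sample, which is why answer extraction and normalization (e.g. treating "42" and "42.0" as the same) matter in practice.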