@CerebrasSystems
Introducing CePO, a test-time reasoning framework for Llama

- Llama 3.3 70B + CePO outperforms Llama 3.1 405B and approaches GPT-4 & Sonnet 3.5
- CePO enables real-time reasoning: despite using >10x more tokens, it runs at ~100 tokens/s on Cerebras hardware
- CePO is more robust than vanilla CoT & Best-of-N. Read our full blog for evals & details