@tejalpatwardhan
we're open-sourcing a new frontier science eval in biology, chemistry, and physics. there are 2 tracks: olympiad level and advanced research level. as models saturate GPQA, this is a nice unsaturated alternative with clean test-time compute scaling. kudos to @MilesKWang for driving this release!