@johnowhitaker
Gemini 3 Pro is creeping closer to the human baseline on my SpecID eval. Since the eval involves identifying things in pictures that I TOOK, I fear the only reason I'm still SOTA is training on test, as it were. I'll test myself on fresh qs and we'll see if I'm bested. https://t.co/NZBRCLYru6