@Ali_TongyiLab
Thrilled to open-source WebWatcher: our vision-language deep research agent from @Alibaba_NLP! Available in 7B & 32B parameter scales for the community. Achieving SOTA on the toughest VQA benchmarks:
• HLE-VL: 13.6% (vs GPT-4o's 9.8%)
• BrowseComp-VL: 27.0% (2x GPT-4o!)
• LiveVQA: 58.7%