@GithubProjects
A toolkit for building agents that watch, listen, and understand video. Low latency by design. Open source. Production-ready. Vision Agents lets you build real-time video AI that works with your own models and your own edge layer. Supports YOLO, Moondream, Cartesia, Deepgram, ElevenLabs, HeyGen, Gemini, OpenAI, and more. Quick model switching. Easy-to-use API. A good fit for coaching tools, collaboration apps, avatars, and robotics.