@randal_olson
We just shipped the Truesight MCP and open source agent skills. This means you can create, manage, and run AI evaluations anywhere you use an AI assistant. Coding editor, chat window, CLI. If it supports MCP, Truesight works there. Nobody ships software without tests anymore. Once AI made them nearly free to write, there was no excuse. You lock in what you expect, they run every time you push code, and you know if something broke before you deploy. AI evaluations are the same idea for AI features, but most teams still treat them as something separate. Evaluation lives in a different tool, a different part of the day. So people skip it. And bad AI ships to production. Truesight's MCP collapses that loop. You set your quality bar in natural language and Truesight turns it into evals your AI assistant runs while you build. Updated your AI agent's system prompt? "Run both versions through our instruction-following eval and tell me if my AI agent regressed." Done in seconds, right where you're working. Need a new eval? "Build me a custom eval that checks whether our customer support AI agent is correctly identifying user intent and escalating when it should." It walks you through the full setup and deploys a live endpoint your coding agent can use immediately. Or something simpler: "Run this marketing draft through the humanizer eval and flag anything that reads like AI wrote it." Scores the text, tells you what to fix. The skills are what matter most here. Many MCPs ship tools and leave it to the user to figure out the workflow. Fine for simple integrations. But evaluation has real sequencing complexity. Build eval criteria before looking at your data? You'll measure the wrong things. Deploy to production before testing on a sample? You'll drown in false flags. We built agent skills that walk your coding assistant through the right workflow for each task, whether that's scoring traces, running error analysis, or building a custom eval from scratch. An orchestrator skill routes to the right one based on what you ask. You don't need to memorize anything. Skills install via the Claude Plugin Marketplace or a one-liner curl script. MIT licensed. Setup is about 2 minutes: 1. Create a platform API key in Truesight Settings 2. Paste the MCP config into your client 3. Install the skills 4. Start evaluating If you're already a Truesight user, this is live now. Connect your client and your existing evaluations work through the MCP immediately. If you're building AI systems and want to try this, sign up at https://t.co/Q1c8bVkSOi