@omarsar0
// Multi-User LLM Agents //

Every agent framework assumes one user giving instructions. But deploy an agent into a team workflow, and suddenly it has multiple bosses with conflicting goals, private information, and different authority levels.

This work formalizes multi-user interaction as a multi-principal decision problem and introduces Muses-Bench with three scenarios: instruction following under authority conflicts, cross-user access control, and multi-user meeting coordination.

Even the best model, Gemini-3-Pro, averages only 85.6% across tasks. On meeting coordination, no model exceeds a 64.8% success rate. Privacy-utility tradeoffs are especially brutal: models that score near-perfect on privacy (Grok-3-Mini at 99.6%) tank on utility (60.1%).

Why does it matter? As agents move into organizational tools, Slack bots, and shared workspaces, multi-principal conflicts become the default, not the exception. Current models aren't ready: they leak more private information over multi-turn interactions and can't maintain stable prioritization under conflicting objectives.

Paper: https://t.co/ttoFSlIYxC

Learn to build effective AI agents in our academy: https://t.co/1e8RZKs4uX
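To make the cross-user access-control scenario concrete, here is a minimal sketch (all names and structures hypothetical, not taken from the paper) of the invariant an agent must maintain: only surface a fact to a requester if the fact's owner shared it with them.

```python
# Hypothetical sketch of cross-user access control: each fact is
# owned by one user, and the agent must answer a requester without
# leaking facts the owner did not share with that requester.

from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    owner: str               # user who told the agent this fact
    text: str
    shared_with: frozenset   # other users allowed to see it

def answer(facts, requester):
    """Return only the facts the requester is authorized to see."""
    return [f.text for f in facts
            if f.owner == requester or requester in f.shared_with]

facts = [
    Fact("alice", "Q3 budget is frozen", frozenset({"bob"})),
    Fact("bob", "I'm interviewing elsewhere", frozenset()),
]

print(answer(facts, "bob"))    # bob sees alice's shared fact plus his own
print(answer(facts, "carol"))  # carol sees nothing -> []
```

The hard part for an LLM agent is that this filter isn't a database query: the "facts" live in the conversation history, and the paper's results suggest models leak more of them as multi-turn interactions accumulate.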