@steverab
📣 I'll be in Seoul next week to present one main conference paper and four workshop papers at ICML! I'll also be on a panel at the https://t.co/D3wwI18H7o alignment workshop! Reach out if you are around and want to chat about uncertainty, reliability, or AI evals!😊 Details⬇️

📄Paper 1: Towards a Science of AI Agent Reliability
📍Main conference: Thursday (July 9) • 14:30–16:15 in Hall A • Poster #3408
📍Workshop on Failure Modes in Agentic AI (FAGEN): Friday (July 10) • 10:10–11:00 and 14:40–15:30 in Grand Ballroom 104-105
🔗https://t.co/HAKHzASrOZ
🧵https://t.co/uQCpPIiXSJ

📄Paper 2: Log Analysis is Necessary for Credible Evaluation of AI Agents
📍Workshop on Failure Modes in Agentic AI (FAGEN): Friday (July 10) • 10:10–11:00 and 14:40–15:30 in Grand Ballroom 104-105
🔗https://t.co/2xKsB4oMaU
🧵https://t.co/StcdxiRuXi

📄Paper 3: Open-World Evaluations for Measuring Frontier AI Capabilities
📍Workshop on Agents in the Wild (AIWILD): Saturday (July 11) • 11:10–12:00 and 16:10–17:00 in Hall B2
🔗https://t.co/nq9iJtBGLs
🧵https://t.co/tTblfaNqld

📄Paper 4: Life After Benchmark Saturation: A Case Study of CORE-Bench
📍Workshop on Agents in the Wild (AIWILD): Saturday (July 11) • 11:10–12:00 and 16:10–17:00 in Hall B2
🔗https://t.co/NtEyYrSlF9
🧵https://t.co/w7Pphsd6ko

🗣️Panel on the AI capability–reliability gap
📍https://t.co/D3wwI18H7o Seoul Alignment Workshop: Monday (July 6)
🔗https://t.co/iBxqhTQmVf

Also, my advisor @random_walker is going to deliver a keynote on Thursday (July 9) at 13:30 in Hall C: https://t.co/qAO4ZjhZxX. Don't miss it!