Discussions about AI agency often focus on theoretical capabilities or future risks, but which agents are actually deployed today? This talk presents findings from the AI Agent Index project, which provides an empirical foundation for understanding the current AI agent landscape and its implications for both architectural development and regulatory frameworks.
Through systematic analysis of roughly 20 high-impact deployed systems, from Microsoft 365 Copilot to Cursor to ServiceNow AI Agents, we examine what levels of autonomy these systems actually possess, what control mechanisms users retain, and how safety is implemented in practice.
Our research reveals significant variation in autonomy levels across deployed agents: some merely automate simple tasks under constant human oversight, while others engage in multi-step planning with varying degrees of independence. We map the spectrum of user control mechanisms, from required approval for every action to emergency stop capabilities, and document the diverse approaches developers take to safety implementation. Surprisingly, public information about risk management practices and safety policies is often limited.
This empirical grounding is essential for productive discussions about AI agency. Rather than debating hypothetical scenarios, we can examine concrete evidence of how autonomous these systems actually are, and identify where the real intervention points lie in the current ecosystem.