Who Is Accountable When Multi-Agent AI Systems Fail?

The Hard Truth
A bot calls another bot. That bot calls a tool. The tool moves money. Something goes wrong. Whose decision was it? Now ask the same question with twelve agents, three vendors, and an audit log nobody reads.
Enterprise teams are wiring agents to execute decisions that, until very recently, only humans were trusted to make — refunding customers, escalating clinical tickets, modifying production data, scheduling outages. The wiring is impressive. The accountability is not. Somewhere between the elegance of the orchestration diagram and the moment a real person is harmed, responsibility quietly evaporates.
The question lawyers keep avoiding
In every other domain we have built, from medicine to aviation to finance, we eventually settled on a recognisable answer to the question of who is responsible when something goes wrong. The pilot. The doctor. The trader. The institution behind each of them. With multi-agent systems the answer is becoming difficult to locate. A planning agent delegates to a research agent, which calls a tool wrapper that retrieves data through orchestration layers nobody actively maintains in production. A decision is made. A customer is harmed. And the institutions deploying these systems are, in practice, learning to point downstream.
This is not a new problem; it is an old one returning at higher resolution. Two decades ago Andreas Matthias named it the responsibility gap (Matthias 2004): the moment a system can learn and act in ways its designers cannot fully predict, our usual chains of accountability begin to fray. What we have today is that gap, scaled — and scaled into the parts of the economy where the people on the receiving end have the least power to push back.
What we tell ourselves about delegation
Honest objection first. The people building these systems are not careless. Frameworks like LangGraph and CrewAI, along with the agent memory and tracing layers that have grown up around them, exist explicitly because of the traceability problem. Engineers wire supervisor-agent hierarchies, write tool whitelists, set step limits, and log every call. The conventional wisdom, that diligent design plus a human-in-the-loop is enough, is not foolish. Most teams shipping agentic infrastructure are more reflective about failure modes than the average software organisation was a decade ago. The argument deserves to be heard at its strongest: agents are tools, tools are governed by their operators, operators carry liability. Same as it ever was.
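To be fair to that argument, here is roughly what the diligence looks like in practice. The sketch below is illustrative only; the class names and structure are my own assumptions, not the API of LangGraph, CrewAI, or any vendor SDK. It shows the three controls described above: a tool whitelist, a step limit, and an audit entry for every call.

```python
# Illustrative only: a hand-rolled supervisor loop, not any framework's real API.
import json
import time
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AuditLog:
    """Append-only record of every tool call the supervisor dispatches."""
    entries: list = field(default_factory=list)

    def record(self, agent: str, tool: str, args: dict, result: str) -> None:
        self.entries.append({
            "ts": time.time(), "agent": agent, "tool": tool,
            "args": args, "result": result,
        })

class Supervisor:
    """Dispatches work to sub-agents while enforcing a whitelist and a step cap."""

    def __init__(self, tools: dict[str, Callable[..., str]],
                 whitelist: set[str], max_steps: int, log: AuditLog):
        self.tools = tools
        self.whitelist = whitelist      # only these tools may ever run
        self.max_steps = max_steps      # hard cap on how much work one plan can do
        self.log = log

    def call_tool(self, agent: str, tool: str, **args) -> str:
        if tool not in self.whitelist:
            raise PermissionError(f"{agent} tried non-whitelisted tool {tool!r}")
        result = self.tools[tool](**args)
        self.log.record(agent, tool, args, result)   # every call is logged
        return result

    def run(self, plan: list) -> None:
        if len(plan) > self.max_steps:
            raise RuntimeError(f"plan exceeds step limit of {self.max_steps}")
        for agent, tool, args in plan:
            self.call_tool(agent, tool, **args)

# Hypothetical refund workflow with two permitted tools.
tools = {
    "lookup_order": lambda order_id: f"order {order_id}: eligible",
    "issue_refund": lambda order_id, amount: f"refunded {amount} on {order_id}",
}
log = AuditLog()
sup = Supervisor(tools, whitelist={"lookup_order", "issue_refund"},
                 max_steps=5, log=log)
sup.run([
    ("research_agent", "lookup_order", {"order_id": "A-1001"}),
    ("refund_agent", "issue_refund", {"order_id": "A-1001", "amount": 42.0}),
])
print(json.dumps(log.entries, indent=2))
```

Everything in that loop is visible and bounded, which is exactly the ground the objection stands on: if you can see and cap every call, surely you can answer for the outcome.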
If that account were complete, this essay would not need to exist.
The assumption hidden in “supervised autonomy”
The assumption is that responsibility scales linearly with control — that if you can see the audit log, you can be answerable for the outcome. The empirical evidence is uncomfortable. The MAST study by Cemri and colleagues analysed seven open-source multi-agent frameworks and documented failure rates between 41% and 86.7% across them, organised into fourteen distinct failure modes spanning system design, inter-agent misalignment, and task verification (Cemri et al., arXiv).
These were not ordinary bugs. They were emergent: failures that arose because agents interacted, not because any single agent malfunctioned. No one in the chain wrote the failure, and yet the failure happened anyway, run after run. If the harm lives in the interaction, supervision of individual components cannot capture it. You can audit each agent in isolation and still miss the system that breaks.
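A toy illustration of that point, with names and behaviour invented for this essay rather than taken from the MAST paper: a worker and a verifier that each satisfy their own specification and pass their own tests, yet fail every composed run because the two specifications never agreed on an output format.

```python
# Toy example: two components that each pass their own unit tests,
# yet fail whenever they are composed.
def worker(task: str) -> str:
    """Returns a correct answer in the format its own spec requires: 'ANSWER: <value>'."""
    return f"ANSWER: {len(task)}"

def verifier(reply: str) -> bool:
    """Accepts only the format its own spec requires: a bare integer string."""
    return reply.strip().isdigit()

def run_pipeline(task: str, max_retries: int = 3):
    """Supervisor loop: retry the worker until the verifier accepts."""
    for _ in range(max_retries):
        reply = worker(task)
        if verifier(reply):          # never true: the two specs disagree on format
            return reply
    return None                      # every run exhausts its retries

assert worker("refund order A-1001").startswith("ANSWER:")   # worker test passes
assert verifier("42")                                          # verifier test passes
print(run_pipeline("refund order A-1001"))                     # prints None, every time
```

Audit either function alone and it looks correct; the defect only exists in the pair.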
Bureaucracy already taught us this
Diffused responsibility is not a software invention. Hannah Arendt described, half a century ago, how bureaucracies dissolve moral agency by distributing it: every clerk follows protocol, every protocol references another protocol, and at the end of the chain, harm is everyone’s fault and no one’s responsibility. We built whole institutions to address this — appeal procedures, ombudsmen, judicial review — because we understood that a society that cannot locate accountability cannot deliver justice.
Multi-agent systems are bureaucracies rendered in software. They scale faster, document themselves better, and offer no recourse. The 2010 Flash Crash — when interaction effects between automated trading systems erased close to a trillion dollars of US market capitalisation in minutes (TechPolicy.Press) — predates LLM agents, but it is the closest historical analogue we have for what cascading autonomous decisions can do. Distributed authorship, no human in the moment of harm, a year-long investigation to assign blame. The question that took a year to answer in 2010 will take longer the next time, because the agents will be reasoning, not just trading.
Accountability is not a property of code
Thesis: Accountability is not a property of code; it is a property of the institutions that deploy it.
The work of Santoni de Sio and Mecacci identifies four distinct responsibility gaps that emerge with autonomous systems: culpability, moral accountability, public accountability, and active responsibility (Springer Philosophy & Technology). Each one fails differently, and each one fails for institutional rather than purely technical reasons. As of mid-2026, the EU AI Act’s high-risk obligations are scheduled to apply from 2 August 2026, with penalties reaching €35 million or 7% of global annual turnover (EU AI Act Service Desk). But the European Commission’s position on AI agents specifically has been described as preliminary, meaning agent-specific provisions do not yet exist in the Act (TechPolicy.Press). No jurisdiction has, to date, resolved the liability question between developer, operator, and user for an autonomous multi-agent failure. We are deploying at scale into a legal vacuum, and the vacuum is not an oversight. It is the consequence of regulating things rather than relationships.
What we owe the people on the receiving end
If accountability is institutional, the question shifts. Not “what does the model do?” but “what does the institution owe to the person on the other end of the agent?” The vendor stack is now crowded: OpenAI Agents SDK, Microsoft Agent Framework, Google ADK, and Claude Agent SDK each offer their own orchestration primitives, each shipping faster than their governance documentation. None of these SDKs answer, on their own, the questions a customer might reasonably ask. Was a human involved in the decision that affected me? If not, why not? How do I appeal, and to whom? Singapore’s IMDA Model AI Governance Framework for Agentic AI is one of the few official documents to articulate these questions formally — a sign that some governments are at least asking the right things, even if the institutional capacity to act on them is missing almost everywhere else. Reflection here is not abstract. It is the gap between what users assume — that someone, somewhere, is answerable — and what is currently true.
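What answering those questions could look like in data is not mysterious. The sketch below is a hypothetical record an institution might attach to every agent-made decision; the field names are my own assumptions, not a schema from the IMDA framework or any of the SDKs above.

```python
# Illustrative only: a minimal "answerability record" for one agent-made decision.
# Field names are assumptions, not a published standard.
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
import json

@dataclass
class AnswerabilityRecord:
    decision_id: str            # stable reference the affected person can quote
    subject: str                # who the decision affected
    outcome: str                # what the system decided
    human_involved: bool        # was a person in the loop for this decision?
    reason_if_not: str          # if not, the documented justification
    accountable_role: str       # a named role, not a team alias, answerable for it
    appeal_channel: str         # where the affected person can contest the decision
    appeal_deadline_days: int   # how long they have to do so
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = AnswerabilityRecord(
    decision_id="ref-2026-00042",
    subject="customer 88213",
    outcome="refund denied",
    human_involved=False,
    reason_if_not="below review threshold per internal policy",
    accountable_role="Head of Customer Operations",
    appeal_channel="appeals@example.com",
    appeal_deadline_days=30,
)
print(json.dumps(asdict(record), indent=2))
```

None of this is technically hard. The hard part is the institutional commitment behind the fields: a named accountable role, a real appeal channel, a deadline someone is obliged to honour.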
Where this argument might be weakest
This argument is most vulnerable in two places. The first: it is possible that observability tooling, combined with rigorous orchestration traceability, eventually gives institutions a tractable path to accountability — not by simplifying the systems but by making them legible enough to be governed. The second: the legal vacuum may close faster than I expect. The NIST AI Agent Standards Initiative, launched in February 2026, and the agentic profile work coming out of the Cloud Security Alliance suggest the institutional infrastructure is being assembled, even if it lags the deployment curve. If both lines of work mature in time, the responsibility gap shrinks from a chasm to a crack. I do not think they will mature in time. But I could be wrong, and the cost of being wrong in the optimistic direction is, on balance, bearable.
The Question That Remains
If we cannot say in advance who is answerable for the actions of the systems we are about to deploy, we are not deploying tools — we are deploying decisions, and pretending otherwise. The question that remains is not whether agents will fail. It is whether we will build the institutions capable of being answerable when they do, before the failures arrive at scale.
Disclaimer
This article is for educational purposes only and does not constitute professional advice. Consult qualified professionals for decisions in your specific situation.