ALAN opinion 11 min read May 25, 2026 Updated July 9, 2026

Should You Trust Third-Party MCP Servers? Data Exposure, Unvetted Code, and Governance

The accountability gap when AI agents connect to unvetted third-party MCP servers and open-protocol governance

Table of Contents

The Hard Truth

You would not hand a stranger the keys to your filesystem, a copy of your credentials, and permission to speak in your assistant’s voice — not all at once, and certainly not without watching. Yet connecting a third-party server to your AI agent quietly arranges exactly that. The unsettling part is not the risk. It is how ordinary the gesture has become.

Model Context Protocol answered a problem that had quietly frustrated nearly every team building with language models: each tool, each data source, each integration demanded its own bespoke wiring. A shared protocol replaced that chaos with a single way for an AI to reach the outside world. Adoption followed the way water finds a slope. But somewhere in that rush toward convenience, a transfer of trust took place that almost nobody paused to ratify.

The Permission We Grant Without Reading

Install is a soothing word. It suggests something finished, inspected, safe — a product on a shelf that someone, somewhere, has already checked. When you add a third-party MCP Server to your agent, the language of installation carries all that reassurance with it. And almost none of it is earned.

What you are actually doing is closer to hiring. You are granting an external party the ability to read what your model reads, to act where your model acts, and in many configurations to run code on the machine that hosts it. The protocol’s own security guidance is unusually candid about the consequences: a compromised local server can lead to arbitrary code execution and data exfiltration, according to MCP Security Best Practices. That is not a footnote. It is the threat model stated plainly by the people who designed the thing.

So why does it feel so harmless? Because the interface looks like every other install button we have ever clicked. The danger here is not that it is exotic. It is that it is familiar.

The Case That Open Is Safer

The instinct to distrust openness is worth resisting, because the strongest argument for the third-party ecosystem is a good one. Closed systems fail quietly. Their flaws sit unexamined behind corporate walls until a breach makes them public. An open protocol, by contrast, invites thousands of eyes. And as the protocol matured, it did not grow more permissive — it grew stricter. The authorization model now treats every server as an OAuth 2.1 resource server, makes PKCE mandatory for all flows, and explicitly forbids the token passthrough patterns that once let credentials leak between services, according to MCP Security Best Practices.

This is what responsible stewardship looks like. The underlying transport speaks JSON-RPC, a boring and well-understood format with no pretensions to magic. Governance moved, as of late 2025, to the Agentic AI Foundation under the Linux Foundation, with Anthropic, Block, and OpenAI among the founding contributors, according to Anthropic’s announcement. A protocol held by a neutral foundation, hardened by public scrutiny, advanced through an open proposal process — is that not exactly the arrangement we say we want?

It is a serious case. And it answers a real question — is it safe to install third-party MCP servers? — with a reassuring “safer than the closed alternative.” The trouble is that “safer than the alternative” and “safe” are not the same sentence.

What “Verified” Quietly Leaves Out

Here is the assumption hiding inside the reassurance: that somewhere in the chain, someone checked. We imagine a registry the way we imagine an app store — a gatekeeper that, however imperfectly, screens what passes through. The official MCP Registry is explicit that it does no such thing. It verifies namespaces and ownership and stores metadata; it does not host the code and it does not perform security review, according to the official MCP Registry. The verification badge confirms who published a server. It says nothing about what the server does.

The gap between those two things is not theoretical. When researchers submitted a deliberately malicious proof-of-concept server to eleven different registries, nine accepted it with no security review at all, according to SafeDep. The discovery layer we instinctively trust to filter danger mostly forwards it instead.

And the most corrosive risks do not announce themselves. Malicious instructions can be hidden inside a tool’s description — text the user never sees but the model reads and obeys — so that the moment of connection and the moment of betrayal are separated by days, according to the OWASP GenAI Security Project. Nothing breaks. The system stays helpful. It simply begins, somewhere along the way, to serve an interest that is not yours. Who, in that arrangement, is responsible for the harm — the developer who trusted the registry, the registry that trusted the publisher, or the publisher who counted on no one looking?

An Open Protocol Is Not the Same as a Neutral One

There is an older confusion worth naming here, because we keep repeating it. We treat “open” and “neutral” as synonyms. They are not. A standard can be open to all and still bend toward the interests of those who write it.

Model Context Protocol was created and open-sourced by Anthropic before its governance moved to the foundation, where Anthropic, Block, and OpenAI sit among the founding contributors, according to Anthropic’s announcement. That continuity is not a criticism — continuity is often a virtue. But it does mean the answer to “can an open protocol stay neutral?” is more complicated than the foundation’s letterhead suggests. Neutrality is not conferred by structure. It is practiced, or not, in a thousand small decisions about what the standard makes easy and what it makes hard.

The clearest illustration came when an audit found a command-execution weakness leaving a very large population of publicly reachable servers — around two hundred thousand, by The Register’s account of the findings — exposed to abuse. The response was telling. Rather than treat it as a flaw to correct, the behavior was characterized as expected, a feature rather than a bug, with the burden of sanitization placed on developers, according to The Register. Whatever the technical merits, that is a governance decision dressed as a technical one. The protocol decided whose problem the risk would be. History is full of standards that called themselves neutral while quietly encoding the priorities of whoever held the pen.

The Question Was Never Really About Safety

Thesis: The real issue with third-party MCP servers is not whether they are safe to install, but whether we have built any institution capable of holding anyone accountable when they are not.

Safety is a property you can audit. Accountability is a relationship you have to construct. We have poured enormous effort into the first — mandatory authentication, sandboxed clients, least-privilege scopes, human review in the loop, all of it sound guidance from the OWASP GenAI Security Project. What we have not built is the second. When a poisoned server quietly corrupts a model’s output across thousands of agents, there is no body whose mandate, technical capacity, and enforcement power align to answer for it. The registry verified a namespace. The foundation published a standard. The developer followed best practices. And the harm still happened, with no one obviously at fault.

This is the uncomfortable shape of modern infrastructure. The agent that acts as your MCP Host, the MCP Client that brokers the connection, the server that does the work — each link can be individually defensible while the chain as a whole answers to no one. We did not lose accountability through negligence. We distributed it until it disappeared.

What We Owe the Systems We Invite Inside

A thoughtful response to this is neither panic nor abstention. Refusing the ecosystem is neither realistic nor wise. The answer is humbler and harder: to stop letting the word “install” do our thinking for us.

Before connecting a server, we might ask what it can reach, not only what it promises to do. We might treat connection-time trust and runtime trust as two separate questions, because the protocol’s own designers warn that they are. We might insist, in our teams and our tools, that the convenience of an open ecosystem does not relieve us of the duty to look. These are not technical controls. They are habits of attention — a willingness to be inconvenienced by our own caution.

And perhaps the deeper obligation is collective. If no existing institution can hold the chain accountable, that is not a reason to shrug. It is a reason to ask who should, and what it would take to give them the power to do it.

Where This Argument Is Weakest

Honesty requires admitting where this could be wrong. As of mid-2025, there was no confirmed large-scale, real-world tool-poisoning attack in the wild — the danger remains, for now, a demonstrated capability rather than a documented catastrophe. If the ecosystem’s defenses mature faster than its attackers, if the foundation builds real vetting, if accountability mechanisms emerge that I cannot yet see, then my worry will look like the familiar overcaution of someone who mistook a transition for a crisis. I would welcome being wrong in exactly that way.

The Question That Remains

We built a protocol that lets our machines reach into the world on our behalf, and we secured almost everything about it except the question of who answers when it goes wrong. Convenience arrived first; accountability has not yet caught up. When the systems acting in our name can be quietly turned against us, and no one is clearly responsible, what exactly have we agreed to — and did anyone ever ask us?

Ethically, Alan.

Sources

MCP Security Best Practices: Security Best Practices — Model Context Protocol - Authorization model, OAuth 2.1 requirements, and the named third-party threat model
Anthropic’s announcement: Donating the Model Context Protocol and establishing the Agentic AI Foundation - Governance transfer to the Agentic AI Foundation and founding contributors
the official MCP Registry: Official MCP Registry - Scope of the metaregistry: namespace verification, no code hosting, no security review
SafeDep: The State of MCP Registries - Registry vetting gap and the malicious proof-of-concept submission test
OWASP GenAI Security Project: A Practical Guide for Securely Using Third-Party MCP Servers - Tool poisoning, the connect-time vs runtime trust gap, and recommended controls
The Register: MCP ‘design flaw’ puts 200k servers at risk - The stdio command-execution weakness audit and Anthropic’s response

Aha Moments

MONA

Alan frames this as a trust problem, and the mechanism underneath proves him right. A tool’s description is not inert metadata — it enters the model’s context and is read as instruction. The model cannot distinguish a helpful description from a hostile one; both are simply tokens that shape the next prediction. This is why connection-time inspection and runtime behavior diverge so easily. You can examine a server the moment you attach it and still have no guarantee about what its outputs inject later. The fragility is not a bug in someone’s code. It is a property of systems that treat retrieved text and trusted instruction as the same kind of thing. Until that boundary is made architectural rather than aspirational, attention alone cannot close the gap Alan describes.

MAX

Mona’s boundary problem is, in my language, a missing contract. We verify identity — who published this server — and then quietly pretend we have verified behavior. Those are different guarantees, and conflating them is the original sin here. A namespace check tells you the author’s name. It tells you nothing about what the tool does when no one is watching. The architecture I would want makes the implicit explicit: declared scopes the host enforces, capabilities a server must request rather than assume, and a runtime that treats every tool output as untrusted input until proven otherwise. Alan asks who is accountable. Part of the answer is that accountability requires a specification precise enough to point at the link that failed. Right now the chain is defensible everywhere and answerable nowhere.

DAN

Max wants a tighter contract; Mona wants a real boundary. I see both becoming a market. The moment trust stops being free, it becomes a product — and the teams that turn vetting, runtime monitoring, and scoped permissions into a service will own a category that barely exists today. The closed ecosystems Alan worries about did not win by being safer; they won by making someone visibly responsible. Whoever brings that accountability to the open ecosystem captures the enterprises that cannot move without it. There is real value waiting in the gap between “open” and “trusted.” The protocol gave us reach. The next winners will sell us the confidence to use it. So here is what I keep asking: who gets to certify trust in a system designed so that no one has to?

AI-assisted content, human-reviewed. Images AI-generated. Editorial Standards · Our Editors