MONA explainer 10 min read May 25, 2026 Updated July 9, 2026

MCP Architecture Explained: Hosts, Clients, Servers, and the Tools-Resources-Prompts Primitives

Diagram of MCP architecture linking a host, clients, and servers exposing tools, resources, and prompts over JSON-RPC

Table of Contents

ELI5

Model Context Protocol (MCP) is an open standard that lets an AI application reach external tools and data. A host runs clients; each client holds one dedicated connection to a server that supplies tools, resources, and prompts.

Model Context Protocol gets described as the USB-C of AI — one universal port for plugging tools into a model. The analogy sells the convenience and hides the design. A USB-C cable does not care who controls the device on either end. MCP cares enormously. Its architecture is built around a question most API designs ignore: who is allowed to decide what happens next?

Three Roles, One Strict Rule

Strip away the marketing and MCP is a topology — a fixed arrangement of parts with rigid rules about how they connect. Three roles do all the work, and the relationship between them is more constrained than the casual “plug in a tool” framing suggests. Before the primitives, before the wire format, there is just this arrangement.

What are the main components of the MCP architecture (host, client, and server)?

The MCP Host is the AI application itself — Claude Desktop, Claude Code, an IDE. It is the thing the user actually opens, and its job is to coordinate one or more clients (MCP Specification).

A MCP Client is not a separate program. It is a component living inside the host, and it maintains a dedicated connection to exactly one server. One client, one server, one channel — the relationship is strictly one-to-one.

An MCP Server is a program that supplies context — tools, resources, and prompts — and it can run locally as a subprocess or remotely behind an HTTP endpoint.

The detail people skip is the multiplicity rule: the host creates one client per server, by design. It does not run a single client that multiplexes many servers. Connect a host to three servers and you get three independent clients, three separate connections, three separate negotiations of what each side can do.

Think of each client as a dedicated translator assigned to a single foreign counterpart. The translator learns one counterpart’s vocabulary, holds one line of communication open, and never speaks for anyone else. The isolation is the point: when one server misbehaves, the failure stops at its client instead of leaking across every connection the host holds.

That topology tells you how the parts connect. It says nothing yet about who decides what flows through each connection — and that is where MCP gets genuinely interesting.

Who Holds the Steering Wheel

A server exposes three kinds of capability, called primitives. On the surface they look like variations of the same idea — each is something the server offers the model. They differ on the dimension MCP treats as load-bearing: control. Each primitive answers a different question about who initiates its use.

What are MCP tools, resources, and prompts?

The three server primitives split cleanly by their controller.

Primitive	Controlled by	What it is	Discovery / invocation
Tools	Model	Executable functions the model invokes — file operations, API calls, database queries	`tools/list`, `tools/call`
Resources	Application	Read-only context data identified by a unique URI — file contents, database records, API responses	`resources/list`, `resources/read`
Prompts	User	Reusable templates the user explicitly selects — system prompts, few-shot examples	user-selected

Tools are model-controlled. The model discovers what is available through tools/list and decides on its own when to fire one through tools/call. Because the model is the one pulling the trigger, the specification says a human should remain in the loop to approve calls (MCP Docs).

Resources are application-controlled. They are read-only context — a file’s contents, a database record, an API response — each addressed by a unique URI. The host application, not the model, decides when to read them through resources/list and resources/read.

Prompts are user-controlled. They are reusable templates — a system prompt, a set of few-shot examples — that the user explicitly selects to structure an interaction.

So these are three different hands on the wheel. Not three flavors of the same capability. Three distinct answers to “who initiates this?” — the model, the application, the user.

A tool call returns its output as unstructured content (text, image, audio, or an embedded resource) and optionally as structuredContent validated against a declared output schema; failures come back flagged with isError: true rather than as protocol-level exceptions. That separation lets a host show a human-readable result and feed a machine-readable one to the model from the same call.

One clarification that prevents a common mix-up: the client side has its own primitives — sampling, elicitation, and logging — which let a server request a model completion, ask the user for input, or emit logs. Those are a separate set. When this article says “the three primitives,” it means the server-side trio of tools, resources, and prompts.

To use any of this, host and server first have to agree on a shared language and a shared set of rules. That language is older and more familiar than MCP itself.

What You Already Know That Maps Onto MCP

MCP is less novel than its branding implies, and that is good news. If you have built against a web API, you already hold most of the mental model. The protocol mostly arranges familiar pieces into a strict contract.

What do you need to know before learning MCP (JSON-RPC, APIs, and the client-server model)?

Three pieces of prior knowledge carry almost all the weight.

The client-server model is the first. A client opens a connection and makes requests; a server listens and responds. MCP inherits this directly — the only twist is that the “client” lives inside the host application rather than being a standalone process.

The wire protocol is the second. Every MCP message is JSON-RPC 2.0 — a small, stable standard with three message shapes: requests (expecting a reply), responses, and notifications (fire-and-forget). When you see tools/call or resources/read, those are JSON-RPC method names, nothing more exotic.

APIs are the third, with one upgrade. Calling a known endpoint is ordinary REST. MCP adds standardized discovery: instead of hardcoding which functions exist, the model asks tools/list at runtime and learns what this server offers right now. Discovery happens at runtime, not at compile time — that is the shift that lets a model adapt to servers it was never explicitly programmed against.

MCP organizes all of this into two layers. The data layer is the JSON-RPC protocol itself — the primitives, the method calls, and the lifecycle. The transport layer is the plumbing underneath: the channel that carries the bytes plus authorization. There are two current transports: stdio, where a local server runs as a subprocess and communicates over standard input and output, and Streamable HTTP, where a remote server is reached over HTTP POST.

The protocol is stateful. A connection opens with an initialize handshake in which both sides negotiate a protocol version and declare their capabilities — what each can and cannot do — before any real work begins. For remote HTTP connections, authorization is not optional: MCP mandates OAuth 2.1 with PKCE, and requires Resource Indicators (RFC 8707) so a token issued for one server cannot be silently replayed against another (MCP Specification). As of the 2025-11-25 revision, that is the current shape of the authorization requirement.

MCP topology: a host running multiple clients, each holding a one-to-one connection to a server that exposes tools, resources, and prompts over JSON-RPC — One host coordinates multiple clients, and each client holds a single dedicated connection to one server.

Version & transport notes:
SSE transport deprecated: The standalone HTTP+SSE transport was deprecated in specification revision 2025-03-26 and replaced by Streamable HTTP. The two current transports are stdio (local) and Streamable HTTP (remote). Some vendors hold SSE backward compatibility into 2026, but it is not a current primary transport.
Specification revision: The latest stable revision is 2025-11-25. Revisions are date strings, not semantic versions — older material may reference 2024-11-05, 2025-03-26, or 2025-06-18.

What the Topology Predicts

Once the control model and the one-to-one rule are clear, several of MCP’s everyday behaviors stop being surprising and start being predictable.

If you connect a host to several servers, expect several independent clients and several separate capability handshakes — not one shared session pooling everything together.
If a tool fires without asking you first, the host skipped the human-in-the-loop step the specification recommends. The protocol suggests that approval; it does not enforce it. The gap is an implementation choice.
If a remote server rejects your connection before any tool runs, suspect the OAuth 2.1 authorization layer before you suspect the tool logic — auth sits underneath the data layer and fails first.

Rule of thumb: match the primitive to its controller — model-driven actions are tools, application-supplied context is resources, user-invoked templates are prompts. If you cannot name who initiates a capability, you have not finished designing it.

When it breaks: the one-to-one client-server topology means context does not flow between servers on its own. A tool on one server cannot see a resource on another unless the host explicitly stitches the two together. At scale, that coordination burden lands on the host, not the protocol — and host juggling many servers becomes the place where complexity quietly accumulates.

The Data Says

MCP’s architecture is a deliberate separation of control wearing the costume of a connector. The interesting layer is the contract, not the wire — the contract over who acts: tools answer to the model, resources to the application, prompts to the user. Introduced by Anthropic in November 2024 and refined across several stable revisions since, the protocol’s durability comes from writing that control model down instead of leaving it to whoever builds the integration (Anthropic).

Aha Moments

MAX

What I appreciate here is that the control model is part of the specification, not a convention someone hopes you follow. Tools answer to the model, resources to the application, prompts to the user — and the protocol writes that down explicitly instead of leaving it to whoever builds the integration. That is the difference between a system you can reason about and one you debug by surprise. The one-to-one client-server rule is the same instinct: a hard boundary you can point at when something misbehaves. Most integration bugs I see trace back to ambiguity about who is allowed to act. MCP closes that gap by making the answer part of the contract. Define the boundary up front, and a whole class of “who called this tool” incidents stops happening before it starts.

DAN

Max is right that the contract is the point, and that is exactly why this reaches past engineering. A standard that nails down who controls what is a standard others can build on without renegotiating terms every time. That is how an ecosystem forms. Once major applications and platform vendors agree on the same wiring, every new server becomes instantly useful across every compliant host — the value compounds with each addition. The protocol maturing past its first year with broad adoption is the signal: this stopped being an experiment and became infrastructure. When the plumbing standardizes, the contest moves up the stack to whoever builds the most useful servers. That race is just getting started, and whoever ships early owns the defaults everyone else inherits.

ALAN

Both of you are celebrating the contract, and I want to sit with one word inside it. The specification says a human should be in the loop to approve tool calls. Should — not must. That single word is where the architecture meets the real world. A model-controlled primitive means the model decides when to act, and the only thing between that decision and your file system is an approval step the protocol merely recommends. Standardizing the wiring also standardizes the blast radius when someone quietly drops that step to make a demo smoother. We are encoding, into shared infrastructure, who gets to act on whose behalf. So when a tool runs and something breaks, who exactly authorized it — the user who clicked once, the host that skipped the prompt, or the model that decided it was time?

AI-assisted content, human-reviewed. Images AI-generated. Editorial Standards · Our Editors