OpenAI Agents SDK

Also known as: Agents SDK, openai-agents, @openai/agents

OpenAI’s official open-source framework for building agentic and multi-agent systems with five primitives — Agents, Handoffs, Guardrails, Sessions, Tracing — built on the Responses API and provider-agnostic via LiteLLM adapters. Production successor to the experimental Swarm project.

The OpenAI Agents SDK is OpenAI’s official open-source framework for building production multi-agent systems, with handoffs, guardrails, sessions, and tracing exposed as built-in primitives rather than glue code teams write themselves.

What It Is

When teams started wiring large language models into multi-step workflows — pulling data, calling tools, then deciding what to do next — they discovered that production agents need plumbing the chat completion API never provided: state across turns, safe handoffs between specialized agents, input and output validation, and visibility into what each step actually did. The OpenAI Agents SDK exists to give that plumbing a shape, so teams stop reinventing it for every project. It’s the production successor to Swarm, OpenAI’s earlier experimental project that demonstrated the patterns but was never meant to be deployed.

The SDK is deliberately small. According to OpenAI Agents SDK Docs, it has “very few abstractions” and centers on five primitives. An Agent is a language model paired with instructions and a tool list. A Handoff lets one agent delegate work to another, treating the other as a peer rather than a subroutine. A Guardrail validates inputs before they reach the model and outputs before they reach the user — refusing leaked PII or off-topic requests. A Session keeps conversation state across turns without forcing developers to manage their own memory layer. Tracing records every model call, tool invocation, and handoff so engineers can debug agent behavior the way they debug regular code.
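The five primitives are easiest to picture as plain data and functions. The sketch below is an illustrative stdlib-Python analogy, not the actual openai-agents API; every class, function, and field name in it is invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative stand-ins for the five primitives -- NOT the openai-agents API.

@dataclass
class Agent:                       # a model paired with instructions and tools
    name: str
    instructions: str              # the system prompt
    tools: list[Callable] = field(default_factory=list)
    handoffs: list["Agent"] = field(default_factory=list)  # peers it may delegate to

@dataclass
class Session:                     # conversation state across turns
    history: list[dict] = field(default_factory=list)
    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})

def make_guardrail(check: Callable[[str], bool], reason: str) -> Callable[[str], str]:
    """Validate text before it reaches the model (or the user)."""
    def validate(text: str) -> str:
        if not check(text):
            raise ValueError(f"guardrail tripped: {reason}")
        return text
    return validate

trace: list[dict] = []             # Tracing: one record per step
def record(step: str, **detail) -> None:
    trace.append({"step": step, **detail})
```

The real SDK does far more per primitive (schema validation, span export, handoff negotiation), but the shape is the same: agents are data, guardrails are validators, tracing is an append-only log.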

According to OpenAI Agents SDK Docs, the framework runs on OpenAI’s Responses API by default but isn’t locked to OpenAI. Through LiteLLM and Any-LLM adapters, the same Agent definition can call Anthropic, Google, or open-weight models. It ships in two languages — Python (openai-agents-python) and TypeScript (@openai/agents, repo openai-agents-js) — with feature parity for handoffs, guardrails, tracing, and realtime voice agents. The TypeScript edition followed the Python launch in 2025 to give frontend and full-stack teams the same surface their backend colleagues already used.
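Provider-agnostic routing amounts to a dispatch table keyed on the model identifier. The sketch below is a hypothetical illustration of that idea with stubbed providers; it is not how the LiteLLM or Any-LLM adapters are actually implemented, and all names are invented.

```python
from typing import Callable

# Hypothetical provider registry keyed by model-name prefix.
# Real adapters (LiteLLM, Any-LLM) perform this dispatch for you.
PROVIDERS: dict[str, Callable[[str, str], str]] = {
    "openai":    lambda model, prompt: f"openai:{model} -> {prompt}",
    "anthropic": lambda model, prompt: f"anthropic:{model} -> {prompt}",
}

def call_model(model_id: str, prompt: str) -> str:
    """Dispatch a 'provider/model' identifier to the matching provider stub."""
    provider, _, model = model_id.partition("/")
    if provider not in PROVIDERS:
        raise KeyError(f"no adapter registered for {provider!r}")
    return PROVIDERS[provider](model, prompt)
```

The point of the pattern is that the Agent definition only holds a model string; swapping providers changes one identifier, not the orchestration code.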

How It’s Used in Practice

The most common entry point is a small team building an internal automation that needs more than a single prompt — a customer-support assistant that triages tickets and hands the technical ones to a specialist agent, or a research assistant that fans out a question across web search, vector retrieval, and summarization, then merges the results. Developers define each agent as a Python object with a name, instructions, and a tool list, register tools with a single decorator, and declare handoffs in a one-line list. They get a working orchestration in a few hundred lines.
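The triage flow reduces to three steps: classify, pick a specialist, transfer control. In the sketch below the triage model is faked with a keyword check (in the real SDK, the model itself chooses the handoff), and every name is invented for illustration.

```python
# Stubbed triage-and-handoff loop -- the "model" is faked with a keyword check.

def route(ticket: str) -> str:
    """Stand-in for the triage model's handoff decision."""
    technical = ("error", "crash", "traceback", "timeout")
    return "technical" if any(w in ticket.lower() for w in technical) else "general"

SPECIALISTS = {
    # Each specialist is a peer with its own behavior, not a
    # subroutine of the triage agent.
    "technical": lambda t: f"[tech] reproducing issue: {t}",
    "general":   lambda t: f"[support] replying to: {t}",
}

def handle(ticket: str) -> str:
    target = route(ticket)               # triage decides
    return SPECIALISTS[target](ticket)   # control transfers to the specialist
```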

The SDK’s “agents-as-tools” pattern is what makes that practical. A parent agent treats other agents like callable functions, but the child agent retains its own model, instructions, and guardrails. Compared to chaining raw API calls, this saves teams from writing their own router, their own retry logic, and their own trace capture.
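Agents-as-tools can be sketched as wrapping a child agent in a plain callable while the child keeps its own name and instructions. This is a toy plain-Python analogy, not the SDK's actual mechanism; `MiniAgent`, `as_tool`, and the agent names are all invented here.

```python
class MiniAgent:
    """Toy agent: applies its own instructions to a task (model stubbed out)."""
    def __init__(self, name: str, instructions: str):
        self.name = name
        self.instructions = instructions
    def run(self, task: str) -> str:
        return f"{self.name} ({self.instructions}): {task}"

def as_tool(child: MiniAgent):
    # The parent sees a plain callable; the child keeps its own name and
    # instructions (and, in the real SDK, its own model and guardrails).
    def tool(task: str) -> str:
        return child.run(task)
    tool.__name__ = child.name
    return tool

searcher   = as_tool(MiniAgent("searcher", "find sources"))
summarizer = as_tool(MiniAgent("summarizer", "condense to one line"))

def parent(question: str) -> list[str]:
    # The parent fans the question out and merges the results.
    return [searcher(question), summarizer(question)]
```

The design point: the parent never touches the child's prompt or tools, it only calls the wrapper, which is what keeps the child a peer rather than inlined glue code.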

Pro Tip: Turn on tracing from day one and view runs in the OpenAI dashboard. Agent bugs are rarely model bugs — they’re handoff bugs, tool-schema bugs, or prompt-injection bugs. A trace shows you exactly which agent made which decision with which inputs, which is the difference between a thirty-minute fix and a two-day investigation.
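The value of a trace is easy to see in miniature: a decorator that records each step's inputs and output so a failed run can be replayed. A hypothetical sketch only, not the SDK's tracing implementation; the decorator and tool names are invented.

```python
import functools

TRACE: list[dict] = []

def traced(step: str):
    """Record every call's arguments and result for later inspection."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            out = fn(*args, **kwargs)
            TRACE.append({"step": step, "args": args,
                          "kwargs": kwargs, "output": out})
            return out
        return inner
    return wrap

@traced("tool:lookup_order")
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"
```

With every tool and handoff wrapped this way, "which agent made which decision with which inputs" becomes a query over `TRACE` instead of guesswork.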

When to Use / When Not

| Scenario | Use | Avoid |
| --- | --- | --- |
| Building a multi-agent workflow primarily on OpenAI models | ✓ | |
| Need a single one-shot prompt with no tools or persistent state | | ✓ |
| Want a minimal-abstraction framework with direct control over the loop | ✓ | |
| Require graph-based orchestration with explicit cycles and branches | | ✓ |
| Mixing OpenAI with Anthropic or open-weight models via LiteLLM | ✓ | |
| Locked into the legacy Assistants API for an existing product | | ✓ |

Common Misconception

Myth: The OpenAI Agents SDK is just a rebrand of Swarm with better marketing. Reality: Swarm was an experimental research prototype that OpenAI labeled as “not for production.” The Agents SDK is a separate, production-targeted framework with new primitives — Guardrails, Sessions, and Tracing did not exist in Swarm — and it runs on the Responses API rather than the older Chat Completions endpoint.

One Sentence to Remember

If your project needs more than a single prompt and you’re already on OpenAI, the Agents SDK is the lowest-overhead way to ship a multi-agent workflow with handoffs and tracing built in — start with one agent, add a second only when handoff logic earns its weight.

FAQ

Q: Is the OpenAI Agents SDK free? A: The SDK itself is open source. You pay only for the underlying model calls — OpenAI tokens, or whatever provider you route through LiteLLM, plus any optional dashboard tracing your team enables.

Q: Does the OpenAI Agents SDK work with non-OpenAI models? A: Yes. According to OpenAI Agents SDK Docs, the SDK is provider-agnostic through LiteLLM and Any-LLM adapters, so the same Agent definition can call Anthropic, Google, or open-weight models without rewriting orchestration logic.

Q: Is the OpenAI Agents SDK the same thing as Swarm? A: No. Swarm was an experimental project, now deprecated. According to InfoQ's launch story, the Agents SDK launched in March 2025 as a separate production-ready framework with built-in primitives Swarm never had.

Expert Takes

The SDK’s design choice is conceptually clean: handoffs are not function calls, they are transfers of control between peer agents, each with its own instructions and tools. That distinction matters because it changes how state is reasoned about. Sessions persist across handoffs, but the model evaluating each step is bounded by what its own prompt and tool list permit. Not orchestration magic. A type system for delegation.

Each primitive maps cleanly to a spec-driven workflow. An Agent’s instructions are its system prompt — the spec for what it does. Tools are the spec for what it can touch. Guardrails are the spec for what it must refuse. Sessions are the spec for what it remembers. Tracing is the spec for how you’ll debug it. Define those specs and the orchestration code becomes wiring, not logic.

OpenAI shipping an opinionated SDK signals where the agent layer is consolidating. Teams that picked LangGraph or CrewAI early aren’t abandoning ship overnight, but new builds inside organizations already buying OpenAI tokens default to the official tool. That pattern compounds. The agent framework category was crowded a year ago. It’s narrowing now, and the question for buyers is no longer which framework to learn but which to standardize on.

A production agent framework from the model vendor concentrates an interesting amount of decision-making authority in one place. The SDK chooses what counts as a guardrail, which traces are visible, which handoffs are safe. Open-source license, official docs, and provider-agnostic adapters soften the lock-in story. They don’t eliminate it. When the same company writes the model, the orchestration runtime, and the observability layer, who audits the audit?