Helicone

Also known as: Helicone AI, LLM observability proxy, AI request gateway

Helicone: Helicone is an open-source LLM observability platform that routes AI requests through a proxy to track costs, log every request, and analyze usage. Acquired by Mintlify in March 2026, it entered maintenance mode with no new features planned.

Helicone is an open-source LLM observability platform that routes AI API calls through a proxy to track costs, log requests, and analyze usage patterns in real time.

What It Is

Most teams building on LLM APIs hit the same wall late: they can’t explain their bill. Which requests cost the most? Which prompts triggered retries? Which users drove that spike in token usage? Without visibility into individual API calls, cost attribution is guesswork — and making routing decisions without per-request data means optimizing against estimates instead of measurements. Helicone was built to answer those questions by sitting between your code and the model provider — a transparent proxy that records every request and response without requiring changes to your application logic.

The core mechanism is direct. Instead of sending your OpenAI or Anthropic API requests to the provider’s endpoint, you replace the base URL in your client configuration with Helicone’s proxy endpoint. Helicone receives the request, forwards it securely to the provider, logs the full request and response, and returns the provider’s response to your application. Think of it like a relay station that reads every transmission in transit: the signal reaches the destination unchanged, but the station records what it carried, how long it took, and what it cost. From your code’s perspective, almost nothing changes. From your observability perspective, every call is now timestamped, priced, and searchable.

The platform captures cost per request, latency per call, user-level analytics (when you pass a user identifier as a request header), prompt versioning history, and request replay for debugging failed calls. Custom properties let you tag requests by feature, experiment, or user segment — making it possible to break costs down by product area. According to Helicone Docs, it supports unified gateway access across more than one hundred LLMs including OpenAI, Anthropic, and Google models.

There is a critical status update every team evaluating Helicone must know: According to Helicone Blog, Helicone was acquired by Mintlify on March 3, 2026, and immediately entered maintenance mode. Security updates and bug fixes continue, but no new features, integrations, or product roadmap work will happen. Teams building new observability infrastructure should evaluate actively maintained alternatives. Existing Helicone integrations remain functional for the foreseeable future, but any team planning to rely on Helicone for evolving routing or compliance requirements should plan migration now.

How It’s Used in Practice

The most common scenario is a product team tracking token spend across different prompt versions. Before model routing decisions — such as deciding whether to use a cheaper model for simple queries and a more capable one for complex tasks — teams need baseline cost data per request type. Helicone provides that data without requiring a data engineering project: add the proxy endpoint, tag requests by type, and start reading dashboards within minutes.

In the context of implementing model routing, Helicone gives you the cost visibility to measure whether your routing logic is actually saving money. If you route 80% of queries to a cheaper model, you want to verify that the cheaper model’s failure rate and retry count don’t erase the savings. Helicone’s per-request logging makes that calculation possible — per request, not just per month.

Pro Tip: If you are starting a new project today, use Helicone’s documentation and architecture as a reference — its proxy model is the right pattern — but deploy an actively maintained alternative like Portkey or OpenRouter for production. Helicone will not gain new features, so routing requirements that evolve over the next year may outpace what it can support.

When to Use / When Not

Scenario	Use	Avoid
Existing project with Helicone already integrated	✅
New AI project starting from scratch		❌
Learning how proxy-based LLM observability works	✅
Needing active support for new model providers or integrations		❌
Short-term experiment to measure baseline token costs	✅
Building production routing logic that will evolve over 12+ months		❌

Common Misconception

Myth: Helicone is a fully maintained, enterprise-grade LLM gateway you can build long-term infrastructure on.

Reality: As of March 2026, Helicone is in maintenance mode following its acquisition by Mintlify. The platform remains operational — requests still route, logs still appear, dashboards still work — but no new capabilities will be added. Teams that need evolving observability features, new model support, or new analytics should plan migration to an actively developed alternative.

One Sentence to Remember

Helicone gives you clear visibility into what every LLM call costs and how it behaves — an important baseline before building model routing logic — but because it entered maintenance mode in 2026, treat it as a reference architecture rather than a long-term production dependency.

FAQ

Q: Is Helicone still usable in 2026? A: Yes. The platform remains live and functional — requests route, logs appear, dashboards work. Security updates continue. But no new features or integrations will be added, so teams with evolving needs should evaluate actively maintained alternatives.

Q: How does Helicone track LLM costs without access to billing APIs? A: Helicone uses token counts in each LLM response and known per-token pricing to calculate request-level costs at the proxy layer, not through provider billing integrations. This gives per-request cost data, not just monthly totals.

Q: What is the difference between Helicone and an LLM gateway like Portkey or OpenRouter? A: Helicone focuses on observability — logging, cost attribution, analytics. Portkey and OpenRouter add active routing, failover, and provider switching. There is overlap, but Helicone’s emphasis was always on visibility rather than control.

Sources

Helicone Blog: Helicone is Joining Mintlify - Official announcement of the acquisition and maintenance mode status
Helicone Docs: Cost Tracking & Optimization - Technical documentation for proxy integration and cost tracking capabilities

Expert Takes

MONA

A proxy-based observability layer is elegant in principle: every LLM request passes through a single measurement point, so cost and latency data is complete by construction — no sampling, no instrumentation gaps. The architecture also decouples your observability concerns from your model-provider contracts. That decoupling is why Helicone’s maintenance mode matters less for understanding the pattern than for planning which tool actually implements it in your stack.

MAX

Before you commit to any routing strategy, you need request-level cost data — and that data has to come from the actual path your requests take, not an estimate. Helicone’s one-line integration demonstrates the right spec: intercept at the HTTP transport layer, capture headers and body, log asynchronously, return the upstream response unchanged. Any replacement tool should meet the same integration contract. Migration from Helicone is direct because the proxy model is a standard pattern.

DAN

Mintlify acquiring Helicone closed one of the cleaner entry points to LLM cost visibility, but it also confirmed the pattern: observability infrastructure for AI is valuable enough to acquire, and the teams that built it understood the proxy model was the right abstraction. The question now is which actively developed tool inherits that space. Tools in maintenance mode fall behind fast, and the routing requirements this market is generating will outpace what Helicone can support.

ALAN

A proxy that logs every LLM request also logs every piece of data those requests contain. Teams using Helicone for user-facing features should check what data their requests carry and whether that data passes through Helicone’s servers — or whether a self-hosted deployment addresses the concern. Observability and privacy are not automatically in conflict, but they require explicit decisions about data residency, not defaults.

Back to Glossary