Beta header: managed-agents-2026-04-01. All pricing, capabilities, and architecture details verified against Anthropic’s official Managed Agents overview, the Claude Platform release notes, and the Anthropic engineering post published the same day. Written by Ahmad Lala.
Anthropic just launched Claude Managed Agents, and the framing matters more than most launches this year. This is not another tool-use tweak, another model card, or another benchmark. It is Anthropic quietly stepping up the stack – from selling access to a model, to selling access to a worker you can dispatch work to. That is a completely different product.
The 60-Second Brief
Anthropic just launched Claude Managed Agents, a fully managed agent harness that runs Claude as an autonomous worker inside Anthropic’s own infrastructure – sandboxed containers, persistent sessions, built-in tools, memory stores, and event streaming, all exposed through a single API.
The short version: instead of asking developers to wire together their own agent loop, sandbox, tool execution, permission system, and long-running infrastructure, Anthropic is packaging all of that as a managed product. You create an agent, point it at a configured container, start a session, send events, and stream results back. The hard parts – state, recovery, credentials, tool routing – are handled on Anthropic’s side.
Not inside Claude Code. Not inside claude.ai. It’s a new surface on the Claude API.
What it is not
- Not a feature inside Claude Code (the CLI)
- Not a feature inside the Claude desktop or mobile apps at claude.ai
- Not something end-users at your company will see or click on
- Not a replacement for the Messages API – it runs alongside it
What it is
- A new set of Claude API endpoints at platform.claude.com
- Called programmatically from your code, the same way you’d call /v1/messages today
- Accessed via the official SDKs (Python, TypeScript, Go, Java, C#, Ruby, PHP) or the brand-new ant CLI that shipped alongside it
- Enabled by default for every Anthropic API account during the public beta
The absolute minimum to get started
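For orientation, here is a minimal sketch of what dispatching work could look like. Every field name and payload shape below is an assumption for illustration – only the beta header value comes from the documentation – so treat this as a mental model of the agent-then-session flow, not copy-paste code.

```python
import json

# Hypothetical request payloads for the Managed Agents beta.
# Field names and payload shapes are assumptions for illustration;
# only the beta header value below is documented.

BETA_HEADER = {"anthropic-beta": "managed-agents-2026-04-01"}

def create_agent_payload(name: str, model: str, system_prompt: str, tools: list[str]) -> str:
    """Body for a hypothetical create-agent call: the reusable configuration."""
    return json.dumps({
        "name": name,
        "model": model,
        "system_prompt": system_prompt,
        "tools": tools,  # built-in tools: bash, file ops, web search, web fetch
    })

def create_session_payload(agent_id: str, environment_id: str, task: str) -> str:
    """Body for a hypothetical create-session call: dispatch one unit of work."""
    return json.dumps({
        "agent_id": agent_id,              # the agent is reused across sessions
        "environment_id": environment_id,  # container template to run inside
        "initial_event": {"type": "user_turn", "content": task},
    })

payload = json.loads(create_session_payload("agent_abc", "env_xyz", "Onboard client #42"))
```

The shape to internalise is the separation: the agent is configuration you define once, and a session is one dispatched task referencing it by ID.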
TL;DR for the “does this sit inside Claude Code?” question: no. Claude Code is the terminal CLI you already know. Claude Managed Agents is an API product your backend talks to. You can absolutely build a Claude Code–style experience on top of Managed Agents, but they are separate surfaces with separate audiences – Claude Code is for developers working locally, Managed Agents is for your application dispatching autonomous work to Anthropic’s cloud.
If you have been watching agent launches drift for the last twelve months, this is the first one that reads like a real platform. The rest of this article unpacks what it actually is, why it matters, what it costs, what to build with it first, and – just as important – where it will bite you.
Messages API vs Managed Agents vs the Agent SDK
The single most useful thing you can do in the first hour of a launch like this is to place it correctly on the map. Anthropic now has three distinct ways to build with Claude, and they do not compete – they target different problems.
Messages API: you run the loop. Anthropic runs the model.
Best for custom agent loops, fine-grained control, synchronous request/response, and anything that fits within a single prompt or a short chain you orchestrate yourself.
Use it when: latency matters, you want full control of the loop, or the task is short and stateless.
Managed Agents: you send the task. Anthropic runs the worker.
Pre-built agent harness running inside Anthropic’s managed infrastructure. Containers, sessions, tools, MCP, memory, permissions, credentials, and event streaming – all handled for you.
Use it when: the task runs for minutes or hours, needs a persistent filesystem, multiple tool calls, or secure credential handling.
Agent SDK: you host the harness yourself.
The same agent loop that powers Claude Code, as a Python or TypeScript library. You run the sandbox, you deploy the infra, you own operations.
Use it when: you have strict data residency requirements, you already operate a sandbox stack, or you need to customise the harness itself.
Anthropic’s own documentation draws the Messages API / Managed Agents split in exactly this way: direct model access for custom loops and fine-grained control, versus a pre-built hosted harness for long-running tasks and asynchronous work. (There is now a middle option: the advisor strategy lets a cheaper executor model consult Opus for strategic guidance inside a single API request – no infrastructure, no long-running sessions.) The Agent SDK sits behind both – it is the programmable library that lets you host the same harness yourself if you need to.
Quick decision helper: which Claude surface should you use?
Three questions decide it: how long the task runs, how much state it needs, and whether you want to operate the infrastructure yourself.
Short synchronous prompts with minimal state belong on the Messages API. You get the lowest latency and full control of the loop.
If you still aren’t sure, the default answer for most teams is: start on the Messages API, and graduate to Managed Agents the first time you find yourself building your own sandbox or session log. That is the signal that you’ve crossed into runtime work.
The four concepts that make up a Managed Agent
Anthropic’s documentation is unusually clean here. The whole product reduces to four nouns, and once you internalise them the rest of the API basically explains itself.
Agent
The model, system prompt, tools, MCP servers, and skills. Defined once, reused across sessions.
Environment
A container template with pre-installed packages, network rules, and mounted files.
Session
A running agent instance doing a specific task, with a persistent filesystem and event history.
Events
User turns, tool results, and status updates streamed back as server-sent events.
The mental shift this forces on you is important: you are no longer managing a conversation – you are assigning work to a worker that runs in Anthropic’s cloud. That is the product, and it is why the pricing model below is structured the way it is.
The unsexy breakthrough: brain, hands, and the session log
The most interesting part of the launch is not in the marketing page. It is in the engineering post, where Anthropic describes the architecture as a deliberate operating-systems-style decoupling of three things: the brain (Claude and its harness), the hands (the sandbox and the tools), and the session log (an append-only event stream that lives outside both).
The architecture breakthrough
Anthropic’s engineering post describes Managed Agents as a deliberate decoupling of three concerns. The harness no longer lives inside containers – it calls them through a simple execute(name, input) → string interface. When the brain dies, a new instance calls wake(sessionId) and resumes from the last recorded event. That’s what makes containers interchangeable rather than precious.
The interface surface is small on purpose. The engineering post enumerates five primitives that the harness uses to talk to the session and the sandbox, and the whole product is built on top of them:
execute(name, input) → string // run a tool in the sandbox
wake(sessionId) // recover a harness, resume from last event
getSession(id) // load session metadata
emitEvent(id, event) // append to the durable log
getEvents() // read event history back into context

Read that list carefully. It is an operating-systems interface for agents. The harness is stateless, the session is durable, the sandbox is disposable, and every interaction with the outside world flows through a tiny, stable surface. That is what makes Managed Agents a platform rather than a framework.
Why does this matter? Because Anthropic’s previous architecture bundled the harness, the container, and the state into a single long-lived thing. That turned every session into a pet: you had to keep it alive, you paid full setup costs whether you needed them or not, and when something crashed the whole workflow died with it. Decoupling the brain from the hands means containers become interchangeable, failures become recoverable, and – crucially – cold-start overhead collapses.
There is a second, subtler idea in the engineering post worth pulling out. Because the session is an append-only event log that lives outside Claude’s context window, it functions as an interrogable context object: the harness can fetch it, transform it, rewind it, and decide what to replay into the model on each turn. That is a real break from the “truncate or summarise” dance every long-running agent eventually hits. Anthropic’s framing is explicitly operating-systems-style — virtualise the agent components, be opinionated about the interfaces, stay agnostic about what lives behind them. It is the same instinct that gave us processes, file descriptors, and sockets, now applied to agents.
Field note: the “context anxiety” story
The best anecdote in the engineering post is one you will not see on the marketing page. While scaling earlier harnesses, Anthropic’s engineers found that Claude Sonnet 4.5 showed what they called context anxiety — degraded behaviour when sessions grew long — so they shipped context-reset logic to work around it. On Claude Opus 4.5, that workaround became unnecessary and was removed. That is the single clearest argument for why a harness should not hard-code assumptions about the model: the intervention you ship today is the technical debt you rip out next quarter. The old coupled containers were “infrastructure pets” you could not let fail without losing state. Decoupling the brain from the hands is how you stop that debt from calcifying into the product.
Time-to-first-token: before vs after
Anthropic reports roughly a 60% reduction in p50 time-to-first-token and more than 90% at p95. That second number is the operationally important one: it is the tail latency that users actually notice, the one that turns agents from “neat demo” into “thing you trust with production workloads.” Decoupling is how you get there.
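If you want to track the same numbers for your own sessions, a dependency-free nearest-rank percentile is enough for a dashboard. The latency samples below are invented purely to show the mechanics: a couple of slow outliers dominate p95 while barely moving p50, which is exactly why p95 is the number to watch.

```python
# Nearest-rank percentile: small, dependency-free, good enough for a dashboard.
# The TTFT samples below are invented purely to illustrate the mechanics.

def percentile(samples: list[float], p: float) -> float:
    """Return the value at the p-th percentile using nearest-rank on sorted data."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * (len(ordered) - 1))))
    return ordered[k]

ttft_ms = [820, 790, 3100, 840, 810, 9500, 805, 830, 815, 800]
p50 = percentile(ttft_ms, 50)  # the median barely notices the outliers
p95 = percentile(ttft_ms, 95)  # the tail is dominated by them
```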
How Managed Agents fits the 2026 Claude roadmap
- Oct 16, 2025 – Agent Skills launched: portable, on-demand capability packs for Claude (Excel, Word, PowerPoint, PDF, plus custom). The first piece of the agent puzzle.
- Nov 24, 2025 – Claude Opus 4.5 + programmatic tool calling: tools become callable from inside code execution, cutting latency on multi-tool workflows.
- Feb 5, 2026 – Claude Opus 4.6 + compaction API: server-side context summarisation makes effectively-infinite conversations possible. A prerequisite for long-running agents.
- Feb 17, 2026 – Claude Sonnet 4.6 + 1M context GA: the workhorse model for agent loops gets cheaper and longer.
- Apr 8, 2026 (today) – Claude Managed Agents + ant CLI launch: the harness, the runtime, and the developer ergonomics all land in the same release. The pieces finally click into a product.
Read in sequence, the 2026 release cadence tells a story: Skills gave Claude portable capabilities, Opus 4.5 and 4.6 made tool calling cheap and multi-step reasoning cheaper, the compaction API made infinite-context possible, and Sonnet 4.6 made the workhorse model fast enough to run for hours without breaking the bank. Managed Agents is the piece that binds all of it together into something you can actually ship. It is less a new launch than a capstone.
What Claude can actually do inside a session

Managed Agents ships with a first-class set of built-in tools so you don’t have to bring your own sandbox. Claude can run bash commands inside the container, read and write files, glob and grep, search the web, fetch URLs, and call any MCP server you connect. It can also use skills – Anthropic-managed ones for Excel, Word, PowerPoint, and PDF handling, or your own custom skills uploaded through the API.
For enterprise buyers the more interesting details are the ones that don’t show up in model benchmarks. Permissions are configurable per tool with always_allow or always_ask policies, and MCP toolsets default to always_ask – a small but important detail that prevents new tools from silently gaining access when they are added to a server.
Credentials are handled through vaults. You store a secret once, reference it by ID when creating a session, and Anthropic injects it at runtime for MCP connections. Secret fields are write-only, nothing leaks into prompts or API responses, and critically nothing reaches the sandbox where Claude’s generated code actually executes. If you have ever tried to hand-roll credential handling for an agent, you know why this matters.
The engineering post is more specific than the marketing page about how this is wired, and two details are worth calling out. First, for Git-backed tools, the access token never enters the sandbox at all — it is attached to the local remote outside the sandbox boundary, so Claude can clone, pull, and push without ever seeing the credential. Second, for OAuth-backed custom tools, the tokens live in a separate vault and are reached only through dedicated proxies that the harness itself cannot interrogate directly. In other words, Claude can use the credentials via the tool but cannot read them. That is the difference between an agent you can give production scopes to and one you absolutely cannot.
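As a mental model for that boundary, here is a toy write-only vault in Python. It is illustrative only – Anthropic’s actual vault and proxy machinery is not public – but it shows the property that matters: callers only ever hold an opaque ID, and the secret value flows exclusively to the connector at use time.

```python
import secrets

class Vault:
    """Toy write-only secret store. Illustrative only -- not Anthropic's
    implementation. Callers hold opaque IDs; values never flow back to them."""

    def __init__(self) -> None:
        self._secrets: dict[str, str] = {}

    def store(self, value: str) -> str:
        """Write-only: storing returns a reference ID, never the value."""
        secret_id = f"vault_{secrets.token_hex(4)}"
        self._secrets[secret_id] = value
        return secret_id

    def inject(self, secret_id: str, connect):
        """Runtime injection: the secret goes to the connector callable,
        never to the agent-facing caller."""
        return connect(self._secrets[secret_id])

vault = Vault()
token_id = vault.store("ghp_example_token")  # hypothetical token value
# The agent-facing surface sees only the ID, not the secret:
result = vault.inject(token_id, lambda tok: f"connected with {len(tok)}-char token")
```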
Memory stores – currently in research preview – are the other operator-grade feature. A memory store is a workspace-scoped collection of documents the agent consults before starting a task and writes durable learnings into when done. Every change creates an immutable memory_version that supports auditing, rollback, and redaction. You can attach up to eight stores per session, mix read-only and read-write access, and scope them per-user, per-team, or per-project. It is the most compliance-aware memory system I have seen from a frontier lab, and it is a strong signal about who Anthropic is building this for.
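The versioning model is easy to picture as an append-only chain of immutable snapshots. This sketch is an assumption about the semantics (the memory_version name comes from the docs, the code shape does not); note that rollback does not delete history – it appends a new version restoring old content, which is what keeps the audit trail intact.

```python
from dataclasses import dataclass

# Sketch of versioned memory semantics. The memory_version concept is
# documented; this code shape is an assumption for illustration.

@dataclass(frozen=True)
class MemoryVersion:
    """Immutable snapshot -- supports audit, rollback, and redaction."""
    version: int
    documents: tuple[str, ...]

class MemoryStore:
    def __init__(self) -> None:
        self.versions = [MemoryVersion(0, ())]

    def write(self, doc: str) -> MemoryVersion:
        head = self.versions[-1]
        new = MemoryVersion(head.version + 1, head.documents + (doc,))
        self.versions.append(new)
        return new

    def rollback(self, version: int) -> MemoryVersion:
        """Rollback appends a new version restoring an old snapshot;
        history is never deleted, so the audit trail survives."""
        target = self.versions[version]
        restored = MemoryVersion(self.versions[-1].version + 1, target.documents)
        self.versions.append(restored)
        return restored

store = MemoryStore()
store.write("client prefers weekly summaries")
store.write("WRONG FACT: client churned")  # a bad write to undo
restored = store.rollback(1)               # back to the known-good snapshot
```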
Pricing: finally, a number you can benchmark against
This is where the launch gets sharp. Managed Agents bills on two dimensions: normal model token usage at standard rates, plus session runtime at $0.08 per session-hour. Web search inside a session is still $10 per 1,000 searches. That is it. No mystery enterprise pricing, no “contact sales” fog, no bundled surprise.
This matters editorially because you can now write a budget for autonomous Claude work the same way you’d write one for an EC2 instance or a Lambda function. Token cost plus runtime cost. The calculator below gives you a rough feel – drag the sliders for your workload shape. If you also want to estimate the token side in more detail before you commit, pair this with our Claude Code Token Calculator, which is set up for exactly that job.
Interactive: What will a Managed Agents workload cost?
A rough back-of-envelope estimator. Session runtime is billed at $0.08/hour. Model tokens are billed on top at standard rates. Web search is $10 per 1,000 searches.
A simple way to read it: a team running 20 one-hour agent sessions per day on Sonnet 4.6 spends $1.60 a day on session runtime (20 session-hours at $0.08), plus whatever their token bill looks like. That is cheap enough to experiment with today, and legible enough to defend in a budget meeting tomorrow. That combination is what makes it a real infrastructure product rather than a demo.
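The arithmetic is simple enough to encode directly. The two rates below are the documented ones; the function and its parameters are just a convenience wrapper for budgeting, and token costs still bill on top at standard model rates.

```python
# Back-of-envelope Managed Agents runtime budgeting.
# The two rates are documented; the wrapper is a convenience for estimates.
# Model token usage bills on top at standard rates and is not included here.

SESSION_HOUR_USD = 0.08        # session runtime rate
WEB_SEARCH_PER_1K_USD = 10.0   # web search rate per 1,000 searches

def daily_runtime_cost(sessions_per_day: int, hours_per_session: float,
                       searches_per_session: int = 0) -> float:
    """Daily runtime + web search cost in USD, excluding tokens."""
    runtime = sessions_per_day * hours_per_session * SESSION_HOUR_USD
    search = sessions_per_day * searches_per_session / 1000 * WEB_SEARCH_PER_1K_USD
    return round(runtime + search, 2)

# 20 one-hour sessions a day, 30 web searches each:
cost = daily_runtime_cost(20, 1.0, 30)  # runtime $1.60 + search $6.00
```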
Use cases that feel real, not generic
The Notion onboarding demo is the one to anchor on. Notion showed a Managed Agent working through a long client-onboarding checklist while a dashboard exposed exactly which tools the agent was using at each step. That is the story that sells the product to non-developers, because it reframes agents as operational work that can be delegated and observed, not magic.
Customer onboarding runners
Notion’s launch demo showed a Managed Agent chewing through a long client-onboarding checklist, operating inside Notion itself while a dashboard exposed each tool call in real time.
Analyst copilots that run for hours
Long-running research sessions that browse, pull reports, manipulate spreadsheets with the Excel skill, and produce a briefing at the end – all with persistent memory.
Content & research agents
Agents that research a topic, fetch sources, draft, critique, and revise inside a container with web fetch enabled – a natural fit for editorial workflows.
Internal ops agents
Where the moat is permissions and credentials, not model intelligence. Vaults inject secrets at runtime so tokens never touch prompts or generated code.
Support & ticket triage
Sessions attached to a per-user memory store carry forward account context, tone preferences, and past resolutions across every new ticket.
Async software tasks
File manipulation, migrations, codebase refactors, doc cleanups – exactly the kind of multi-tool, minutes-long work that chokes the Messages API.
Notice the pattern in all of these: they are cases where a single synchronous prompt is the wrong shape, and where building your own harness would be the biggest cost. Managed Agents is aimed squarely at that middle ground.
Five operator-grade use cases that map cleanly to existing blueprints
If you’re looking for concrete workflows to port, the easiest wins are the ones we’ve already documented on ChatGPT Guide – they were designed for exactly the “long-running, stateful, multi-tool” shape Managed Agents was built to host.
1. Month-end finance close. The AI Agent for Month-End Close Automation blueprint is almost a textbook match: long runtime, file manipulation, multiple systems via MCP, credentials that absolutely must not leak. The vault story alone is worth the migration.
2. Regulatory document review. The regulatory review agent blueprint is a natural fit because a read-only memory store can carry the regulation corpus, while a read-write store captures review notes across sessions. Audit trail through memory versions is a compliance bonus.
3. Weekly status reporting. The weekly status report automation workflow runs perfectly well on a once-a-week Managed Agents session – web fetch for data sources, the docx skill for the output file, a per-team memory store for tone and formatting conventions.
4. GDPR Article 35 DPIA workflows. The DPIA workflow blueprint is exactly the kind of task where always_ask permissions and versioned memory are more valuable than model intelligence. Compliance reviewers will love the redaction endpoint.
5. Knowledge-heavy support triage. Per-customer memory stores carry account context forward across tickets, MCP servers connect to the ticketing and CRM stack, and vault-backed credentials keep tokens out of every transcript. Of the five blueprints here, this one maps most directly onto what the product was designed for.
What to build with it this week
If you want to get hands-on with Managed Agents before the rest of the market catches up, pick a workflow that is currently painful to run on the Messages API because it is too long, too multi-tool, or too stateful. Three concrete starter ideas:
1. A client onboarding runner. Put a structured checklist in a memory store, attach a read-write store for per-client progress, and let the agent work through the steps in a container with your MCP server connected. This is almost exactly the Notion demo, and it is the fastest path to a useful production agent.
2. A weekly research brief agent. Session runtime + web search + the docx skill. Input: a topic and a list of sources. Output: a structured Word document delivered to a shared folder. Attach a per-user memory store so the agent learns your voice and formatting conventions over time.
3. A support ticket triage agent. Sessions attached to a per-customer memory store, MCP servers for your ticketing and CRM systems, vault-backed credentials, and always_ask permissions for anything that mutates data. This is the shape of workflow Managed Agents was most obviously designed for.
⚠ What will break first (an honest field note)
- Permission fatigue. MCP toolsets default to always_ask. That’s safer, but a long session with a lot of tool calls is going to produce a lot of approval prompts. Plan your UI for it or you’ll train users to click through blindly.
- Session-hour creep. $0.08/hour feels tiny until an agent loops on a stuck task for six hours overnight. Set hard session timeouts on your side and monitor runtime the way you monitor spend on a cloud function.
- Memory-store drift. Memory is in research preview. It will accumulate wrong facts if you don’t build a review workflow. Use the version endpoints to audit changes and redact bad writes before they harden into “truth.”
- Beta header churn. managed-agents-2026-04-01 is version-dated for a reason. Expect behaviours to refine between releases – don’t pin critical workloads on undocumented quirks.
- The “runs for hours” trap. Long-running is a feature, not a target. Most workflows that feel like they should run for hours are actually three workflows glued together badly. Break them up before the bill trains you to.
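For session-hour creep specifically, the cheapest insurance is a hard wall-clock budget in your own dispatch code. The polling shape below is illustrative – in production you would cancel the runaway session through the API rather than simply stop stepping it – but the discipline is the point: no agent loop runs without a deadline.

```python
import time

# Illustrative wall-clock budget around an agent loop. In production you
# would cancel the session via the API on timeout; here we just stop stepping.

def run_with_budget(step, max_seconds: float, clock=time.monotonic):
    """Drive `step()` until it reports done, or raise once the budget is spent.
    `step` returns (done, output) each call."""
    start = clock()
    results = []
    while clock() - start < max_seconds:
        done, output = step()
        results.append(output)
        if done:
            return results
    raise TimeoutError(f"session exceeded {max_seconds}s budget")

# A fake three-step task to show the shape:
steps = iter([(False, "plan"), (False, "edit"), (True, "done")])
out = run_with_budget(lambda: next(steps), max_seconds=5.0)
```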
None of these are dealbreakers, and most of them are the sort of caveats you get with any new infrastructure product. But if you skip past them in the first week, you will hit every single one in the second.
The real story: Anthropic is becoming an infrastructure company

Most launch coverage you will read will focus on “Claude can now run agents.” That is not the interesting story, because Claude could already run agents. The interesting story is that Anthropic is taking the unsexy parts of running agents – containers, state, recovery, permissions, credentials, memory – and turning them into a productised runtime layer.
That is a much more important shift than another model update. It pushes Anthropic further up the stack, deeper into customer workflows, and higher-margin at the same time. It also sets up a very clean competitive story: if you want the model, use the API. If you want the worker, use Managed Agents. If you want to host the worker yourself, use the Agent SDK.
The line I’d use in a meeting: this launch is less about agent demos and more about Anthropic trying to own the production layer underneath every autonomous Claude workload on the planet. It’s an infrastructure play dressed up as an agent launch, and the pricing, the permissions story, and the memory architecture all point in the same direction.
Managed Agents glossary (so future-you and future-AI can parse this cleanly)
- Agent
- A reusable configuration: model, system prompt, tools, MCP servers, skills. Defined once, referenced by ID.
- Environment
- A container template with pre-installed packages, network access rules, and mounted files.
- Session
- A running instance of an agent inside an environment, performing a specific task. Has its own filesystem and event log.
- Event
- Anything exchanged during a session: user messages, tool calls, tool results, status updates, streamed via SSE.
- Memory store
- A workspace-scoped collection of text documents the agent reads before starting a task and writes learnings into when done.
- Memory version
- An immutable snapshot created every time a memory is mutated. Supports audit, rollback, and redaction.
- Vault
- Write-only secret storage. Referenced by ID at session creation, injected at runtime, never returned in API responses.
- Skill
- A packaged capability (Excel, Word, PDF, custom) loaded on demand rather than bloating the system prompt.
- Beta header
- managed-agents-2026-04-01 – required on every Managed Agents request during the beta period.
Frequently asked questions
What is Claude Managed Agents in one sentence?
It’s a fully managed agent harness from Anthropic that runs Claude as an autonomous worker inside sandboxed containers, with built-in tools, persistent sessions, memory stores, and event streaming – all exposed through the Claude API.
How is it different from the Messages API?
The Messages API gives you direct access to the model and expects you to build your own loop. Managed Agents gives you the whole harness: container, tools, state, permissions, and a session log. The Messages API is for prompts; Managed Agents is for long-running work.
How is it different from the Claude Agent SDK?
The Agent SDK is a Python/TypeScript library that lets you build and host an agent loop yourself. Managed Agents is the hosted version of that same idea, run on Anthropic’s infrastructure. If you don’t want to operate sandboxes, you want Managed Agents.
What does it cost?
Anthropic bills on two dimensions: normal model token usage and session runtime at $0.08 per session-hour. Web search still costs $10 per 1,000 searches inside a session. Use the calculator above for a rough estimate.
Is it generally available?
It launched on April 8, 2026 in public beta. All endpoints require the managed-agents-2026-04-01 beta header. Memory stores, multi-agent, and outcomes are in research preview behind an access form.
What tools can agents use out of the box?
Bash, file operations (read, write, edit, glob, grep), web search, web fetch, and any MCP server you connect. Skills for Excel, Word, PowerPoint, and PDF handling are also available.
How does security and credential handling work?
Secrets live in vaults, are referenced by ID at session creation, and are never returned in API responses. Anthropic injects them at runtime for MCP connections, so tokens don’t appear in prompts, logs, or generated code.
Can agents remember things across sessions?
Yes, through memory stores (research preview). A memory store is a workspace-scoped collection of documents that the agent checks before starting a task and writes durable learnings to when done. Every change creates an immutable version for audit, rollback, and redaction.
What are the rate limits?
60 create requests per minute (agents, sessions, environments) and 600 read requests per minute per organization. Standard tier-based spend limits also apply on top.
Should I migrate my existing Claude integration?
Only if your task is long-running, multi-tool, or needs a persistent filesystem. Short synchronous prompts belong on the Messages API. Long autonomous work belongs on Managed Agents. Everything in between is a judgement call.
Who is this really for?
Developers and teams who want to ship autonomous agents without running their own container infrastructure, sandbox, permission system, credential vault, and session log. If you’ve already built that layer, the Agent SDK probably still fits better.
Last updated: April 8, 2026. Based on Anthropic’s official Managed Agents documentation, the Claude Platform release notes, and the engineering post on scaling Managed Agents published the same day. Written and tested by Ahmad Lala for ChatGPT Guide.

