Docker Sandboxes Are a Meaningful Step Toward Safer Coding Agents — Here’s What Still Matters

Shawnee Foster
DECEMBER 19, 2025
4 MIN READ
THOUGHT LEADERSHIP

Docker recently announced Docker Sandboxes, a lightweight, containerized environment designed to let coding agents work with your project files without exposing your entire machine. It’s a thoughtful addition to the ecosystem and a clear sign that agent tooling is maturing.

Sandboxing helps solve an important problem: agents need room to operate. They install packages, run code, and modify files — and giving them that freedom without exposing your laptop makes everyone sleep a little better.

But environment isolation only addresses one slice of the risk. A sandbox controls where code runs and which local files an agent can modify. It doesn’t control what the agent is allowed to do across systems — or how confidently it interprets the task you gave it.

And that’s where most real-world issues show up.


What Docker Sandboxes Solve Well

Docker’s approach is well aligned with what developers need today:

  • Environment isolation
  • Filesystem boundaries
  • Reproducible workspaces
  • Protection from runaway or untrusted local code, including destructive filesystem actions
  • Support for modern coding agents like Claude Code and Gemini CLI

For local workflows, this reduces a huge category of risk without slowing anyone down. It’s likely to become the default way coding agents run on desktops.

But even inside a perfect container, an agent can still make the wrong high-level choice.


Where Most Agent Failures Actually Occur

Across the industry, teams experimenting with coding agents have seen a consistent pattern:

The agent behaves correctly — but outside the intent of the request.

Common examples include:

  • Merging a PR that was meant only for review
  • Rewriting configuration files it believes are outdated
  • Deleting test data that “seems unused”
  • Refactoring files in ways that pass tests but subtly change behavior
  • Cleaning up directories more aggressively than intended
  • Treating ambiguous content (logs, comments, emails) as instructions
  • Issuing destructive commands against live databases because it lacks production context

No exploit. No sandbox escape. No malice.

Just capability, confidence, and permissions that were broader than the task required.

Importantly, these failures aren't caused by unsafe code execution. Execution sandboxing is a necessary foundation, and it does its job well. The failures arise because sandboxing isolates only the agent's execution: the process and filesystem it runs in. It constrains where the agent can run code and which local resources it can access, but it does not constrain:

  • what permissions the agent holds
  • which tools or APIs it can call
  • what actions those tools are allowed to perform
  • how broadly the agent interprets ambiguous instructions

Those controls live above the sandbox boundary, at the decision, permission, and policy layers.

A sandbox can prevent unintended actions from affecting your machine. It can't determine whether the action itself was appropriate.

That’s a different kind of safety.

Once you draw that boundary clearly, the shape of a safer agent architecture becomes much more obvious.


A More Complete Approach to Agent Safety

Here’s the layered model many teams have converged on as they adopt agents into real workflows. The layers are complementary: in practice, teams use several of them at once, including sandboxing, to achieve meaningful safety.

1. Least Privilege Access

Agents should never inherit the full set of capabilities a human has.

Limiting access by default prevents the majority of unwanted outcomes:

  • Read vs. write
  • Scoped access to specific directories or repositories (enforced at the filesystem or authorization layer)
  • Review vs. merge
  • Comment vs. commit

If an agent doesn’t have permission to take a sensitive action, it can’t accidentally take it.
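
To make this concrete, here is a minimal sketch of a deny-by-default policy in Python. The `AGENT_POLICIES` table and `is_allowed` helper are hypothetical, not any specific product's API; the point is that anything not explicitly granted is simply unavailable to the agent.

```python
# Hypothetical deny-by-default policy: each agent gets an explicit
# allowlist of (tool, action) pairs; anything not listed is refused.
AGENT_POLICIES = {
    "code-review-agent": {
        ("github", "read_pull_request"),
        ("github", "comment_on_pull_request"),
        # ("github", "merge_pull_request") is deliberately absent.
    },
}

def is_allowed(agent_id: str, tool: str, action: str) -> bool:
    """Return True only if the action is explicitly granted to this agent."""
    return (tool, action) in AGENT_POLICIES.get(agent_id, set())

# The agent can comment on a PR, but cannot merge it even if asked to.
assert is_allowed("code-review-agent", "github", "comment_on_pull_request")
assert not is_allowed("code-review-agent", "github", "merge_pull_request")
```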

2. Proper Authentication & Authorization

Environment isolation protects the machine. Permissions protect the systems.

The strongest patterns emerging include:

  • User-scoped OAuth with precise, minimal scopes
  • Just-in-time authorization instead of global tokens
  • Zero exposure of credentials to the model
  • Fine-grained control at the tool/action level

This prevents “confident but unintended” actions from reaching beyond their appropriate scope.
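
A rough sketch of the just-in-time pattern, assuming a hypothetical token broker (`get_token_just_in_time`) and tool (`send_reply`): the credential is minted at the moment the tool runs, scoped to one user and one capability, and never enters the model's context.

```python
import time
from dataclasses import dataclass

@dataclass
class ScopedToken:
    """Short-lived credential bound to one user and a minimal scope set."""
    user_id: str
    scopes: tuple[str, ...]
    expires_at: float

def get_token_just_in_time(user_id: str, scopes: tuple[str, ...]) -> ScopedToken:
    # Hypothetical broker call: in a real system this would go through the
    # provider's OAuth flow or a secrets service, not be minted locally.
    return ScopedToken(user_id=user_id, scopes=scopes, expires_at=time.time() + 300)

def send_reply(user_id: str, thread_id: str, body: str) -> None:
    """Tool implementation: the token is fetched here, used, and discarded.

    The model only sees the tool's name and arguments, never the credential.
    """
    token = get_token_just_in_time(user_id, scopes=("mail.send",))
    # ... call the provider API with `token`, then let it expire ...
```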

3. Execution Sandboxing (Docker’s part)

This layer handles:

  • untrusted code execution
  • local dependencies
  • package installs
  • runtime containment
  • resource boundaries

Docker’s solution fits this layer extremely well.
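
For illustration, here is roughly what that containment looks like when driven from Python via the Docker SDK (docker-py). This is a generic container-isolation sketch with a placeholder project path, not the Docker Sandboxes product API: mount only the project directory, disable networking, and cap resources.

```python
import docker

client = docker.from_env()

output = client.containers.run(
    image="python:3.12-slim",
    command=["python", "-c", "print('hello from the sandbox')"],
    volumes={"/path/to/project": {"bind": "/workspace", "mode": "rw"}},  # only the project dir
    working_dir="/workspace",
    network_disabled=True,    # no outbound network for untrusted code
    mem_limit="512m",         # resource boundaries
    nano_cpus=1_000_000_000,  # roughly one CPU
    remove=True,              # discard the container when it exits
)
print(output.decode())
```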

4. Auditing and Traceability

When something unexpected happens, teams need to see:

  • what the agent saw
  • what it understood
  • what it decided
  • what it executed
  • what the system allowed

This isn’t only for security — it’s also for debugging and trust-building.
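
One lightweight way to capture this is an append-only audit record per step. The `audit_record` structure below is a hypothetical sketch of the fields involved, not a prescribed schema:

```python
import json
import time
import uuid

def audit_record(agent_id: str, observed: str, decision: str,
                 tool_call: dict, allowed: bool) -> dict:
    """Hypothetical audit entry: what the agent saw, what it decided,
    what it executed, and what the system allowed."""
    return {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "agent_id": agent_id,
        "observed": observed,    # the input/context the agent saw
        "decision": decision,    # the agent's stated plan or rationale
        "tool_call": tool_call,  # what it actually executed
        "allowed": allowed,      # what the policy layer permitted
    }

entry = audit_record(
    agent_id="code-review-agent",
    observed="pull request diff and review instructions",
    decision="Leave a review comment; do not merge.",
    tool_call={"tool": "github", "action": "comment_on_pull_request"},
    allowed=True,
)
print(json.dumps(entry, indent=2))
```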

5. Human Approval for High-Impact Actions

Agents draft. Agents propose. Agents prepare.

But merges, deletions, permission changes, and other irreversible operations still benefit from a human-in-the-loop step.

Think of it less as a restriction and more as a guardrail around intent.
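
As a sketch, an approval gate can be a thin wrapper around tool execution: low-impact actions pass through, high-impact ones pause for a human. The action names and the terminal prompt here are illustrative only.

```python
from typing import Callable

# Actions that should never run without a person signing off. The names are
# illustrative; a real system would map these to specific tool calls.
HIGH_IMPACT_ACTIONS = {"merge_pull_request", "delete_branch", "change_permissions"}

def execute_with_approval(action: str, params: dict, perform: Callable[..., None]) -> str:
    """Run low-impact actions directly; pause high-impact ones for a human.

    `perform` is whatever callable actually carries out the action. The
    terminal prompt is purely for illustration; in practice the approval
    step might be a Slack message, a ticket, or a review queue in a UI.
    """
    if action in HIGH_IMPACT_ACTIONS:
        answer = input(f"Agent wants to run '{action}' with {params}. Approve? [y/N] ")
        if answer.strip().lower() != "y":
            return "rejected: held for human review"
    perform(**params)
    return "executed"
```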


How These Layers Work Together

  • Sandboxing protects the environment
  • Least privilege protects the system
  • Auth protects identity and access
  • Auditability protects understanding
  • Human review protects intent

When these layers align, agents become dramatically safer — not because the models improve, but because the architecture does.


A Collaborative Future for Agent Safety

Docker’s announcement is a positive signal. It reflects a broader shift toward treating agent safety as architecture, not an afterthought. 

Execution sandboxing is an important foundation. But as agents move beyond local, single-user workflows, teams increasingly need controls that sit above the execution layer: user-scoped authorization, tool-level permissions, auditability, and centralized visibility into agent behavior.

One approach we see gaining traction is to centralize those concerns into a dedicated control plane, rather than rebuilding them inside every agent or tool. This is the model behind Arcade, an authorization-first MCP runtime that handles permissions, governance, and visibility across multi-user agents, while remaining independent of where or how those agents execute.

Sandboxing keeps execution safe. Centralized authorization and governance help keep agent behavior aligned as systems scale.


If you’re building multi-user agents and thinking through these layers, you can sign up for Arcade for free to explore an authorization-first MCP runtime →
