AI agents are no longer passive assistants. They write code, call APIs, install packages, and interact with production systems. This shift from passive to active changes not only how useful agents are, but also the entire security question around what they do.
When an agent can only generate text, the worst outcome is a bad answer. When an agent can execute code, the worst outcome is a deleted production database. That happened last month: 9 seconds, no rollback, no recovery.
The question every enterprise team hits sooner or later: how do you safely allow AI agents to execute code and interact with enterprise systems?
In a recent post, we outlined a 6-layer defense-in-depth framework for enhancing the security posture of AI agents on Red Hat AI. This post goes deeper into one of those layers: secure sandboxed execution, and how we validated it across various agent frameworks, APIs, and platforms.
Running agent-generated code directly on developer laptops, shared infrastructure, or unrestricted runtime environments introduces real risk. Enterprises need isolation boundaries, policy enforcement, credential protection, and runtime audit controls before these systems can be trusted in production.
At Red Hat, we have been working on bringing secure sandboxed execution into Red Hat AI through a collaboration with NVIDIA on OpenShell. Today we want to share where we are and what we learned building it.
3 ways to sandbox an agent
After studying more than 20 sandboxing solutions, we found that every approach falls into 1 of 3 modes. Knowing which mode you need matters more than which solution you pick.
Figure 1: 3 modes of agent sandboxing
Mode 1: Sandbox the entire agent. The agent process, all its tool calls, and all code execution run inside a single sandbox boundary. Nothing goes in or out without explicit approval. The agent holds no credentials and reaches external services only through a policy-enforced proxy. This is the right choice for developer laptops and CI/CD pipelines where the host machine holds real credentials that the agent should never see.
Mode 2: Sandbox as the execution environment. The agent's "brain" (reasoning and orchestration) is decoupled from its "hands" (tool execution and code). The platform orchestrates the agent loop and delegates execution to disposable, stateless sandboxes that you control. Credentials are physically separated from the execution environment, injected at the network boundary rather than stored where agent-generated code can reach them. Both the Responses API and Anthropic's Managed Agents follow this pattern, whether the sandbox runs in the provider's cloud or on your own infrastructure through self-hosted environments. This is the right choice for multi-tenant agent platforms and production workloads.
Mode 3: Sandbox only the code execution. The agent logic runs on your infrastructure with full access to credentials and APIs. Only the code it generates gets routed to an isolated environment. Unlike Mode 2, where the platform API owns the sandbox lifecycle, Mode 3 is framework-driven: the agent framework exposes a sandbox extension point (such as the OpenAI Agents SDK sandbox extensions), and the developer wires up a provider, manages sessions, and controls teardown. This is a reasonable starting point for teams adding sandboxing to existing agent deployments, but the agent process itself remains exposed. For enterprise agents with access to production systems, plan to graduate to Mode 1 or Mode 2.
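The three modes can be summarized as a small decision helper. This is only a sketch of the reasoning above, not part of OpenShell or any agent SDK; the `Deployment` fields and the order in which they are checked are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SandboxMode(Enum):
    WHOLE_AGENT = 1            # Mode 1: agent, tools, and code in one boundary
    EXECUTION_ENVIRONMENT = 2  # Mode 2: platform orchestrates, sandbox executes
    CODE_ONLY = 3              # Mode 3: only generated code is isolated


@dataclass(frozen=True)
class Deployment:
    # Hypothetical inputs chosen for this illustration.
    host_credentials_must_stay_hidden: bool  # e.g. developer laptop, CI runner
    platform_owns_agent_loop: bool           # e.g. a managed agent API
    retrofitting_existing_agent: bool        # sandboxing added to a live agent


def suggest_mode(d: Deployment) -> SandboxMode:
    """Map a deployment description to the mode the text recommends."""
    if d.host_credentials_must_stay_hidden:
        return SandboxMode.WHOLE_AGENT
    if d.platform_owns_agent_loop:
        return SandboxMode.EXECUTION_ENVIRONMENT
    if d.retrofitting_existing_agent:
        return SandboxMode.CODE_ONLY
    # With no other signal, fall back to the strictest boundary.
    return SandboxMode.WHOLE_AGENT


laptop = Deployment(
    host_credentials_must_stay_hidden=True,
    platform_owns_agent_loop=False,
    retrofitting_existing_agent=False,
)
print(suggest_mode(laptop).name)
```

A real decision involves more inputs (tenancy, compliance, latency), so treat the helper as a checklist in code form rather than a policy.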
The question for any enterprise platform is whether it can cover all 3. Red Hat AI with OpenShell can, and here is how we validated that.
OpenShell: kernel-enforced agent sandboxing
NVIDIA's OpenShell provides 5 layers of kernel-enforced defense: Landlock file system restrictions, seccomp system call filtering, SELinux mandatory access controls, user namespace isolation, and network namespace isolation. On top of these, it adds per-binary Open Policy Agent (OPA)/Rego network policy and L7 HTTP inspection through Transport Layer Security (TLS) interception.
The process-aware policy and credential isolation are what set this apart. OpenShell applies intent-oriented policies that map down to the process level, tracking which process is making each outbound connection and using SHA-256 verification to build a coherent audit trail. A policy can express "the agent runtime should reach api.github.com, but other processes in the sandbox should not." At the same time, credentials are never stored inside the sandbox. OpenShell's inference routing proxy injects them at the network boundary, so even a compromised agent holds nothing to exfiltrate. The combination significantly reduces the attack surface.
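To make the idea of a per-binary policy concrete, here is a minimal Python sketch. It is not OpenShell's policy format (OpenShell uses OPA/Rego) and does not reflect its internals; the `EgressRule` type, the `decide` function, and the shape of the audit record are all hypothetical. It only illustrates the principle that an egress decision can depend on which executable is connecting, identified by its SHA-256 digest.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    binary_sha256: str             # identity of the executable allowed to connect
    allowed_hosts: frozenset[str]  # hosts that executable may reach


def sha256_of(binary_bytes: bytes) -> str:
    return hashlib.sha256(binary_bytes).hexdigest()


def decide(rules: list[EgressRule], binary_bytes: bytes, host: str) -> dict:
    """Return an audit record for one outbound connection attempt."""
    digest = sha256_of(binary_bytes)
    allowed = any(
        rule.binary_sha256 == digest and host in rule.allowed_hosts
        for rule in rules
    )
    return {"binary_sha256": digest, "host": host, "allowed": allowed}


# Stand-ins for real executables; a real system would hash the binary on disk.
agent_runtime = b"agent-runtime-binary"
other_process = b"some-other-binary"

rules = [EgressRule(sha256_of(agent_runtime), frozenset({"api.github.com"}))]

for binary in (agent_runtime, other_process):
    record = decide(rules, binary, "api.github.com")
    print(record["binary_sha256"][:12], record["host"], record["allowed"])
```

Because every decision produces a record keyed by the binary's digest, the same data structure doubles as an audit trail entry.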
OpenShell's OpenShift driver runs each agent as a Kubernetes pod with full policy enforcement. Deploy it on your cluster, create a sandbox, run your agent inside it. No additional integration required.
OpenShell is planned for integration into Red Hat AI, which would give enterprise teams a secure agent runtime as a native platform capability, the same way Red Hat OpenShift AI provides model serving and inference today.
Putting it to the test: OpenShell across agent platforms, frameworks, and agentic APIs
To validate that OpenShell can serve as a universal sandbox backend, we built reference implementations against the 3 most widely adopted agent APIs. Each maps to a different sandboxing mode, demonstrating that the same enforcement layer adapts to whatever execution model the enterprise chooses.
Anthropic self-hosted sandboxes (Mode 2)
Anthropic's Claude Managed Agents introduced support for self-hosted environments, a Mode 2 pattern. Anthropic hosts the agent orchestration and reasoning layer. The customer hosts the execution environment. Agent-generated tool calls execute inside isolated sandbox runtimes controlled by the customer.
We have a working demo where a self-hosted environment worker polls Anthropic for tasks and executes them inside OpenShell sandboxes. We validated this on both the Podman driver (local development) and the OpenShift driver (Kubernetes cluster), confirming that the same architecture works from laptop to production:
source ~/.ant-env

# Register a self-hosted environment
ant beta:environments create --name self-hosted \
    --config '{"type": "self_hosted"}'

# Create the agent
ant beta:agents create --name secure-agent \
    --model claude-sonnet-4-6

# Start the worker that polls Anthropic for tasks
ant beta:worker poll --workdir /workspace

Each session gets its own OpenShell sandbox. The agent's code, file system, and network egress never leave your infrastructure. The worker spawns a sandbox per session, passes in only the environment key (never your API key), and tears it down when done.
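The per-session lifecycle described above can be sketched with a context manager. This is an in-memory illustration only: `FakeSandbox`, `sandbox_for`, `handle_task`, and the `ENVIRONMENT_KEY` variable name are hypothetical and do not correspond to the OpenShell or Anthropic worker APIs. The point is the shape of the flow: one sandbox per session, only the environment key passed in, and teardown guaranteed even if the task fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class FakeSandbox:
    """Stand-in for a real sandbox handle."""
    session_id: str
    env: dict
    closed: bool = False

    def run(self, command: str) -> str:
        if self.closed:
            raise RuntimeError("sandbox already torn down")
        return f"[{self.session_id}] ran: {command}"


@contextmanager
def sandbox_for(session_id: str, environment_key: str):
    # Only the environment key enters the sandbox, never the account API key.
    sandbox = FakeSandbox(session_id, {"ENVIRONMENT_KEY": environment_key})
    try:
        yield sandbox
    finally:
        # Teardown runs whether the task succeeded or raised.
        sandbox.closed = True


def handle_task(task: dict, environment_key: str) -> str:
    with sandbox_for(task["session_id"], environment_key) as sandbox:
        return sandbox.run(task["command"])


print(handle_task({"session_id": "s1", "command": "pytest -q"}, "env-key"))
```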
Figure 2: Claude Managed Agents self-hosted environment, active
Figure 3: Claude Managed Agents, active agents running claude-sonnet-4-6
Figure 4: Claude Managed Agents sessions, idle sessions linked to test-agent-with-tools
This creates a clean separation between cloud-hosted reasoning and customer-controlled execution, which aligns closely with enterprise requirements for data sovereignty and governance.
Responses API with the Containers API (Mode 2)
The Responses API is quickly becoming the standard agent execution contract. It is not limited to the OpenAI Agents SDK. Codex, LangChain, and a growing number of agent frameworks use it as their primary interface for sandboxed code execution. The API creates sandboxes, executes commands, and feeds output back to the model in a loop. This is a Mode 2 pattern: the platform orchestrates the agent, and the sandbox is the execution environment.
The Responses API implementation in the OGX project is already mature and OpenResponses-compliant, providing the true open source version of this contract. OGX uses vLLM as the inference backend for open source and self-hosted models, so the entire stack from model serving to agent execution runs on infrastructure you control. We are currently implementing the supplementary Containers API with OpenShell as a container provider, adding kernel-enforced isolation to the execution layer. From the developer's perspective, it is one API call:
from openai import OpenAI

# Point the standard OpenAI client at the local OGX endpoint
client = OpenAI(base_url="http://localhost:8321/v1")

response = client.responses.create(
    model="meta-llama/Llama-4-Maverick-17B-128E",
    tools=[{
        "type": "shell",
        "environment": {"type": "container_auto"},
    }],
    input="Analyze the CSV and plot the results.",
)

Under the hood, container_auto creates an OpenShell sandbox, runs the commands inside it, and tears it down when the model finishes. Credentials are injected as domain secrets at the network boundary. The model can call an API but cannot read or exfiltrate the key.
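As a rough illustration of what "injected at the network boundary" means, the sketch below shows a proxy-side function that attaches a secret based on the destination host. The `DOMAIN_SECRETS` table, the `inject_credentials` function, and the choice to strip any `Authorization` header coming from the sandbox are assumptions for this example, not the OpenShell or OGX implementation.

```python
from urllib.parse import urlsplit

# Hypothetical secret store. It lives in the proxy, outside the sandbox.
DOMAIN_SECRETS = {"api.example.com": "s3cr3t-token"}


def inject_credentials(url: str, headers: dict[str, str]) -> dict[str, str]:
    """Return the headers the proxy forwards upstream for this request."""
    host = urlsplit(url).hostname or ""
    # Drop anything the sandboxed code tried to set itself.
    outgoing = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    secret = DOMAIN_SECRETS.get(host)
    if secret is not None:
        outgoing["Authorization"] = f"Bearer {secret}"
    return outgoing


# The sandboxed request carries no key; the proxy adds it only for known hosts.
print(inject_credentials("https://api.example.com/v1/data", {"Accept": "application/json"}))
print(inject_credentials("https://attacker.example.net/", {"Accept": "*/*"}))
```

Code inside the sandbox only ever sees its own request, so there is no key in its memory, files, or environment to leak.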
For persistent workloads, container_reference keeps the sandbox alive across multiple turns. Network policy is a first-class concept: you can create a container with an allowlist of reachable domains or disable outbound access entirely.
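The difference between the two environment types can be modeled in a few lines. This is a toy model, not the Containers API: `Container`, `ContainerStore`, and the `container_id` key are hypothetical names, and an empty allowlist is used here to stand for "outbound access disabled".

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: str
    allowed_domains: frozenset = frozenset()  # empty set: outbound disabled
    history: list = field(default_factory=list)

    def can_reach(self, host: str) -> bool:
        return host in self.allowed_domains

    def run(self, command: str) -> None:
        self.history.append(command)


class ContainerStore:
    def __init__(self) -> None:
        self._containers: dict[str, Container] = {}
        self._ids = itertools.count(1)

    def __contains__(self, container_id: str) -> bool:
        return container_id in self._containers

    def create(self, allowed_domains=()) -> Container:
        container = Container(f"c{next(self._ids)}", frozenset(allowed_domains))
        self._containers[container.container_id] = container
        return container

    def resolve(self, environment: dict) -> tuple[Container, bool]:
        """Return (container, ephemeral) for one model turn."""
        if environment["type"] == "container_auto":
            return self.create(), True
        if environment["type"] == "container_reference":
            return self._containers[environment["container_id"]], False
        raise ValueError(f"unknown environment type: {environment['type']}")

    def finish_turn(self, container: Container, ephemeral: bool) -> None:
        # Ephemeral containers are destroyed; referenced ones persist.
        if ephemeral:
            del self._containers[container.container_id]


store = ContainerStore()
persistent = store.create(allowed_domains={"pypi.org"})
ref = {"type": "container_reference", "container_id": persistent.container_id}
for command in ("pip install pandas", "python analyze.py"):
    container, ephemeral = store.resolve(ref)
    container.run(command)
    store.finish_turn(container, ephemeral)
print(persistent.history, persistent.can_reach("pypi.org"))
```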
The developer writes zero sandbox code. The platform handles the lifecycle. That is the Mode 2 value proposition, and OpenShell provides the enforcement underneath. Because the Responses API implementation is open source through OGX, any agent framework that speaks this contract can get OpenShell isolation on top of vLLM-served models.
OpenAI Agents SDK sandbox extensions (Mode 3)
The OpenAI Agents SDK also ships a sandbox agents feature with pluggable sandbox clients for isolating code execution. This is a Mode 3 pattern: the agent itself runs unsandboxed, but the code it generates executes in an isolated environment.
We built an OpenShell client and contributed it upstream as a sandbox provider that drops in alongside the existing Unix local and Docker backends:
from agents import Runner, RunConfig
from agents.sandbox import SandboxRunConfig  # import path may vary by SDK version
from agents.extensions.sandbox.openshell import (
    OpenShellSandboxClient,
    OpenShellSandboxClientOptions,
)

# `agent` is assumed to be defined elsewhere; `await` needs an async context.
result = await Runner.run(
    agent,
    "Fix the bug and run the tests.",
    run_config=RunConfig(
        sandbox=SandboxRunConfig(
            client=OpenShellSandboxClient(),
            options=OpenShellSandboxClientOptions(
                cluster="my-openshell-cluster",
            ),
        ),
    ),
)

The OpenShell client handles sandbox creation, command execution over gRPC, file I/O, workspace persistence, session resume, and teardown. Every file path is validated against the workspace root before execution. OpenShell enforces its own file system policy inside the sandbox as a second layer.
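Validating a path against a workspace root is a standard technique, and a minimal version looks like the following. This is a generic sketch using `pathlib`, not the code from the OpenShell client; the function name `resolve_in_workspace` and the `/workspace` root are illustrative.

```python
from pathlib import Path


def resolve_in_workspace(workspace_root: str, requested: str) -> Path:
    """Resolve `requested` and reject anything outside `workspace_root`."""
    root = Path(workspace_root).resolve()
    # Joining with an absolute path discards the root, and resolve() collapses
    # ".." segments, so both escape attempts surface in the check below.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{requested!r} escapes the workspace root")
    return candidate


print(resolve_in_workspace("/workspace", "src/app.py"))

for attempt in ("../etc/passwd", "/etc/passwd", "src/../../secret"):
    try:
        resolve_in_workspace("/workspace", attempt)
    except PermissionError as err:
        print("rejected:", err)
```

A check like this guards the client side only. The sandbox's own file system policy still has to hold if the check is bypassed, which is why the second layer matters.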
We validated this end-to-end with self-hosted models (Qwen3-8B, Granite, Llama 3.2) served by vLLM on a Red Hat OpenShift AI cluster, with agents executing inside OpenShell sandboxes on the same infrastructure. No external API calls, no data leaving the environment.
Mode 3 is a reasonable starting point. For enterprise agents with access to production systems, we recommend graduating to Mode 1 or Mode 2 as the deployment matures.
Mode 1: Already works out of the box
OpenShell on Red Hat OpenShift delivers Mode 1 without additional integration. The OpenShift driver creates a sandboxed pod, applies all 5 enforcement layers, and the agent runs fully contained. This is the right default for developer tooling, CI/CD pipelines, and any environment where the host holds credentials that the agent should never see.
The next step for taking agents to production
Historically, AI platforms focused on model serving and inference. AI agents introduce a different challenge: enabling models to safely take action.
That requires a new class of infrastructure, one that prioritizes:
- Secure execution runtimes
- Credential isolation
- Policy enforcement
- Runtime audit controls.
These are not features to add later. They are the requirements that separate a demo from a production deployment. The combination of Kubernetes-native orchestration through Red Hat OpenShift AI and kernel-enforced sandbox isolation through OpenShell gives enterprise teams a foundation for deploying AI agents that are both capable and governable, regardless of which agent platform or sandboxing mode they choose.
We validated this across the 3 most widely adopted agent APIs. The same OpenShell enforcement layer works whether the platform manages the sandbox (Mode 2), the developer manages the sandbox (Mode 3), or the entire agent runs inside one (Mode 1). That flexibility is what makes this a platform capability, not a point solution.
What comes next
The reference implementations described in this post are early validations, not yet shipping product features. They represent what we learned by plugging OpenShell into every major agent platform and confirming that the same enforcement layer works across all 3 sandboxing modes (code is available here for reference).
What we are working toward is making secure agent execution a native capability of Red Hat AI, so that enterprise teams do not have to build this integration themselves. The patterns are proven. The work to productize them is underway.
In the meantime, start by auditing your agent's sphere of influence. What can it access? Credentials, databases, internal services? The list is always longer than you expect. Understanding your mode is the first step toward choosing the right isolation boundary.
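One low-effort way to start such an audit is to look at what the agent's process environment already exposes. The sketch below is only a starting point under loose assumptions: the name patterns and the two report categories are arbitrary choices for illustration, and a real audit would also cover mounted files, cloud metadata endpoints, and network reachability.

```python
import re
from collections.abc import Mapping

# Heuristic patterns, chosen for illustration only.
SECRET_NAME = re.compile(r"TOKEN|SECRET|PASSWORD|API_?KEY|CREDENTIAL", re.IGNORECASE)
ENDPOINT_VALUE = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def audit_environment(env: Mapping[str, str]) -> dict[str, list[str]]:
    """List variable names that look like credentials or service endpoints."""
    report: dict[str, list[str]] = {"credentials": [], "endpoints": []}
    for name, value in sorted(env.items()):
        if SECRET_NAME.search(name):
            report["credentials"].append(name)  # report names only, never values
        elif ENDPOINT_VALUE.match(value):
            report["endpoints"].append(name)
    return report


sample = {
    "GITHUB_TOKEN": "example-value",
    "DATABASE_URL": "postgres://db.internal/prod",
    "HOME": "/home/dev",
}
print(audit_environment(sample))
```

Passing `os.environ` instead of `sample` runs the same check against the environment the agent would actually inherit.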
Whichever agent framework, runtime, or harness you adopt, and wherever you deploy, a secure execution environment should not be an afterthought. With Red Hat AI and OpenShell, it won't be.
About the authors
Adel Zaalouk is a product manager at Red Hat who enjoys blending business and technology to achieve meaningful outcomes. He has experience working in research and industry, and he's passionate about Red Hat OpenShift, cloud, AI and cloud-native technologies. He's interested in how businesses use OpenShift to solve problems, from helping them get started with containerization to scaling their applications to meet demand.
Derek Carr is a Senior Distinguished Engineer at Red Hat. He has worked in the Kubernetes and related cloud-native open source communities since 2014.
Mrunal is a Distinguished Engineer at Red Hat who has worked on containers, Kubernetes, and open source since 2011.
Joe Fernandes is Vice President and General Manager of the Artificial Intelligence (AI) Business Unit at Red Hat, where he leads product management, product marketing, and technical marketing for Red Hat's AI platforms, including Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI.