Execute code in isolated Linux containers with configurable resource limits, network controls, and execution timeouts.
Why Sandboxes?
Agents that reason about code need somewhere safe to execute it. Whether generating Python scripts, validating builds, or running user-provided code, you can't let arbitrary execution happen on your infrastructure.
The pattern keeps repeating: spin up a secure environment, run code, tear it down. Without proper isolation, a single bad script could access sensitive data, exhaust resources, or compromise your systems.
Agentuity sandboxes handle this automatically. Every execution runs in an isolated container with its own filesystem, configurable resource limits, and network controls. When execution completes, the environment is destroyed.
What this gives you:
- Security by default: Network disabled, filesystem isolated, resource limits enforced
- No infrastructure management: Containers spin up and tear down automatically
- Multi-language support: Run Python, Node.js, shell scripts, and more
- Consistent environments: Use snapshots to get the same setup every time, with dependencies pre-installed
Three Ways to Use Sandboxes
| Method | Best For |
|---|---|
| Web App | Visual management, browsing runtimes and snapshots |
| SDK | Programmatic use in agents and routes (ctx.sandbox) |
| CLI | Local development, scripting, CI/CD |
Your agents are written in TypeScript, but the sandbox can run any language safely. Use ctx.sandbox.run() to execute Python, Node.js, shell scripts, or anything else you can install with apt, all inside isolated containers.
Key Concepts
| Concept | Description |
|---|---|
| Runtime | A pre-configured base environment (OS + language tools) provided by Agentuity |
| Sandbox | A running container created from a runtime where you execute commands |
| Snapshot | A saved sandbox state that can be used to create new sandboxes |
These build on each other: Runtime → Sandbox → Snapshot. Here's an example of how to use all three:
- Pick a runtime (e.g., bun:1 or node:latest)
- Create a sandbox from that runtime
- Optionally save a snapshot to reuse your configured environment
Runtimes
Runtimes are pre-configured base environments that Agentuity provides. Each includes an operating system, language toolchain, and common utilities.
Language Runtimes
Use these for general code execution:
| Runtime | Description |
|---|---|
| base:latest | Minimal Debian runtime with essential tools (default) |
| bun:1 | Bun 1.x with JavaScript/TypeScript support |
| node:latest | Node.js latest version |
| node:lts | Node.js LTS version |
| python:3.13 | Python 3.13 |
| python:3.14 | Python 3.14 |
| golang:latest | Golang latest |
Agent Runtimes
Pre-configured AI coding assistants:
| Runtime | Description |
|---|---|
| claude-code:latest | Claude Code AI assistant |
| amp:latest | Amp AI coding assistant |
| codex:latest | OpenAI Codex CLI agent |
| gemini-cli:latest | Google Gemini CLI agent |
| opencode:latest | OpenCode AI coding assistant |
| agentuity:latest | Agentuity CLI for building and running AI agents |
Testing Runtimes
Pre-configured testing runtimes:
| Runtime | Description |
|---|---|
| agent-browser:latest | Headless browser automation CLI for AI agents |
| playwright:v1 | Playwright browser automation runtime (Chrome, Firefox, WebKit) |
Run agentuity cloud sandbox runtime list to see all available runtimes, or view them in the Web App under Services > Sandbox > Runtimes.
Runtime Metadata
Each runtime includes metadata for identification and resource planning:
| Field | Description |
|---|---|
| description | What the runtime provides |
| icon | URL to runtime icon |
| brandColor | Hex color for UI display |
| documentationUrl | Link to runtime documentation |
| tags | Categories like language, ai-agent |
| requirements | Minimum memory, CPU, disk, and network needs |
View runtime details with agentuity cloud sandbox runtime list --json.
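The metadata fields above lend themselves to simple filtering once the JSON is parsed. The following is only a sketch: the `RuntimeInfo` interface, its `name` field, and all concrete types are assumptions made for illustration, and the actual shape of the `--json` output may differ.

```typescript
// Assumed shape of one runtime entry. Field names other than `name` come from
// the metadata table; the types and the `name` field itself are guesses.
interface RuntimeInfo {
  name: string; // hypothetical identifier such as 'python:3.14'
  description: string;
  icon?: string;
  brandColor?: string;
  documentationUrl?: string;
  tags: string[];
  requirements?: Record<string, unknown>; // nesting is unknown, kept opaque
}

// Return the names of all runtimes that carry the given tag.
function runtimesWithTag(runtimes: RuntimeInfo[], tag: string): string[] {
  return runtimes.filter((r) => r.tags.includes(tag)).map((r) => r.name);
}

// Illustrative sample data standing in for parsed CLI output.
const sampleRuntimes: RuntimeInfo[] = [
  { name: 'python:3.14', description: 'Python 3.14', tags: ['language'] },
  { name: 'claude-code:latest', description: 'Claude Code AI assistant', tags: ['ai-agent'] },
];

const languageRuntimes = runtimesWithTag(sampleRuntimes, 'language');
```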
Snapshots
A snapshot captures the filesystem state of a sandbox. You create new sandboxes from a snapshot rather than running it directly.
Snapshots build on top of runtimes. When you create a snapshot, it includes everything from the base runtime plus your installed dependencies and files.
Workflow:
- Create a sandbox from a runtime
- Install dependencies and configure the environment
- Save a snapshot
- Create new sandboxes from that snapshot (fast, no reinstallation needed)
See Creating and Using Snapshots for details.
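As a rough illustration of the last workflow step, a small helper can choose between starting from a snapshot and starting from a bare runtime. The `optionsFor` helper and the `CreateOptions` interface are hypothetical; only the `runtime`, `snapshot`, and `dependencies` option names come from this page, and whether `runtime` and `snapshot` may be combined is not stated here.

```typescript
// Subset of the create options described on this page (local stand-in type).
interface CreateOptions {
  runtime?: string;
  snapshot?: string;
  dependencies?: string[];
}

// Hypothetical helper: prefer a snapshot when one is known, otherwise fall
// back to a runtime plus the apt packages that must be installed.
function optionsFor(
  snapshot: string | undefined,
  runtime: string,
  dependencies: string[],
): CreateOptions {
  if (snapshot) {
    // The snapshot already contains the runtime and installed dependencies.
    return { snapshot };
  }
  return { runtime, dependencies };
}

const fromSnapshot = optionsFor('my-env', 'python:3.14', ['git']);
const fromRuntime = optionsFor(undefined, 'python:3.14', ['git']);
```

The resulting object would then be passed to ctx.sandbox.create(), assuming it accepts these options as listed under Configuration Options.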
Two Execution Modes
Choose based on your use case:
One-shot (sandbox.run())
Creates a sandbox, runs a single command, then destroys the sandbox. Best for stateless code execution.
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('CodeRunner', {
  handler: async (ctx, input) => {
    const result = await ctx.sandbox.run({
      command: { exec: ['python3', '-c', 'print("Hello!")'] },
      resources: { memory: '256Mi', cpu: '500m' },
    });
    ctx.logger.info('Output', { stdout: result.stdout, exitCode: result.exitCode });
    return { output: result.stdout, exitCode: result.exitCode };
  },
});

Interactive (sandbox.create())
Creates a persistent sandbox for multiple commands. Best for stateful workflows like dependency installation.
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('ProjectBuilder', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      resources: { memory: '1Gi' },
      network: { enabled: true }, // Required for package installation
    });
    try {
      await sandbox.execute({ command: ['npm', 'install'] });
      await sandbox.execute({ command: ['npm', 'run', 'build'] });
      return { success: true };
    } finally {
      await sandbox.destroy();
    }
  },
});

Background Jobs
Jobs let you run long-running commands in a sandbox without blocking. Unlike regular execution, jobs:
- Run in parallel: Multiple jobs can execute simultaneously
- Don't block: Control returns immediately after creation
- Persist: Jobs continue even after the creating request completes
- Capture output: Stdout/stderr are captured to streams for later retrieval
Creating Jobs
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('BuildRunner', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      resources: { memory: '2Gi' },
      network: { enabled: true },
    });
    // Create a background job
    const job = await sandbox.jobCreate({
      command: ['npm', 'run', 'build'],
    });
    ctx.logger.info('Build started', { jobId: job.jobId });
    // Check status later
    const status = await sandbox.jobGet({ jobId: job.jobId });
    if (status.status === 'completed') {
      ctx.logger.info('Build succeeded', { exitCode: status.exitCode });
    }
    return { jobId: job.jobId };
  },
});

Job Lifecycle
| Status | Description |
|---|---|
| pending | Job created, waiting to start |
| running | Job actively executing |
| completed | Finished with exit code 0 |
| failed | Finished with non-zero exit code |
| cancelled | Terminated by user request |
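Because a job starts in pending and ends in one of three terminal states, a caller that needs the result has to poll. The sketch below shows one way to do that. `waitForJob`, `JobReader`, and `JobInfo` are hypothetical names, and the return shape of jobGet is assumed from the example above rather than taken from the SDK types.

```typescript
type JobStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Assumed result shape: only `status` and `exitCode` appear on this page.
interface JobInfo {
  status: JobStatus;
  exitCode?: number;
}

// Minimal slice of the sandbox object needed for polling (local stand-in).
interface JobReader {
  jobGet(args: { jobId: string }): Promise<JobInfo>;
}

const TERMINAL: ReadonlySet<JobStatus> = new Set<JobStatus>([
  'completed',
  'failed',
  'cancelled',
]);

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the job reaches a terminal status or the attempt budget runs out.
async function waitForJob(
  sandbox: JobReader,
  jobId: string,
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<JobInfo> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const info = await sandbox.jobGet({ jobId });
    if (TERMINAL.has(info.status)) return info;
    await sleep(intervalMs);
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

A fixed interval keeps the sketch short; a long-running build might warrant a longer interval or a backoff between checks.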
Stopping Jobs
// Graceful stop (SIGTERM, then SIGKILL after grace period)
await sandbox.jobStop({ jobId: job.jobId });

// Force kill immediately
await sandbox.jobStop({ jobId: job.jobId, force: true });

Use Cases
| Use Case | Example |
|---|---|
| Build processes | Run npm run build in background |
| Long-running tests | Execute test suites without blocking |
| Data processing | Process large files asynchronously |
| Service daemons | Run background services in sandbox |
SDK Access
| Context | Access |
|---|---|
| Agents | ctx.sandbox |
| Routes | c.var.sandbox |
The API is identical in both contexts.
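Since both contexts expose the same API, shared logic can take the sandbox service as a parameter instead of depending on the agent or route context. The helper below is a sketch of that idea. `withSandbox`, `SandboxService`, and `SandboxHandle` are local, hypothetical stand-ins and not types exported by the SDK.

```typescript
// Minimal slices of the sandbox API used on this page (assumed shapes).
interface SandboxHandle {
  execute(args: { command: string[] }): Promise<unknown>;
  destroy(): Promise<void>;
}

interface SandboxService {
  create(options?: Record<string, unknown>): Promise<SandboxHandle>;
}

// Create a sandbox, hand it to `fn`, and always destroy it afterwards,
// even when `fn` throws.
async function withSandbox<T>(
  service: SandboxService,
  options: Record<string, unknown>,
  fn: (sandbox: SandboxHandle) => Promise<T>,
): Promise<T> {
  const sandbox = await service.create(options);
  try {
    return await fn(sandbox);
  } finally {
    await sandbox.destroy();
  }
}
```

In an agent handler the first argument would be ctx.sandbox, and in a route it would be c.var.sandbox, assuming both satisfy the two methods used here.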
Configuration Options
| Option | Description | Example |
|---|---|---|
| runtime | Runtime environment | 'bun:1', 'python:3.14' |
| resources.memory | Memory limit (Kubernetes-style) | '512Mi', '1Gi' |
| resources.cpu | CPU limit in millicores | '500m', '1000m' |
| resources.disk | Disk space limit | '1Gi' |
| network.enabled | Allow outbound network | true (default: false) |
| network.port | Port to expose to internet (1024-65535) | 3000 |
| projectId | Associate sandbox with a project | 'proj_abc123' |
| timeout.idle | Idle timeout before cleanup | '10m', '1h' |
| timeout.execution | Max execution time per command | '5m', '30s' |
| dependencies | Apt packages to install | ['python3', 'git'] |
| env | Environment variables | { NODE_ENV: 'test' } |
| snapshot | Create from existing snapshot | 'my-env' or 'snp_abc123' |
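A few of these options have documented constraints that can be checked before a sandbox is requested. The sketch below is a hypothetical pre-flight check: the `SandboxOptions` interface is a local stand-in built from the table, and the string patterns cover only the forms shown in the Example column, so the platform may well accept values this check rejects.

```typescript
// Local stand-in for the options in the table above (not the SDK's type).
interface SandboxOptions {
  runtime?: string;
  resources?: { memory?: string; cpu?: string; disk?: string };
  network?: { enabled?: boolean; port?: number };
  timeout?: { idle?: string; execution?: string };
  dependencies?: string[];
  env?: Record<string, string>;
  snapshot?: string;
  projectId?: string;
}

// Collect human-readable problems instead of throwing on the first one.
function checkOptions(opts: SandboxOptions): string[] {
  const problems: string[] = [];

  // The port range 1024-65535 is stated in the table.
  const port = opts.network?.port;
  if (port !== undefined && (!Number.isInteger(port) || port < 1024 || port > 65535)) {
    problems.push(`network.port must be an integer in 1024-65535, got ${port}`);
  }

  // Assumption: CPU is written as millicores, as in '500m'.
  const cpu = opts.resources?.cpu;
  if (cpu !== undefined && !/^\d+m$/.test(cpu)) {
    problems.push(`resources.cpu should look like '500m', got '${cpu}'`);
  }

  // Assumption: memory and disk use Mi/Gi suffixes, as in '512Mi' or '1Gi'.
  for (const key of ['memory', 'disk'] as const) {
    const value = opts.resources?.[key];
    if (value !== undefined && !/^\d+(Mi|Gi)$/.test(value)) {
      problems.push(`resources.${key} should look like '512Mi' or '1Gi', got '${value}'`);
    }
  }

  return problems;
}
```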
Sandbox Events
Every sandbox records lifecycle events as it transitions through states. Use sandboxEventList to retrieve these events for auditing or debugging.
sandboxEventList is a server-side function from @agentuity/server. It requires an APIClient instance, typically used in CLI tools or standalone scripts rather than agent handlers.
import { sandboxEventList } from '@agentuity/server';

// `client` is an existing APIClient instance (its construction is not shown here)
const { events } = await sandboxEventList(client, {
  sandboxId: 'sbx_abc123',
  limit: 50, // optional, default 50
  direction: 'asc', // optional: 'asc' (oldest first, default) or 'desc'
});

Each event includes:
| Field | Description |
|---|---|
| eventId | Unique identifier for the event |
| sandboxId | ID of the sandbox |
| type | Event type (e.g., create, destroy, lifecycle:started) |
| event | Arbitrary payload data for the event |
| createdAt | ISO timestamp when the event was recorded |
From the CLI, use agentuity cloud sandbox events <sandbox-id> to list events. See CLI Commands for options.
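For auditing, a list of events can be reduced to a short summary. The following sketch assumes the five fields above map onto a `SandboxEvent` interface with string IDs and an ISO 8601 `createdAt`. The interface, the helper names, and the sample values are illustrative and not taken from the SDK.

```typescript
// Assumed event shape, based on the field table above.
interface SandboxEvent {
  eventId: string;
  sandboxId: string;
  type: string;
  event: unknown; // arbitrary payload, left opaque
  createdAt: string; // ISO timestamp
}

// Count how many events of each type were recorded.
function countByType(events: SandboxEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    counts.set(e.type, (counts.get(e.type) ?? 0) + 1);
  }
  return counts;
}

// Milliseconds between the earliest and latest event (0 for an empty list).
function spanMs(events: SandboxEvent[]): number {
  if (events.length === 0) return 0;
  const times = events.map((e) => Date.parse(e.createdAt));
  return Math.max(...times) - Math.min(...times);
}

// Made-up sample data for illustration only.
const sampleEvents: SandboxEvent[] = [
  { eventId: 'evt_1', sandboxId: 'sbx_abc123', type: 'create', event: {}, createdAt: '2025-01-01T00:00:00Z' },
  { eventId: 'evt_2', sandboxId: 'sbx_abc123', type: 'lifecycle:started', event: {}, createdAt: '2025-01-01T00:00:02Z' },
  { eventId: 'evt_3', sandboxId: 'sbx_abc123', type: 'destroy', event: {}, createdAt: '2025-01-01T00:01:00Z' },
];

const eventCounts = countByType(sampleEvents);
const lifetimeMs = spanMs(sampleEvents);
```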
Auto-Resume on Execute
When you call sandbox.execute() on a suspended sandbox, the sandbox automatically resumes before running the command. The response includes autoResumed: true so you can detect and log this behavior.
const execution = await sandbox.execute({ command: ['bun', 'run', 'test'] });
if (execution.autoResumed) {
  ctx.logger.info('Sandbox was resumed automatically before execution');
}

Auto-resume is transparent: the command runs normally after the sandbox wakes up. Use explicit sandbox.resume() (or agentuity cloud sandbox resume) if you need the sandbox ready before issuing multiple commands.
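The explicit-resume approach for a batch of commands might look like the sketch below. `runBatch` and `ResumableSandbox` are hypothetical, the return type of resume() is assumed, and the sketch presumes resume() is called on a sandbox you know to be suspended (or that calling it on a running sandbox is harmless, which this page does not confirm).

```typescript
// Minimal slice of the sandbox object used here (assumed shapes).
interface ResumableSandbox {
  resume(): Promise<void>;
  execute(args: { command: string[] }): Promise<{ autoResumed?: boolean }>;
}

// Wake the sandbox once, run every command in order, and report how many
// executions still had to auto-resume (e.g. after an idle suspension mid-batch).
async function runBatch(sandbox: ResumableSandbox, commands: string[][]): Promise<number> {
  await sandbox.resume();
  let autoResumes = 0;
  for (const command of commands) {
    const execution = await sandbox.execute({ command });
    if (execution.autoResumed) autoResumes++;
  }
  return autoResumes;
}
```

A non-zero return value would suggest the sandbox was suspended again between commands, which may be worth logging.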
When to Use Sandbox
| Use Case | Example |
|---|---|
| Code execution agents | Run user-provided Python/JavaScript safely |
| Code validation | Verify generated code compiles and runs |
| AI coding assistants | Execute code suggested by LLMs |
| Automated testing | Run tests in clean environments |
| Build systems | Compile projects in isolated containers |
Security
Sandboxes provide isolation through:
- Network disabled by default: Enable explicitly when needed
- Resource limits: Prevent resource exhaustion
- Execution timeouts: Prevent runaway processes
- Filesystem isolation: Each sandbox has its own workspace
Next Steps
- SDK Usage: Detailed API for file I/O, streaming, and advanced configuration
- Snapshots: Skip dependency installation with pre-configured environments
- CLI Commands: Debug sandboxes and create snapshots manually