Execute code in isolated Linux containers with configurable resource limits, network controls, and execution timeouts.
Why Sandboxes?
Agents that reason about code need somewhere safe to execute it. Whether generating Python scripts, validating builds, or running user-provided code, you can't let arbitrary execution happen on your infrastructure.
The pattern keeps repeating: spin up a secure environment, run code, tear it down. Without proper isolation, a single bad script could access sensitive data, exhaust resources, or compromise your systems.
Agentuity sandboxes handle the isolation layer. One-shot runs create a sandbox, execute a command, and destroy it. Interactive sandboxes keep their filesystem until you destroy them or the idle timeout reaps them.
What this gives you:
- Security by default: Network disabled, filesystem isolated, resource limits enforced
- No infrastructure management: Containers spin up and tear down automatically
- Multi-language support: Run Python, Node.js, shell scripts, and more
- Consistent environments: Use snapshots to get the same setup every time, with dependencies pre-installed
Three Ways to Use Sandboxes
| Method | Best For |
|---|---|
| Web App | Visual management, browsing runtimes and snapshots |
| SDK | Programmatic use in agents and routes (ctx.sandbox) |
| CLI | Local development, scripting, CI/CD |
Your agents are written in TypeScript, but the sandbox can run any language safely. Use ctx.sandbox.run() to execute Python, Node.js, shell scripts, or anything you can install with apt, all inside isolated containers.
Key Concepts
| Concept | Description |
|---|---|
| Runtime | A pre-configured base environment (OS + language tools) provided by Agentuity |
| Sandbox | A running container created from a runtime where you execute commands |
| Snapshot | A saved sandbox state that can be used to create new sandboxes |
| Checkpoint | A saved filesystem state for one sandbox, used by pause/resume and restore workflows |
Runtimes, sandboxes, and snapshots build on each other: Runtime → Sandbox → Snapshot. Checkpoints are sandbox-scoped: you restore the same sandbox back to a saved filesystem state instead of creating a reusable base image.
- Pick a runtime (e.g., bun:1 or node:latest)
- Create a sandbox from that runtime
- Optionally save a snapshot to reuse your configured environment
Runtimes
Runtimes are pre-configured base environments that Agentuity provides. Each includes an operating system, language toolchain, and common utilities.
Language Runtimes
Use these for general code execution:
| Runtime | Description |
|---|---|
| base:latest | Minimal base runtime with essential tools (default) |
| bun:1 | Bun 1.x with JavaScript/TypeScript support |
| node:latest | Node.js latest version |
| node:lts | Node.js LTS version |
| python:3.13 | Python 3.13 with uv package manager |
| python:3.14 | Python 3.14 with uv package manager |
Agent Runtimes
Pre-configured AI coding assistants:
| Runtime | Description |
|---|---|
| claude-code:latest | Claude Code AI assistant |
| amp:latest | Amp AI coding assistant |
| opencode:latest | OpenCode AI coding assistant |
Run agentuity cloud sandbox runtime list to see all available runtimes, or view them in the Web App under Services > Sandbox > Runtimes.
Runtime Metadata
Each runtime includes metadata for identification and resource planning:
| Field | Description |
|---|---|
| description | What the runtime provides |
| iconUrl | URL to runtime icon |
| brandColor | Hex color for UI display |
| url | Link to runtime documentation or homepage |
| tags | Categories like language, testing, agent |
| requirements | Minimum memory, CPU, disk, and networkEnabled requirements |
View runtime details with agentuity cloud sandbox runtime list --json.
Snapshots
A snapshot captures the filesystem state of a sandbox. You create new sandboxes from a snapshot rather than running it directly.
Snapshots build on top of runtimes. When you create a snapshot, it includes everything from the base runtime plus your installed dependencies and files.
Workflow:
- Create a sandbox from a runtime
- Install dependencies and configure the environment
- Save a snapshot
- Create new sandboxes from that snapshot (fast, no reinstallation needed)
See Creating and Using Snapshots for details.
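The last step of the workflow above could look roughly like the following. This is a minimal sketch, not the SDK's own code: SandboxLike, SandboxServiceLike, and runFromSnapshot are hypothetical names declared here so the example is self-contained, and they mirror only the create, execute, and destroy calls and the snapshot option shown elsewhere on this page.

```typescript
// Hypothetical structural types: they declare only the members this sketch
// uses. An object such as ctx.sandbox is assumed (not verified) to fit them.
interface SandboxLike {
  execute(options: { command: string[] }): Promise<{ exitCode: number }>;
  destroy(): Promise<unknown>;
}

interface SandboxServiceLike {
  create(options: { snapshot: string }): Promise<SandboxLike>;
}

// Create a sandbox from a saved snapshot, run one command, and always clean up.
async function runFromSnapshot(
  service: SandboxServiceLike,
  snapshot: string,
  command: string[],
): Promise<number> {
  const sandbox = await service.create({ snapshot });
  try {
    const result = await sandbox.execute({ command });
    return result.exitCode;
  } finally {
    // Destroy the sandbox even if the command throws.
    await sandbox.destroy();
  }
}
```

Inside an agent handler this might be called as runFromSnapshot(ctx.sandbox, 'my-env', ['python3', 'main.py']), where my-env is a placeholder snapshot name and main.py is a placeholder script assumed to exist in the snapshot.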
Two Execution Modes
Choose based on your use case:
One-shot (sandbox.run())
Creates a sandbox, runs a single command, then destroys the sandbox. Best for stateless code execution.
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('CodeRunner', {
  handler: async (ctx, input) => {
    const result = await ctx.sandbox.run({
      command: { exec: ['python3', '-c', 'print("Hello!")'] },
      resources: { memory: '256Mi', cpu: '500m' },
    });
    ctx.logger.info('Output', { stdout: result.stdout, exitCode: result.exitCode });
    return { output: result.stdout, exitCode: result.exitCode };
  },
});

Interactive (sandbox.create())
Creates a persistent sandbox for multiple commands. Best for stateful workflows like dependency installation.
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('ProjectBuilder', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      runtime: 'node:lts',
      resources: { memory: '1Gi' },
      network: { enabled: true }, // Required for package installation
    });
    try {
      await sandbox.execute({ command: ['npm', 'init', '-y'] });
      await sandbox.execute({ command: ['npm', 'install', 'zod'] });
      const result = await sandbox.execute({
        command: ['node', '-e', 'console.log("ready")'],
      });
      return { exitCode: result.exitCode };
    } finally {
      // Always release the sandbox, even if a command throws
      await sandbox.destroy();
    }
  },
});

Background Jobs
Jobs let you run long-running commands in a sandbox without blocking. Unlike regular execution, jobs:
- Run in parallel: Multiple jobs can execute simultaneously
- Don't block: Control returns immediately after creation
- Persist: Jobs continue even after the creating request completes
- Capture output: Stdout/stderr are captured to streams for later retrieval
Creating Jobs
import { createAgent } from '@agentuity/runtime';

const agent = createAgent('JobRunner', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      runtime: 'node:lts',
      resources: { memory: '2Gi' },
      network: { enabled: true },
    });

    // Create a background job
    const job = await sandbox.createJob({
      command: ['sh', '-c', 'sleep 30 && echo done'],
    });
    ctx.logger.info('Job started', { jobId: job.jobId });

    // Check status later
    const status = await sandbox.getJob(job.jobId);
    if (status.status === 'completed') {
      ctx.logger.info('Job completed', { exitCode: status.exitCode });
    }
    return { jobId: job.jobId };
  },
});

Job Lifecycle
| Status | Description |
|---|---|
| pending | Job created, waiting to start |
| running | Job actively executing |
| completed | Finished with exit code 0 |
| failed | Finished with non-zero exit code |
| cancelled | Terminated by user request |
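Because a job keeps running after it is created, callers usually need to poll until it reaches one of the finished states in the table. The sketch below shows one way to do that. JobInfo, JobSource, isTerminal, and waitForJob are hypothetical names introduced for illustration; only the getJob call, the status values, and the exitCode field come from this page, and the polling interval and poll limit are arbitrary defaults.

```typescript
// Status values are taken from the lifecycle table.
type JobStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Hypothetical shapes: only the fields this sketch reads are declared.
interface JobInfo {
  status: JobStatus;
  exitCode?: number;
}

interface JobSource {
  getJob(jobId: string): Promise<JobInfo>;
}

// A job is finished once it is completed, failed, or cancelled.
function isTerminal(status: JobStatus): boolean {
  return status === 'completed' || status === 'failed' || status === 'cancelled';
}

// Poll getJob until the job reaches a terminal status or the poll budget runs out.
async function waitForJob(
  sandbox: JobSource,
  jobId: string,
  intervalMs: number = 1000,
  maxPolls: number = 60,
): Promise<JobInfo> {
  for (let attempt = 0; attempt < maxPolls; attempt++) {
    const job = await sandbox.getJob(jobId);
    if (isTerminal(job.status)) {
      return job;
    }
    // Wait before asking again so the loop does not hammer the API.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Job ' + jobId + ' still not finished after ' + maxPolls + ' polls');
}
```

A bounded poll count is a deliberate choice here: the handler fails loudly instead of waiting forever on a job that never finishes.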
Stopping Jobs
// Graceful stop (SIGTERM, then SIGKILL after grace period)
await sandbox.stopJob(job.jobId);

// Force kill immediately
await sandbox.stopJob(job.jobId, true);

Use Cases
| Use Case | Example |
|---|---|
| Build processes | Run npm run build in background |
| Long-running tests | Execute test suites without blocking |
| Data processing | Process large files asynchronously |
| Service daemons | Run background services in sandbox |
SDK Access
| Context | Access |
|---|---|
| Agents | ctx.sandbox |
| Routes | c.var.sandbox |
The API is identical in both contexts.
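Since the API is the same in both places, sandbox logic can live in a plain helper that receives the sandbox service as a parameter. The following is a sketch under that assumption. RunResult, OneShotRunner, and runPython are hypothetical names, and the interface declares only the run() options and result fields that appear in the one-shot example on this page.

```typescript
// Hypothetical structural types covering only the run() call used on this page.
interface RunResult {
  stdout: string;
  exitCode: number;
}

interface OneShotRunner {
  run(options: {
    command: { exec: string[] };
    resources?: { memory?: string; cpu?: string };
  }): Promise<RunResult>;
}

// Run a Python snippet in a one-shot sandbox and return its stdout.
// Throws when the interpreter exits with a non-zero code.
async function runPython(sandbox: OneShotRunner, source: string): Promise<string> {
  const result = await sandbox.run({
    command: { exec: ['python3', '-c', source] },
    resources: { memory: '256Mi', cpu: '500m' },
  });
  if (result.exitCode !== 0) {
    throw new Error('python3 exited with code ' + result.exitCode);
  }
  return result.stdout;
}
```

An agent handler would pass ctx.sandbox and a route handler would pass c.var.sandbox; the helper itself does not change.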
Configuration Options
| Option | Description | Example |
|---|---|---|
| runtime | Runtime environment | 'bun:1', 'python:3.14' |
| resources.memory | Memory limit (Kubernetes-style) | '512Mi', '1Gi' |
| resources.cpu | CPU limit in millicores | '500m', '1000m' |
| resources.disk | Disk space limit | '1Gi' |
| network.enabled | Allow outbound network | true (default: false) |
| network.port | Port to expose to internet (1024-65535) | 3000 |
| projectId | Associate sandbox with a project | 'proj_abc123' |
| timeout.idle | Idle timeout before cleanup | '10m', '1h' |
| timeout.execution | Max execution time per command | '5m', '30s' |
| dependencies | Apt packages to install | ['python3', 'git'] |
| packages | npm/bun packages to install globally | ['typescript', 'tsx'] |
| env | Environment variables | { NODE_ENV: 'test' } |
| snapshot | Create from existing snapshot | 'my-env' or 'snp_abc123' |
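The string formats in the table can be easier to reason about as plain numbers. The helpers below are an illustrative sketch, not part of the SDK: all function names are hypothetical, and they only recognize the suffixes that appear in the examples above (Mi and Gi for memory, m for CPU millicores, and s, m, h for timeouts). Whether the platform accepts other suffixes is not stated here.

```typescript
// Byte multipliers for the Kubernetes-style binary suffixes in the examples.
const MEMORY_UNITS: Record<string, number> = {
  Mi: 1024 * 1024,
  Gi: 1024 * 1024 * 1024,
};

// Millisecond multipliers for the timeout suffixes in the examples.
const DURATION_UNITS: Record<string, number> = {
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
};

// Split "<digits><suffix>" and scale by the matching unit; reject anything else.
function parseWithUnits(value: string, units: Record<string, number>, label: string): number {
  const pattern = new RegExp('^(\\d+)(' + Object.keys(units).join('|') + ')$');
  const match = pattern.exec(value);
  const factor = match ? units[match[2] ?? ''] : undefined;
  if (!match || factor === undefined) {
    throw new Error('Unrecognized ' + label + ': ' + value);
  }
  return Number(match[1]) * factor;
}

// '512Mi' -> bytes
function memoryToBytes(value: string): number {
  return parseWithUnits(value, MEMORY_UNITS, 'memory value');
}

// '500m' (millicores) -> fraction of one CPU core
function cpuToCores(value: string): number {
  return parseWithUnits(value, { m: 1 }, 'CPU value') / 1000;
}

// '10m' -> milliseconds
function durationToMs(value: string): number {
  return parseWithUnits(value, DURATION_UNITS, 'duration');
}

// network.port must be a whole number in the documented 1024-65535 range.
function isExposablePort(port: number): boolean {
  return Number.isInteger(port) && port >= 1024 && port <= 65535;
}

console.log(memoryToBytes('512Mi')); // 536870912
console.log(cpuToCores('500m')); // 0.5
console.log(durationToMs('10m')); // 600000
console.log(isExposablePort(3000)); // true
```

Validating values like these before calling create() gives an earlier and more readable error than a rejected request, though the platform remains the final authority on what it accepts.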
Sandbox Events
Every sandbox records lifecycle events as it transitions through states. Use sandboxEventList to retrieve these events for auditing or debugging.
sandboxEventList is a server-side function from @agentuity/server. It requires an APIClient instance, so it is typically used in CLI tools or standalone scripts rather than agent handlers.
import { sandboxEventList } from '@agentuity/server';

// client is an APIClient instance created elsewhere
const { events } = await sandboxEventList(client, {
  sandboxId: 'sbx_abc123',
  limit: 50, // optional, default 50
  direction: 'asc', // optional: 'asc' (oldest first, default) or 'desc'
});

Each event includes:
| Field | Description |
|---|---|
| eventId | Unique identifier for the event |
| sandboxId | ID of the sandbox |
| type | Event type (e.g., create, destroy, lifecycle:started) |
| event | Arbitrary payload data for the event |
| createdAt | ISO timestamp when the event was recorded |
From the CLI, use agentuity cloud sandbox events <sandbox-id> to list events. See CLI Commands for options.
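Once events are retrieved, a small amount of local processing is often enough for auditing. The sketch below assumes events arrive as plain objects with the fields listed in the table. The SandboxEvent interface, both helper functions, and the sample data (including the evt_ identifiers and timestamps) are made up for illustration.

```typescript
// Field names follow the event table; the payload is typed as unknown because
// its contents are arbitrary.
interface SandboxEvent {
  eventId: string;
  sandboxId: string;
  type: string;
  event: unknown;
  createdAt: string; // ISO timestamp
}

// Count how many events of each type were recorded.
function countByType(events: SandboxEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of events) {
    counts[item.type] = (counts[item.type] ?? 0) + 1;
  }
  return counts;
}

// Return the most recent event by parsed timestamp, or undefined for an empty list.
function latestEvent(events: SandboxEvent[]): SandboxEvent | undefined {
  let latest: SandboxEvent | undefined;
  for (const item of events) {
    if (!latest || Date.parse(item.createdAt) > Date.parse(latest.createdAt)) {
      latest = item;
    }
  }
  return latest;
}

// Made-up sample data for illustration only.
const sampleEvents: SandboxEvent[] = [
  { eventId: 'evt_1', sandboxId: 'sbx_abc123', type: 'create', event: {}, createdAt: '2025-01-01T10:00:00Z' },
  { eventId: 'evt_2', sandboxId: 'sbx_abc123', type: 'lifecycle:started', event: {}, createdAt: '2025-01-01T10:00:02Z' },
  { eventId: 'evt_3', sandboxId: 'sbx_abc123', type: 'destroy', event: {}, createdAt: '2025-01-01T10:05:00Z' },
];

console.log(countByType(sampleEvents));
console.log(latestEvent(sampleEvents)?.type); // destroy
```

Comparing parsed timestamps rather than relying on array order keeps latestEvent correct for either direction setting.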
Resume Paused Sandboxes
sandbox.execute() automatically resumes a suspended sandbox and returns autoResumed: true on the execution result. Call sandbox.resume() when you want the sandbox awake before issuing a batch of commands.
await sandbox.resume();

const execution = await sandbox.execute({ command: ['bun', 'run', 'test'] });
ctx.logger
  .child({ executionId: execution.executionId, autoResumed: execution.autoResumed })
  .info('Sandbox command completed');

Use explicit sandbox.resume() (or agentuity cloud sandbox resume) when startup latency matters before the first command. For single commands, execute() can wake the sandbox for you.
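For the batch case, the explicit resume can be folded into a helper that wakes the sandbox once and then runs commands in order. This is a sketch only. ExecutionResult, ResumableSandbox, and runBatch are hypothetical names, the interfaces declare just the resume() and execute() members and result fields mentioned on this page, and stopping at the first non-zero exit code is a design choice of this example rather than SDK behavior.

```typescript
// Hypothetical structural types: only the members this sketch uses.
interface ExecutionResult {
  exitCode: number;
  autoResumed?: boolean;
}

interface ResumableSandbox {
  resume(): Promise<unknown>;
  execute(options: { command: string[] }): Promise<ExecutionResult>;
}

// Wake the sandbox once, then run commands in order, stopping at the first failure.
async function runBatch(
  sandbox: ResumableSandbox,
  commands: string[][],
): Promise<ExecutionResult[]> {
  // Pay the resume latency up front instead of inside the first execute().
  await sandbox.resume();

  const results: ExecutionResult[] = [];
  for (const command of commands) {
    const result = await sandbox.execute({ command });
    results.push(result);
    if (result.exitCode !== 0) {
      break; // Later commands likely depend on this one succeeding.
    }
  }
  return results;
}
```

A call such as runBatch(sandbox, [['bun', 'install'], ['bun', 'run', 'test']]) returns one result per command that actually ran, so a shorter array than the input signals an early stop.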
When to Use Sandbox
| Use Case | Example |
|---|---|
| Code execution agents | Run user-provided Python/JavaScript safely |
| Code validation | Verify generated code compiles and runs |
| AI coding assistants | Execute code suggested by LLMs |
| Automated testing | Run tests in clean environments |
| Build systems | Compile projects in isolated containers |
Security
Sandboxes provide isolation through:
- Network disabled by default: Enable explicitly when needed
- Resource limits: Prevent resource exhaustion
- Execution timeouts: Prevent runaway processes
- Filesystem isolation: Each sandbox has its own workspace
Next Steps
- SDK Usage: Detailed API for file I/O, streaming, and advanced configuration
- Snapshots: Skip dependency installation with pre-configured environments
- CLI Commands: Debug sandboxes and create snapshots manually