Agentuity Documentation

Execute code in isolated Linux containers with configurable resource limits, network controls, and execution timeouts.

Why Sandboxes?

Agents that reason about code need somewhere safe to execute it. Whether generating Python scripts, validating builds, or running user-provided code, you can't let arbitrary execution happen on your infrastructure.

The pattern keeps repeating: spin up a secure environment, run code, tear it down. Without proper isolation, a single bad script could access sensitive data, exhaust resources, or compromise your systems.

Agentuity sandboxes handle the isolation layer. One-shot runs create a sandbox, execute a command, and destroy it. Interactive sandboxes keep their filesystem until you destroy them or the idle timeout reaps them.

What this gives you:

Security by default: Network disabled, filesystem isolated, resource limits enforced
No infrastructure management: Containers spin up and tear down automatically
Multi-language support: Run Python, Node.js, shell scripts, and more
Consistent environments: Use snapshots to get the same setup every time, with dependencies pre-installed

Three Ways to Use Sandboxes

Method	Best For
Web App	Visual management, browsing runtimes and snapshots
SDK	Programmatic use in agents and routes (`ctx.sandbox`)
CLI	Local development, scripting, CI/CD

Multi-language Execution

Your agents are written in TypeScript, but a sandbox can run code in other languages. Use `ctx.sandbox.run()` to execute Python, Node.js, shell scripts, or any tool installable with `apt install`, all inside an isolated container.
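As a minimal sketch of multi-language dispatch, the helper below maps a language to one of the runtimes listed on this page and to an argv-style command. The `Language` type and `buildRunRequest` name are hypothetical, and passing `runtime` to `ctx.sandbox.run()` is an assumption based on the Configuration Options table. Handing the source to the interpreter as a single argv element avoids shell quoting problems.

```typescript
// Hypothetical helper, not part of the Agentuity SDK.
// Runtime names and the { exec: [...] } command shape are taken from this page.
type Language = 'python' | 'node' | 'shell';

interface RunRequest {
  runtime: string;
  command: { exec: string[] };
}

function buildRunRequest(language: Language, source: string): RunRequest {
  switch (language) {
    case 'python':
      // python3 -c <source> runs the string as a program
      return { runtime: 'python:3.14', command: { exec: ['python3', '-c', source] } };
    case 'node':
      // node -e <source> evaluates the string as JavaScript
      return { runtime: 'node:lts', command: { exec: ['node', '-e', source] } };
    case 'shell':
      // sh -c <source> hands the string to the shell
      return { runtime: 'base:latest', command: { exec: ['sh', '-c', source] } };
  }
}
```

Inside a handler this could be used as `await ctx.sandbox.run(buildRunRequest('python', code))`, assuming the option names above match the SDK.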

Key Concepts

Concept	Description
Runtime	A pre-configured base environment (OS + language tools) provided by Agentuity
Sandbox	A running container created from a runtime where you execute commands
Snapshot	A saved sandbox state that can be used to create new sandboxes
Checkpoint	A saved filesystem state for one sandbox, used by pause/resume and restore workflows

Runtimes, sandboxes, and snapshots build on each other: Runtime → Sandbox → Snapshot. Checkpoints are sandbox-scoped: you restore the same sandbox back to a saved filesystem state instead of creating a reusable base image.

1. Pick a runtime (e.g., `bun:1` or `node:latest`)
2. Create a sandbox from that runtime
3. Optionally save a snapshot to reuse your configured environment

Runtimes

Runtimes are pre-configured base environments that Agentuity provides. Each includes an operating system, language toolchain, and common utilities.

Language Runtimes

Use these for general code execution:

Runtime	Description
`base:latest`	Minimal base runtime with essential tools (default)
`bun:1`	Bun 1.x with JavaScript/TypeScript support
`node:latest`	Node.js latest version
`node:lts`	Node.js LTS version
`python:3.13`	Python 3.13 with uv package manager
`python:3.14`	Python 3.14 with uv package manager

Agent Runtimes

Pre-configured AI coding assistants:

Runtime	Description
`claude-code:latest`	Claude Code AI assistant
`amp:latest`	Amp AI coding assistant
`opencode:latest`	OpenCode AI coding assistant

List Available Runtimes

Run `agentuity cloud sandbox runtime list` to see all available runtimes, or view them in the Web App under Services > Sandbox > Runtimes.

Runtime Metadata

Each runtime includes metadata for identification and resource planning:

Field	Description
`description`	What the runtime provides
`iconUrl`	URL to runtime icon
`brandColor`	Hex color for UI display
`url`	Link to runtime documentation or homepage
`tags`	Categories like `language`, `testing`, `agent`
`requirements`	Minimum memory, CPU, disk, and `networkEnabled` requirements

View runtime details with `agentuity cloud sandbox runtime list --json`.

Snapshots

A snapshot captures the filesystem state of a sandbox. You create new sandboxes from a snapshot rather than running it directly.

Snapshots build on top of runtimes. When you create a snapshot, it includes everything from the base runtime plus your installed dependencies and files.

Workflow:

1. Create a sandbox from a runtime
2. Install dependencies and configure the environment
3. Save a snapshot
4. Create new sandboxes from that snapshot (fast, no reinstallation needed)

See Creating and Using Snapshots for details.

Two Execution Modes

Choose based on your use case:

One-shot (`sandbox.run()`)

Creates a sandbox, runs a single command, then destroys the sandbox. Best for stateless code execution.

import { createAgent } from '@agentuity/runtime';
 
const agent = createAgent('CodeRunner', {
  handler: async (ctx, input) => {
    const result = await ctx.sandbox.run({
      command: { exec: ['python3', '-c', 'print("Hello!")'] },
      resources: { memory: '256Mi', cpu: '500m' },
    });
 
    ctx.logger.info('Output', { stdout: result.stdout, exitCode: result.exitCode });
    return { output: result.stdout, exitCode: result.exitCode };
  },
});

Interactive (`sandbox.create()`)

Creates a persistent sandbox for multiple commands. Best for stateful workflows like dependency installation.

import { createAgent } from '@agentuity/runtime';
 
const agent = createAgent('ProjectBuilder', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      runtime: 'node:lts',
      resources: { memory: '1Gi' },
      network: { enabled: true },  // Required for package installation
    });
 
    try {
      await sandbox.execute({ command: ['npm', 'init', '-y'] });
      await sandbox.execute({ command: ['npm', 'install', 'zod'] });
      const result = await sandbox.execute({
        command: ['node', '-e', 'console.log("ready")'],
      });
 
      return { exitCode: result.exitCode };
    } finally {
      await sandbox.destroy();
    }
  },
});
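The interactive example above does not inspect the exit codes of the setup commands. A minimal sketch of a fail-fast runner follows, assuming `sandbox.execute()` resolves with an `exitCode` even when the command fails (as the example suggests). The `CommandRunner` interface and `runSteps` name are hypothetical; they only describe the part of the sandbox object this helper needs.

```typescript
// Hypothetical helper, not part of the Agentuity SDK.
interface ExecResult {
  exitCode: number;
}

// Structural view of the sandbox: only execute() is needed here.
interface CommandRunner {
  execute(options: { command: string[] }): Promise<ExecResult>;
}

interface StepsOutcome {
  ok: boolean;
  exitCode: number;
  failedStep?: string[];
}

// Runs commands in order and stops at the first non-zero exit code.
async function runSteps(sandbox: CommandRunner, steps: string[][]): Promise<StepsOutcome> {
  for (const command of steps) {
    const result = await sandbox.execute({ command });
    if (result.exitCode !== 0) {
      return { ok: false, exitCode: result.exitCode, failedStep: command };
    }
  }
  return { ok: true, exitCode: 0 };
}
```

With this in place, the `try` block could call `runSteps(sandbox, [['npm', 'init', '-y'], ['npm', 'install', 'zod']])` and return early when `ok` is false, while the `finally` block still destroys the sandbox.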

Background Jobs

Jobs let you run long-running commands in a sandbox without blocking. Unlike regular execution, jobs:

Run in parallel: Multiple jobs can execute simultaneously
Don't block: Control returns immediately after creation
Persist: Jobs continue even after the creating request completes
Capture output: Stdout/stderr are captured to streams for later retrieval

Creating Jobs

import { createAgent } from '@agentuity/runtime';
 
const agent = createAgent('JobRunner', {
  handler: async (ctx, input) => {
    const sandbox = await ctx.sandbox.create({
      runtime: 'node:lts',
      resources: { memory: '2Gi' },
      network: { enabled: true },
    });
 
    // Create a background job
    const job = await sandbox.createJob({
      command: ['sh', '-c', 'sleep 30 && echo done'],
    });
 
    ctx.logger.info('Job started', { jobId: job.jobId });
 
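    // Check status. Right after creation the job is usually still pending or
    // running, so check again later (or from another request) to see the result.
    const status = await sandbox.getJob(job.jobId);
    if (status.status === 'completed') {
      ctx.logger.info('Job completed', { exitCode: status.exitCode });
    }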
 
    return { jobId: job.jobId };
  },
});

Job Lifecycle

Status	Description
`pending`	Job created, waiting to start
`running`	Job actively executing
`completed`	Finished with exit code 0
`failed`	Finished with non-zero exit code
`cancelled`	Terminated by user request
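The statuses above split into two in-progress states (`pending`, `running`) and three terminal ones. A minimal polling sketch built on that split is shown below. `waitForJob`, `isTerminal`, and the option names are hypothetical, and the `JobInfo` shape only assumes the `status` and `exitCode` fields used in the job example.

```typescript
// Hypothetical helpers, not part of the Agentuity SDK.
type JobStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

interface JobInfo {
  status: JobStatus;
  exitCode?: number;
}

// A job is finished once it is completed, failed, or cancelled.
function isTerminal(status: JobStatus): boolean {
  return status === 'completed' || status === 'failed' || status === 'cancelled';
}

// Polls getJob until the job reaches a terminal status or attempts run out.
async function waitForJob(
  getJob: () => Promise<JobInfo>,
  { intervalMs = 1000, maxAttempts = 60 } = {},
): Promise<JobInfo> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    if (isTerminal(job.status)) return job;
    // Wait before the next poll
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job did not reach a terminal status after ${maxAttempts} polls`);
}
```

A call might look like `await waitForJob(() => sandbox.getJob(job.jobId))`. Because jobs persist after the creating request completes, long jobs are often better checked from a later request than polled inside one handler.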

Stopping Jobs

// Graceful stop (SIGTERM, then SIGKILL after grace period)
await sandbox.stopJob(job.jobId);
 
// Force kill immediately
await sandbox.stopJob(job.jobId, true);

Use Cases

Use Case	Example
Build processes	Run `npm run build` in background
Long-running tests	Execute test suites without blocking
Data processing	Process large files asynchronously
Service daemons	Run background services in sandbox

SDK Access

Context	Access
Agents	`ctx.sandbox`
Routes	`c.var.sandbox`

The API is identical in both contexts.
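Because the API is the same in agents and routes, shared logic can take the sandbox service as a parameter. The sketch below assumes only the `run()` call and the `stdout` and `exitCode` result fields shown earlier on this page; the `SandboxService` interface and `runPython` name are hypothetical, and whether the SDK's own types are assignable to this interface is an assumption.

```typescript
// Hypothetical helper, not part of the Agentuity SDK.
interface RunResult {
  stdout: string;
  exitCode: number;
}

// Structural view of the sandbox service: only run() is needed here.
interface SandboxService {
  run(options: { command: { exec: string[] } }): Promise<RunResult>;
}

// Runs a Python snippet as a one-shot command on whichever service is passed in.
async function runPython(sandbox: SandboxService, source: string): Promise<RunResult> {
  return sandbox.run({ command: { exec: ['python3', '-c', source] } });
}
```

An agent handler would call `runPython(ctx.sandbox, code)` and a route handler `runPython(c.var.sandbox, code)`.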

Configuration Options

Option	Description	Example
`runtime`	Runtime environment	`'bun:1'`, `'python:3.14'`
`resources.memory`	Memory limit (Kubernetes-style)	`'512Mi'`, `'1Gi'`
`resources.cpu`	CPU limit in millicores	`'500m'`, `'1000m'`
`resources.disk`	Disk space limit	`'1Gi'`
`network.enabled`	Allow outbound network	`true` (default: `false`)
`network.port`	Port to expose to the internet (1024-65535)	`3000`
`projectId`	Associate sandbox with a project	`'proj_abc123'`
`timeout.idle`	Idle timeout before cleanup	`'10m'`, `'1h'`
`timeout.execution`	Max execution time per command	`'5m'`, `'30s'`
`dependencies`	Apt packages to install	`['python3', 'git']`
`packages`	npm/bun packages to install globally	`['typescript', 'tsx']`
`env`	Environment variables	`{ NODE_ENV: 'test' }`
`snapshot`	Create from existing snapshot	`'my-env'` or `'snp_abc123'`
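To make the string formats in the table concrete, here is a small sketch that converts them to numbers and checks the port range before a sandbox is requested. All function names are hypothetical. The unit handling is an assumption based on the Kubernetes-style examples above (binary `Ki`/`Mi`/`Gi` suffixes, `m` for millicores, `s`/`m`/`h` for durations); the platform may accept other forms.

```typescript
// Hypothetical helpers, not part of the Agentuity SDK.
const MEMORY_UNITS = { Ki: 1024, Mi: 1024 * 1024, Gi: 1024 * 1024 * 1024 };
const DURATION_UNITS = { s: 1000, m: 60 * 1000, h: 60 * 60 * 1000 };

// '512Mi' -> number of bytes
function parseMemory(value: string): number {
  const match = /^(\d+)(Ki|Mi|Gi)$/.exec(value);
  if (!match) throw new Error(`Unsupported memory value: ${value}`);
  return Number(match[1]) * MEMORY_UNITS[match[2] as keyof typeof MEMORY_UNITS];
}

// '500m' -> 500 millicores (half a core)
function parseCpuMillicores(value: string): number {
  const match = /^(\d+)m$/.exec(value);
  if (!match) throw new Error(`Unsupported CPU value: ${value}`);
  return Number(match[1]);
}

// '10m' -> milliseconds
function parseDuration(value: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(value);
  if (!match) throw new Error(`Unsupported duration: ${value}`);
  return Number(match[1]) * DURATION_UNITS[match[2] as keyof typeof DURATION_UNITS];
}

// network.port must be an integer in the documented 1024-65535 range
function isExposablePort(port: number): boolean {
  return Math.floor(port) === port && port >= 1024 && port <= 65535;
}
```

Validating values like these before calling `ctx.sandbox.create()` gives callers a clear local error instead of a rejected request.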

Sandbox Events

Every sandbox records lifecycle events as it transitions through states. Use `sandboxEventList` to retrieve these events for auditing or debugging.

Server Utility

`sandboxEventList` is a server-side function from `@agentuity/server`. It requires an `APIClient` instance, so it is typically used in CLI tools or standalone scripts rather than agent handlers.

import { sandboxEventList } from '@agentuity/server';
 
// `client` is an existing APIClient instance (construction not shown here)
const { events } = await sandboxEventList(client, {
  sandboxId: 'sbx_abc123',
  limit: 50,            // optional, default 50
  direction: 'asc',    // optional: 'asc' (oldest first, default) or 'desc'
});

Each event includes:

Field	Description
`eventId`	Unique identifier for the event
`sandboxId`	ID of the sandbox
`type`	Event type (e.g., `create`, `destroy`, `lifecycle:started`)
`event`	Arbitrary payload data for the event
`createdAt`	ISO timestamp when the event was recorded

From the CLI, use `agentuity cloud sandbox events <sandbox-id>` to list events. See CLI Commands for options.
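For auditing, the event fields listed above are enough to summarize a sandbox's history. The sketch below is illustrative only: the `SandboxEvent` interface mirrors the field table (the exact SDK type may differ), and `countByType` and `elapsedMs` are hypothetical helper names.

```typescript
// Hypothetical types and helpers, not part of the Agentuity SDK.
interface SandboxEvent {
  eventId: string;
  sandboxId: string;
  type: string;
  event: unknown;    // arbitrary payload
  createdAt: string; // ISO timestamp
}

// Counts how many events of each type were recorded.
function countByType(events: SandboxEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    counts[e.type] = (counts[e.type] ?? 0) + 1;
  }
  return counts;
}

// Milliseconds between the earliest and latest recorded event.
function elapsedMs(events: SandboxEvent[]): number {
  if (events.length === 0) return 0;
  const times = events.map((e) => Date.parse(e.createdAt));
  return Math.max(...times) - Math.min(...times);
}
```

Both helpers work on the `events` array returned by `sandboxEventList`, regardless of the `direction` used to fetch it.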

Resume Paused Sandboxes

`sandbox.execute()` automatically resumes a suspended sandbox and returns `autoResumed: true` on the execution result. Call `sandbox.resume()` when you want the sandbox awake before issuing a batch of commands.

await sandbox.resume();
const execution = await sandbox.execute({ command: ['bun', 'run', 'test'] });
 
ctx.logger
  .child({ executionId: execution.executionId, autoResumed: execution.autoResumed })
  .info('Sandbox command completed');

Resume before batches

Call `sandbox.resume()` explicitly (or run `agentuity cloud sandbox resume`) when you want to pay the wake-up latency up front instead of on the first command of a batch. For single commands, `execute()` can wake the sandbox for you.

When to Use Sandboxes

Use Case	Example
Code execution agents	Run user-provided Python/JavaScript safely
Code validation	Verify generated code compiles and runs
AI coding assistants	Execute code suggested by LLMs
Automated testing	Run tests in clean environments
Build systems	Compile projects in isolated containers

Security

Sandboxes provide isolation through:

Network disabled by default: Enable explicitly when needed
Resource limits: Prevent resource exhaustion
Execution timeouts: Prevent runaway processes
Filesystem isolation: Each sandbox has its own workspace

Next Steps

SDK Usage: Detailed API for file I/O, streaming, and advanced configuration
Snapshots: Skip dependency installation with pre-configured environments
CLI Commands: Debug sandboxes and create snapshots manually