Using the Sandbox API

Create, execute, inspect, and clean up sandboxes with SandboxClient

Use SandboxClient when a script, route, worker, or agent process needs to create and control sandboxes.

import { SandboxClient } from '@agentuity/sandbox';
 
const client = new SandboxClient({
  apiKey: process.env.AGENTUITY_SDK_KEY,
});

apiKey is optional when AGENTUITY_SDK_KEY or AGENTUITY_CLI_KEY is already set.

Choose a Lifecycle

| Workflow | Start With | Check Before You Treat It as Done |
| --- | --- | --- |
| one bounded command with captured output | client.run() | exitCode, stdout, stderr, timeout, and any app-level output contract |
| multiple commands against the same files | client.create() plus sandbox.execute() | each execution status, exit code, and cleanup in a finally block |
| a server or daemon that outlives the request | client.createJob() | job status, readiness probe, stream URLs, and stop/destroy behavior |
| a workspace you may resume later | pause(), resume(), and checkpoints | paused timeout, terminatesAt, and whether the next execution auto-resumed |
| repeatable environments with preinstalled files | snapshots | snapshot tag or ID, lineage, and whether new sandboxes omit runtime |
| repo-aware coding work with review and reconnects | Coder | session output contract, session state, and event history |

Sandboxes are process and filesystem primitives. Use them when your app should own the command, files, timeouts, output parsing, and cleanup. Use Coding agents in sandboxes when you need direct control over a coding-agent runtime, and use Coder when the work should be a managed session.

Runtime Catalog

runtime is a catalog name available to your org, not a TypeScript union. Call client.listRuntimes() in setup, admin tooling, or startup checks before storing a runtime name in a long-lived workflow.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
const { runtimes } = await client.listRuntimes({ limit: 50 });
 
logger.info('sandbox runtimes', {
  names: runtimes.map((runtime) => runtime.name),
});

Coding-agent images can appear in the same catalog as language runtimes. Runtime availability only proves the image is listed; provider credentials, network access, and tool permissions are separate checks.

One-Shot Execution

Use client.run() for a disposable sandbox. The client creates a one-shot sandbox, runs the command, captures the output, and destroys the sandbox automatically after the command exits.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'python:3.14',
  command: {
    exec: ['python3', '-c', 'print(sum([1, 2, 3]))'],
  },
  resources: { memory: '256Mi', cpu: '500m' },
  timeout: { execution: '30s' },
});
 
logger.info('sandbox run finished', {
  exitCode: result.exitCode,
  stdout: result.stdout,
  stderr: result.stderr,
});
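
A one-shot result is only useful once its exit code has been checked. The sketch below shows one way to do that, assuming the result carries the exitCode, stdout, and stderr fields listed under SandboxRunResult. RunResultLike and requireSuccess are hypothetical names for illustration, not part of the SDK.

```typescript
// Structural subset of SandboxRunResult, as documented in the reference tables.
interface RunResultLike {
  exitCode: number;
  stdout?: string;
  stderr?: string;
}

// Hypothetical helper: returns trimmed stdout on success, or throws with the
// exit code and stderr attached when the command exited non-zero.
function requireSuccess(result: RunResultLike): string {
  if (result.exitCode !== 0) {
    const detail = (result.stderr ?? '').trim() || 'no stderr captured';
    throw new Error(`sandbox command exited with code ${result.exitCode}: ${detail}`);
  }
  return (result.stdout ?? '').trim();
}
```

Passing the result of client.run() through a check like this keeps failure handling in one place. Any app-level output contract, such as expecting JSON on stdout, still needs its own validation.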

Write Files Before Running

Pass command.files to write files into the sandbox before the one-shot command starts.

import { Buffer } from 'node:buffer';
import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'bun:1',
  command: {
    exec: ['bun', 'run', 'index.ts'],
    files: [
      {
        path: 'index.ts',
        content: Buffer.from('process.stdout.write("hello from TypeScript")'),
      },
      {
        path: 'data.json',
        content: Buffer.from(JSON.stringify({ items: [1, 2, 3] })),
      },
    ],
  },
});
 
logger.info('sandbox output', { stdout: result.stdout });

Stream Output

The optional second argument to run() accepts Node streams and an AbortSignal.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const result = await client.run(
  {
    runtime: 'bun:1',
    command: { exec: ['bun', '-e', 'process.stderr.write("err"); process.stdout.write("out")'] },
  },
  {
    stdout: process.stdout,
    stderr: process.stderr,
  }
);
 
logger.info('sandbox run finished', { exitCode: result.exitCode });

If you pass stdin in the second argument, the client needs an API key value so it can create the backing stream.

Interactive Sandboxes

Use client.create() when the workflow needs persistent files or multiple commands.

import { Buffer } from 'node:buffer';
import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
const sandbox = await client.create({
  runtime: 'bun:1',
  resources: { memory: '1Gi', cpu: '1000m' },
  network: { enabled: true },
  timeout: { idle: '10m', execution: '2m' },
});
 
try {
  await sandbox.execute({ command: ['bun', 'init', '-y'] });
  await sandbox.execute({ command: ['bun', 'add', 'zod'] });
 
  await sandbox.writeFiles([
    {
      path: 'index.ts',
      content: Buffer.from(`
        import { z } from 'zod';
 
        const schema = z.object({ name: z.string() });
        process.stdout.write(schema.parse({ name: 'Ada' }).name);
      `),
    },
  ]);
 
  const execution = await sandbox.execute({
    command: ['bun', 'run', 'index.ts'],
  });
 
  logger.info('sandbox execution finished', {
    executionId: execution.executionId,
    exitCode: execution.exitCode,
  });
} finally {
  await sandbox.destroy();
}

File Operations

Interactive sandbox instances support file reads, writes, directory creation, listing, and removal.

import { Buffer } from 'node:buffer';
import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
const sandbox = await client.create({ runtime: 'bun:1' });
 
try {
  await sandbox.mkDir('src', true);
 
  const filesWritten = await sandbox.writeFiles([
    { path: 'src/index.ts', content: Buffer.from('process.stdout.write("ok")') },
  ]);
 
  const files = await sandbox.listFiles('src');
  const stream = await sandbox.readFile('src/index.ts');
  const source = await readText(stream);
 
  logger.info('sandbox files', { filesWritten, files, source });
} finally {
  await sandbox.destroy();
}
 
async function readText(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let output = '';
 
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    output += decoder.decode(value, { stream: true });
  }
 
  return output + decoder.decode();
}

Remove files and directories with rmFile() and rmDir():

import { logger } from '@agentuity/telemetry';
 
const removedFile = await sandbox.rmFile('src/index.ts');
const removedDir = await sandbox.rmDir('src', true);
 
logger.info('sandbox files removed', {
  fileExisted: removedFile.found,
  dirExisted: removedDir.found,
});

Reading Execution Output

sandbox.execute() waits for the command to finish and returns stream URLs when output was captured. Fetch those URLs when you need stdout or stderr after the command completes.

import { logger } from '@agentuity/telemetry';
 
const execution = await sandbox.execute({
  command: ['bun', '-e', 'process.stdout.write("build output")'],
});
 
const stdout = execution.stdoutStreamUrl
  ? await readOutputUrl(execution.stdoutStreamUrl)
  : '';
 
logger.info('sandbox output fetched', {
  exitCode: execution.exitCode,
  stdout,
});
 
async function readOutputUrl(url: string): Promise<string> {
  const response = await fetch(url);
 
  if (!response.ok || !response.body) {
    throw new Error(`Unable to read sandbox output: ${response.status}`);
  }
 
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let output = '';
 
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    output += decoder.decode(value, { stream: true });
  }
 
  return output + decoder.decode();
}

Streaming Execution Output

Use pipe in execute() to stream stdout or stderr directly to Node writable streams instead of buffering them for later fetch.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
const sandbox = await client.create({ runtime: 'bun:1' });
 
try {
  const execution = await sandbox.execute({
    command: ['bun', '-e', 'process.stdout.write("line 1"); process.stderr.write("err 1")'],
    pipe: {
      stdout: process.stdout,
      stderr: process.stderr,
    },
  });
 
  logger.info('sandbox execution finished', { exitCode: execution.exitCode });
} finally {
  await sandbox.destroy();
}

pipe and stream are separate concerns: pipe writes to a local writable stream; stream configures server-side stream IDs for routing output elsewhere.

Environment Variables

Pass environment variables when creating or running a sandbox.

import { logger } from '@agentuity/telemetry';
 
const result = await client.run({
  runtime: 'bun:1',
  command: { exec: ['bun', '-e', 'process.stdout.write(process.env.API_URL ?? "")'] },
  env: {
    API_URL: 'https://api.example.com',
  },
});
 
logger.info('sandbox env output', { stdout: result.stdout });

Update an interactive sandbox with setEnv(). Set a value to null to delete that variable.

import { logger } from '@agentuity/telemetry';
 
const updated = await sandbox.setEnv({
  API_URL: 'https://api.example.com',
  DEBUG: 'true',
});
 
await sandbox.setEnv({
  DEBUG: null,
});
 
logger.info('sandbox env updated', { apiUrl: updated.API_URL });

Code inside the sandbox is a separate process. Pass values explicitly with env, setEnv(), or files.

Exposing Ports

Set network.port to expose one port from the sandbox. Also set network.enabled: true when commands inside the sandbox need outbound network access.

import { Buffer } from 'node:buffer';
import { logger } from '@agentuity/telemetry';
 
const sandbox = await client.create({
  runtime: 'bun:1',
  network: {
    enabled: true,
    port: 3000,
  },
});
 
try {
  await sandbox.writeFiles([
    {
      path: 'server.ts',
      content: Buffer.from(`
        Bun.serve({
          port: 3000,
          fetch: () => new Response('ok'),
        });
      `),
    },
  ]);
 
  await client.createJob(sandbox.id, {
    command: ['bun', 'run', 'server.ts'],
  });
 
  const info = await sandbox.get();
  logger.info('sandbox exposed url', { url: info.url });
} finally {
  await sandbox.destroy();
}

Project Association

Pass projectId when you want sandboxes to show up with a specific project or be filterable later.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const sandbox = await client.create({
  runtime: 'bun:1',
  projectId: 'proj_abc123',
});
 
try {
  const { sandboxes } = await client.list({
    projectId: 'proj_abc123',
    limit: 10,
  });
 
  logger.info('project sandboxes', {
    sandboxIds: sandboxes.map((item) => item.sandboxId),
  });
} finally {
  await sandbox.destroy();
}

Cancelling Commands

Pass an AbortSignal to execute() or as the second argument to run().

import { logger } from '@agentuity/telemetry';
 
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 5_000);
 
try {
  const execution = await sandbox.execute({
    command: ['sh', '-c', 'sleep 60'],
    signal: controller.signal,
  });
 
  logger.info('sandbox execution status', { status: execution.status });
} catch (error) {
  if (error instanceof DOMException && error.name === 'AbortError') {
    logger.info('sandbox execution cancelled');
  } else {
    throw error;
  }
} finally {
  clearTimeout(timeout);
}

Sandbox Management

Use list(), get(), connect(), and destroy() for existing sandboxes.

import { logger } from '@agentuity/telemetry';
 
const { sandboxes, total } = await client.list({
  status: 'idle',
  limit: 10,
  offset: 0,
});
 
logger.info('sandbox page', { total });
 
for (const info of sandboxes) {
  logger.info('sandbox', {
    sandboxId: info.sandboxId,
    status: info.status,
    executions: info.executions,
  });
}
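
Because list() returns one page plus a total, collecting every match means walking the offset. The following is a minimal sketch of that loop, assuming only the limit, offset, sandboxes, and total fields shown above. listAll and the ListPage type are hypothetical helpers, not SDK exports.

```typescript
// Structural view of one list() page; only the fields used here.
interface SandboxPage<T> {
  sandboxes: T[];
  total: number;
}

type ListPage<T> = (params: { limit: number; offset: number }) => Promise<SandboxPage<T>>;

// Hypothetical helper: walks limit/offset pages until `total` items have been
// seen, or a page comes back empty (guards against looping if totals shift).
async function listAll<T>(listPage: ListPage<T>, limit = 10): Promise<T[]> {
  const items: T[] = [];
  let offset = 0;
  while (true) {
    const page = await listPage({ limit, offset });
    items.push(...page.sandboxes);
    offset += page.sandboxes.length;
    if (page.sandboxes.length === 0 || offset >= page.total) break;
  }
  return items;
}
```

With the client from the earlier examples, a call might look like listAll((page) => client.list({ status: 'idle', ...page })). Taking the page function as a parameter keeps the helper independent of the SDK and easy to test with a fake.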

get() returns metadata. connect() returns a full interactive instance with methods like execute() and writeFiles().

import { logger } from '@agentuity/telemetry';
 
const info = await client.get('sbx_abc123');
logger.info('sandbox info', { status: info.status, createdAt: info.createdAt });
 
const existing = await client.connect('sbx_abc123');
const files = await existing.listFiles();
logger.info('sandbox files', { files: files.map((file) => file.path) });
 
await existing.destroy();

Background Jobs

Use jobs for commands that should keep running after the request that started them returns.

import { logger } from '@agentuity/telemetry';
 
const sandbox = await client.create({ runtime: 'bun:1' });
 
try {
  const job = await client.createJob(sandbox.id, {
    command: ['sh', '-c', 'sleep 30 && echo done'],
  });
 
  const current = await job.get();
  logger.info('sandbox job status', { status: current.status });
 
  const { jobs } = await client.listJobs(sandbox.id, 10);
  logger.info('sandbox jobs', { jobIds: jobs.map((item) => item.jobId) });
 
  await job.stop();
} finally {
  await sandbox.destroy();
}

Job statuses are pending, running, completed, failed, or cancelled.
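
Polling a job until it settles can be sketched as below. The code assumes that completed, failed, and cancelled are final states and that job.get() returns an object with a status field, as in the example above. waitForJob, JobLike, and the default interval and attempt values are hypothetical choices.

```typescript
type JobStatus = 'pending' | 'running' | 'completed' | 'failed' | 'cancelled';

// Structural view of a job handle: only get() is needed for polling.
interface JobLike {
  get(): Promise<{ status: JobStatus }>;
}

// Assumption: these three statuses are final; pending and running are not.
const TERMINAL_JOB_STATUSES: ReadonlySet<JobStatus> = new Set<JobStatus>([
  'completed',
  'failed',
  'cancelled',
]);

// Hypothetical helper: polls until the job reaches a terminal status,
// and throws if it is still pending or running after maxAttempts polls.
async function waitForJob(
  job: JobLike,
  options: { intervalMs?: number; maxAttempts?: number } = {}
): Promise<JobStatus> {
  const { intervalMs = 1000, maxAttempts = 60 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const { status } = await job.get();
    if (TERMINAL_JOB_STATUSES.has(status)) return status;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job did not reach a terminal status after ${maxAttempts} polls`);
}
```

The attempt cap gives the caller a bounded wait, so a stuck job surfaces as an error the workflow can handle by calling job.stop() or destroying the sandbox.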

Lifecycle and Failure Checks

A successful SDK call only means the API request completed. For sandbox work, check the lifecycle surface that matches the thing you started.

| Surface | States to Watch | What to Do |
| --- | --- | --- |
| sandbox | creating, idle, running, paused, stopping, suspended, terminated, failed, deleted | use get() or list() to decide whether the sandbox can still accept work, should be resumed, or has reached a terminal state |
| execution | queued, running, completed, failed, timeout, cancelled | check exitCode, stdout/stderr, outputTruncated, and your app-level output contract |
| job | pending, running, completed, failed, cancelled | poll with job.get() or listJobs(), then stop or destroy the sandbox when the background work is no longer needed |

Common failure paths:

| Symptom | First Check |
| --- | --- |
| runtime name fails or is unavailable | run client.listRuntimes() and choose a returned name |
| command never reaches the expected output | inspect execution status, stderr, timeout, and outputTruncated |
| long-lived sandbox disappears | check idle and paused timeouts, then read sandbox events |
| background server starts but the URL is not ready | poll the app's health endpoint before sending real traffic |
| coding-agent CLI exits 0 but did not complete the task | parse the tool's event output and fail on error events |
| workflow needs repo history, review, or reconnects | use Coder instead of managing only a sandbox |

For coding-agent runtimes, see Coding agents in sandboxes. Many tools report errors in stdout/stderr events even when the process exits with code 0, so the exit code alone is not enough to confirm success.
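
For the "URL is not ready" symptom, a readiness probe can be sketched as follows. It assumes the app inside the sandbox serves some health route that returns a 2xx response once it is up. waitForReady, the Probe type, and the default timings are hypothetical, and the probe function is injected so the sketch does not depend on a particular HTTP client.

```typescript
// Minimal fetch-like signature so the probe can be swapped out in tests.
type Probe = (url: string) => Promise<{ ok: boolean }>;

// Hypothetical helper: polls `url` until the probe reports ok.
// Errors thrown by the probe (for example, connection refused) are treated
// as "not ready yet" and retried. Returns the number of probes it took.
async function waitForReady(
  url: string,
  probe: Probe,
  options: { intervalMs?: number; maxAttempts?: number } = {}
): Promise<number> {
  const { intervalMs = 500, maxAttempts = 20 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      const response = await probe(url);
      if (response.ok) return attempt;
    } catch {
      // Server not accepting connections yet: fall through and retry.
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`${url} was not ready after ${maxAttempts} probes`);
}
```

In practice the probe would typically be (target) => fetch(target), called with the URL returned by sandbox.get() plus whatever health path the app defines.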

Pause, Resume, and Checkpoints

Pause a sandbox when you want to preserve its filesystem but stop the runtime. Resume it before issuing a batch of commands.

import { logger } from '@agentuity/telemetry';
 
await sandbox.pause();
await sandbox.resume();
 
const execution = await sandbox.execute({
  command: ['bun', 'run', 'test'],
});
 
logger.info('sandbox execution finished', { autoResumed: execution.autoResumed });

Set timeout.paused when creating a sandbox to limit how long it can remain paused before the platform terminates it. pause() returns a SandboxPauseResult that includes terminatesAt (an ISO 8601 timestamp) when a paused timeout is configured.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
const sandbox = await client.create({
  runtime: 'bun:1',
  timeout: {
    idle: '10m',
    // terminate the sandbox if it stays paused for more than 24 hours
    paused: '24h',
  },
});
 
try {
  const pauseResult = await sandbox.pause();
  // terminatesAt is set when a paused timeout is configured
  if (pauseResult.terminatesAt !== undefined) {
    logger.info('paused sandbox terminates at', {
      terminatesAt: pauseResult.terminatesAt,
    });
  }
} finally {
  await sandbox.destroy();
}

Pass '0s' for timeout.paused to allow the sandbox to remain paused indefinitely. The CLI flag is --paused-timeout on agentuity cloud sandbox create.
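
Since terminatesAt is an ISO 8601 timestamp, the remaining paused time can be derived with plain date arithmetic. This is a small sketch under that assumption; msUntilTermination is a hypothetical helper name, and it treats a missing terminatesAt as "no paused deadline".

```typescript
// Hypothetical helper: milliseconds left before a paused sandbox is terminated.
// Returns undefined when terminatesAt is absent (no paused timeout applies),
// and 0 when the deadline has already passed.
function msUntilTermination(
  terminatesAt: string | undefined,
  now: Date = new Date()
): number | undefined {
  if (terminatesAt === undefined) return undefined;
  const deadline = Date.parse(terminatesAt);
  if (Number.isNaN(deadline)) {
    throw new Error(`terminatesAt is not a parseable timestamp: ${terminatesAt}`);
  }
  return Math.max(0, deadline - now.getTime());
}
```

A scheduler could use the returned value to decide whether to resume the sandbox or take a snapshot before the deadline.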

Disk checkpoints are named restore points for one sandbox.

import { logger } from '@agentuity/telemetry';
 
const checkpoint = await client.createDiskCheckpoint(sandbox.id, 'before-upgrade');
 
await sandbox.execute({ command: ['bun', 'add', 'typescript'] });
await checkpoint.restore();
 
const checkpoints = await client.listDiskCheckpoints(sandbox.id);
logger.info('sandbox checkpoints', { names: checkpoints.map((item) => item.name) });
 
await checkpoint.delete();

Snapshots

Snapshots are reusable bases for future sandboxes. Create them from a configured sandbox, then create new sandboxes with snapshot.

import { logger } from '@agentuity/telemetry';
 
const sandbox = await client.create({
  runtime: 'bun:1',
  network: { enabled: true },
});
 
try {
  await sandbox.execute({ command: ['bun', 'init', '-y'] });
  await sandbox.execute({ command: ['bun', 'add', 'zod'] });
 
  const snapshot = await client.createSnapshot(sandbox.id, {
    name: 'bun-zod',
    tag: 'bun-zod',
    description: 'Bun project with Zod installed',
  });
 
  logger.info('snapshot created', { snapshotId: snapshot.snapshotId });
} finally {
  await sandbox.destroy();
}
 
const next = await client.create({
  snapshot: 'bun-zod',
  resources: { memory: '512Mi' },
});
 
await next.destroy();

Manage snapshots with the same client:

import { logger } from '@agentuity/telemetry';
 
const { snapshots } = await client.listSnapshots({ limit: 20 });
const snapshot = await client.getSnapshot('snp_xyz789');
 
await client.tagSnapshot(snapshot.snapshotId, 'v1.0');
await client.tagSnapshot(snapshot.snapshotId, null);
await client.deleteSnapshot(snapshot.snapshotId);
 
logger.info('snapshots listed', { count: snapshots.length });

Use getSnapshotLineage() to inspect the parent chain of a snapshot, which is useful when tracing which base environment a snapshot was built from.

import { logger } from '@agentuity/telemetry';
 
const lineage = await client.getSnapshotLineage({ snapshot: 'snp_xyz789' });
 
for (const entry of lineage.lineage) {
  logger.info('snapshot lineage entry', {
    snapshotId: entry.snapshotId,
    name: entry.name,
    createdAt: entry.createdAt,
  });
}

When snapshot is set, do not also set runtime or runtimeId. The snapshot already includes its base runtime.

Events

List lifecycle events when you need to inspect sandbox history. This is the first place to look when a sandbox changed state outside the command path you expected, such as an idle timeout, failed startup, pause, resume, or termination.

import { logger } from '@agentuity/telemetry';
 
const { events } = await client.listEvents(sandbox.id, {
  limit: 50,
  direction: 'asc',
});
 
for (const event of events) {
  logger.info('sandbox event', { type: event.type, createdAt: event.createdAt });
}

Hono

In Hono apps, @agentuity/hono initializes SandboxClient once and exposes it on c.var.sandbox.

import { agentuity } from '@agentuity/hono';
import type { Services } from '@agentuity/hono';
import { Hono } from 'hono';
 
type Variables = Pick<Services, 'sandbox'>;
 
const app = new Hono<{ Variables: Variables }>();
 
app.use('*', agentuity());
 
app.post('/execute', async (c) => {
  const result = await c.var.sandbox.run({
    runtime: 'bun:1',
    command: { exec: ['bun', '-e', 'process.stdout.write("ok")'] },
  });
 
  return c.json(result);
});
 
export default app;

Configuration Reference

SandboxClientOptions

| Option | Type | Description |
| --- | --- | --- |
| apiKey | string | API key. Defaults to AGENTUITY_SDK_KEY, then AGENTUITY_CLI_KEY |
| url | string | Sandbox API URL override |
| orgId | string | Organization ID for multi-org operations |
| logger | Logger | Custom logger |

SandboxCreateOptions

| Option | Type | Description |
| --- | --- | --- |
| runtime | string | Runtime name, such as 'bun:1' or 'python:3.14' |
| runtimeId | string | Runtime ID, such as 'srt_xxx' |
| name | string | Optional sandbox name |
| description | string | Optional sandbox description |
| resources.memory | string | Memory limit, such as '256Mi' or '1Gi' |
| resources.cpu | string | CPU limit in millicores, such as '500m' or '1000m' |
| resources.disk | string | Disk limit, such as '512Mi' or '2Gi' |
| network.enabled | boolean | Enables outbound network access |
| network.port | number | Port to expose, 1024-65535 |
| projectId | string | Project ID to associate with the sandbox |
| timeout.idle | string | Idle timeout before cleanup, such as '10m' |
| timeout.execution | string | Max command duration, such as '30s' |
| timeout.paused | string | Max duration a sandbox can remain paused before termination, such as '24h'; '0s' for infinite |
| dependencies | string[] | Apt packages to install |
| packages | string[] | npm or Bun packages to install globally |
| env | Record<string, string> | Environment variables |
| files | FileToWrite[] | Files to write on creation |
| snapshot | string | Snapshot ID or tag |
| metadata | Record<string, unknown> | User-defined metadata |
| scopes | string[] | Permission scopes for automatic service access |

Pass scopes when the sandbox needs automatic credentials for platform services.

const sandbox = await client.create({
  runtime: 'bun:1',
  scopes: ['services:read', 'services:write'],
});
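
Two constraints from this page lend themselves to a pre-flight check before calling create(): the snapshot option excludes runtime and runtimeId, and network.port is documented as 1024-65535. The sketch below encodes only those two rules. CreateOptionsLike and checkCreateOptions are hypothetical names, and the API may enforce further rules not covered here.

```typescript
// Subset of SandboxCreateOptions; only the fields this check inspects.
interface CreateOptionsLike {
  runtime?: string;
  runtimeId?: string;
  snapshot?: string;
  network?: { enabled?: boolean; port?: number };
}

// Hypothetical pre-flight check: returns a list of problems, empty when none found.
function checkCreateOptions(options: CreateOptionsLike): string[] {
  const problems: string[] = [];

  // A snapshot already includes its base runtime.
  const hasRuntime = options.runtime !== undefined || options.runtimeId !== undefined;
  if (options.snapshot !== undefined && hasRuntime) {
    problems.push('snapshot is set; omit runtime and runtimeId');
  }

  // Exposed port range as documented in the options table.
  const port = options.network?.port;
  if (port !== undefined && (!Number.isInteger(port) || port < 1024 || port > 65535)) {
    problems.push(`network.port must be an integer from 1024 to 65535, got ${port}`);
  }

  return problems;
}
```

Returning a list instead of throwing on the first problem lets setup or admin tooling report every issue at once.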

ExecuteOptions

| Option | Type | Description |
| --- | --- | --- |
| command | string[] | Command and arguments |
| files | FileToWrite[] | Files to write before execution |
| timeout | string | Execution timeout override |
| stream | object | Optional stdout, stderr, and timestamp stream configuration |
| pipe | { stdout?: Writable; stderr?: Writable } | Pipe command output directly to Node writable streams |
| signal | AbortSignal | Cancels the execution |

SandboxInstance

| Property or method | Type |
| --- | --- |
| id | string |
| status | SandboxStatus |
| execute(options) | Promise<Execution> |
| writeFiles(files) | Promise<number> |
| readFile(path) | Promise<ReadableStream<Uint8Array>> |
| listFiles(path?) | Promise<SandboxFileInfo[]> |
| mkDir(path, recursive?) | Promise<void> |
| rmFile(path) | Promise<{ found: boolean }> |
| rmDir(path, recursive?) | Promise<{ found: boolean }> |
| setEnv(env) | Promise<Record<string, string>> |
| get() | Promise<SandboxInfo> |
| pause() | Promise<SandboxPauseResult> |
| resume() | Promise<void> |
| destroy() | Promise<void> |

Execution

| Field | Type | Description |
| --- | --- | --- |
| executionId | string | Unique execution ID |
| status | 'queued' \| 'running' \| 'completed' \| 'failed' \| 'timeout' \| 'cancelled' | Execution status |
| exitCode | number \| undefined | Process exit code, when available |
| durationMs | number \| undefined | Execution duration in milliseconds |
| stdoutStreamUrl | string \| undefined | URL for stdout |
| stderrStreamUrl | string \| undefined | URL for stderr |
| outputTruncated | boolean \| undefined | Whether captured output was truncated |
| autoResumed | boolean \| undefined | Whether the sandbox was resumed automatically |

SandboxRunResult

| Field | Type | Description |
| --- | --- | --- |
| sandboxId | string | Sandbox ID used for the run |
| exitCode | number | Process exit code |
| durationMs | number | Execution duration in milliseconds |
| stdout | string \| undefined | Captured stdout |
| stderr | string \| undefined | Captured stderr |

Snapshot Methods

| Method | Returns |
| --- | --- |
| createSnapshot(sandboxId, options?) | Promise<SnapshotInfo> |
| getSnapshot(snapshotId) | Promise<SnapshotInfo> |
| listSnapshots(params?) | Promise<SnapshotListResponse> |
| tagSnapshot(snapshotId, tag) | Promise<SnapshotInfo> |
| deleteSnapshot(snapshotId) | Promise<void> |
| getSnapshotLineage(params?) | Promise<SnapshotLineageResponse> |

Best Practices

  • Use run() for single commands so cleanup happens automatically.
  • List runtimes before hard-coding a runtime name in app configuration.
  • Destroy interactive sandboxes in a finally block when the workflow is finished.
  • Wait for one execute() result before starting the next command unless you intentionally started a background job.
  • Prefer snapshots when every run needs the same dependencies or files.
  • Set explicit resources and timeout values for untrusted or user-provided code.
  • Keep outbound networking disabled unless the command needs package installs or external APIs.
  • Treat stdout/stderr as data to parse, not proof of success by itself.
  • Use projectId when you need to filter sandboxes later; use metadata for context returned on sandbox detail.
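
The practice of waiting for each execute() result before starting the next can be sketched as a small loop. The code assumes an execution counts as successful only when its status is 'completed' and its exit code is 0, which is a conservative reading of the Execution fields rather than documented behavior. executeInOrder, SandboxLike, and ExecutionLike are hypothetical names.

```typescript
// Structural subsets of the Execution and SandboxInstance shapes.
interface ExecutionLike {
  status: string;
  exitCode?: number;
}

interface SandboxLike {
  execute(options: { command: string[] }): Promise<ExecutionLike>;
}

// Hypothetical helper: runs commands strictly one after another and stops at
// the first one that does not finish with status 'completed' and exit code 0.
async function executeInOrder(
  sandbox: SandboxLike,
  commands: string[][]
): Promise<ExecutionLike[]> {
  const results: ExecutionLike[] = [];
  for (const command of commands) {
    // Await each execution before starting the next command.
    const execution = await sandbox.execute({ command });
    results.push(execution);
    if (execution.status !== 'completed' || execution.exitCode !== 0) {
      const exit = execution.exitCode ?? 'unknown';
      throw new Error(`command failed (${execution.status}, exit ${exit}): ${command.join(' ')}`);
    }
  }
  return results;
}
```

Calling this inside the try block of an interactive sandbox workflow keeps destroy() in the finally block as the single cleanup path, whether the sequence completes or stops early.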

Next Steps