Using the Sandbox API — Agentuity Documentation

Programmatic API for creating and managing sandboxes

Access sandbox functionality through ctx.sandbox in agents or c.var.sandbox in routes. Choose between one-shot execution for single commands or interactive sandboxes for multi-step workflows.

One-shot Execution

Use sandbox.run() when you need to execute a single command. The sandbox is created for that command and destroyed automatically when it finishes.

import { createAgent } from '@agentuity/runtime';
import { z } from 'zod';
 
const agent = createAgent('CodeRunner', {
  schema: {
    input: z.object({ code: z.string() }),
    output: z.object({
      success: z.boolean(),
      output: z.string(),
      exitCode: z.number(),
    }),
  },
  handler: async (ctx, input) => {
    const result = await ctx.sandbox.run({
      command: {
        exec: ['python3', '-c', input.code],
      },
      resources: { memory: '256Mi', cpu: '500m' },
      timeout: { execution: '30s' },
    });
 
    return {
      success: result.exitCode === 0,
      output: result.stdout || result.stderr || '',
      exitCode: result.exitCode,
    };
  },
});
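
The handler above returns stdout or stderr, so stderr is dropped whenever stdout is non-empty. A minimal sketch of a helper that keeps both is shown here; `RunResultLike`, `RunSummary`, and `summarizeRun` are hypothetical names, and the input type only mirrors the exitCode, stdout, and stderr fields used above.

```typescript
// Hypothetical structural type: only the fields the example above reads.
interface RunResultLike {
  exitCode: number;
  stdout?: string;
  stderr?: string;
}

interface RunSummary {
  success: boolean;
  output: string;
  exitCode: number;
}

// Combine stdout and stderr instead of returning only the first non-empty one.
function summarizeRun(result: RunResultLike): RunSummary {
  const parts: string[] = [];
  if (result.stdout) parts.push(result.stdout.trimEnd());
  if (result.stderr) parts.push(`[stderr]\n${result.stderr.trimEnd()}`);
  return {
    success: result.exitCode === 0,
    output: parts.join('\n'),
    exitCode: result.exitCode,
  };
}
```

In the handler, `return summarizeRun(result);` would satisfy the same output schema.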

With File Input

Write files to the sandbox before execution:

const result = await ctx.sandbox.run({
  command: {
    exec: ['bun', 'run', 'index.ts'],
    files: [
      {
        path: 'index.ts',
        content: Buffer.from('console.log("Hello from TypeScript!")'),
      },
      {
        path: 'data.json',
        content: Buffer.from(JSON.stringify({ items: [1, 2, 3] })),
      },
    ],
  },
  resources: { memory: '512Mi' },
});
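
When several files are written at once, building the files array by hand gets repetitive. The sketch below converts a path-to-value map into the { path, content } entries shown above; `SandboxFile` and `toSandboxFiles` are hypothetical names, not part of the SDK.

```typescript
// Hypothetical helper type; the { path, content } shape mirrors the files array above.
interface SandboxFile {
  path: string;
  content: Buffer;
}

// Turn a path -> value map into file entries. Strings are written as-is;
// objects and arrays are serialized as pretty-printed JSON.
function toSandboxFiles(files: Record<string, string | object>): SandboxFile[] {
  return Object.entries(files).map(([path, value]) => ({
    path,
    content: Buffer.from(
      typeof value === 'string' ? value : JSON.stringify(value, null, 2),
    ),
  }));
}

const files = toSandboxFiles({
  'index.ts': 'console.log("Hello from TypeScript!")',
  'data.json': { items: [1, 2, 3] },
});
```

The resulting array can be passed as command.files in sandbox.run() or to sandbox.writeFiles().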

Interactive Sandbox

Use sandbox.create() for multi-step workflows. The sandbox persists across commands until you destroy it or its idle timeout (timeout.idle) expires.

import { createAgent } from '@agentuity/runtime';
 
const agent = createAgent('ProjectBuilder', {
  handler: async (ctx, input) => {
    // Create a persistent sandbox
    const sandbox = await ctx.sandbox.create({
      resources: { memory: '1Gi', cpu: '1000m' },
      network: { enabled: true },  // Allow package downloads
      dependencies: ['git'],       // Pre-install apt packages
    });
 
    try {
      // Run multiple commands in sequence
      await sandbox.execute({ command: ['npm', 'init', '-y'] });
      await sandbox.execute({ command: ['npm', 'install', 'zod'] });
 
      // Write project files
      await sandbox.writeFiles([
        {
          path: 'index.ts',
          content: Buffer.from(`
            import { z } from 'zod';
            const schema = z.object({ name: z.string() });
            console.log(schema.parse({ name: 'test' }));
          `),
        },
      ]);
 
      // Build and run
      const result = await sandbox.execute({
        command: ['npx', 'tsx', 'index.ts'],
      });
 
      // execute() exposes output as a stream URL, not as captured text
      return { output: result.stdoutStreamUrl };
    } finally {
      // Always clean up
      await sandbox.destroy();
    }
  },
});
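
The try/finally cleanup above can be factored into a reusable wrapper. This is a sketch under the assumption that a sandbox exposes an async destroy(), as in the example; `Destroyable` and `withSandbox` are hypothetical names.

```typescript
// Hypothetical minimal shape: anything with an async destroy() qualifies.
interface Destroyable {
  destroy(): Promise<void>;
}

// Create a sandbox, hand it to `use`, and destroy it even if `use` throws.
async function withSandbox<S extends Destroyable, T>(
  create: () => Promise<S>,
  use: (sandbox: S) => Promise<T>,
): Promise<T> {
  const sandbox = await create();
  try {
    return await use(sandbox);
  } finally {
    // Runs on both the success and the failure path
    await sandbox.destroy();
  }
}
```

Inside a handler this would be called as `withSandbox(() => ctx.sandbox.create({ ... }), async (sandbox) => { ... })`, which keeps the cleanup in one place.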

Writing Files

Write files to the sandbox workspace before or during execution:

await sandbox.writeFiles([
  { path: 'src/main.py', content: Buffer.from('print("Hello")') },
  { path: 'config.json', content: Buffer.from('{"debug": true}') },
]);

Exposing Ports

Expose a port from the sandbox to make it accessible via a public URL:

const sandbox = await ctx.sandbox.create({
  network: {
    enabled: true,
    port: 3000,  // Expose port 3000 (valid range: 1024-65535)
  },
  resources: { memory: '512Mi' },
});
 
// Start a web server inside the sandbox
await sandbox.execute({ command: ['npm', 'run', 'serve'] });
 
// Get the public URL
const info = await ctx.sandbox.get(sandbox.id);
if (info.url) {
  ctx.logger.info('Server accessible at', { url: info.url, port: info.networkPort });
}
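
Because only ports 1024-65535 are valid, it can help to check the value before calling create(). The helper below is a hypothetical client-side guard, not an SDK function.

```typescript
// Range taken from the comment in the example above.
const MIN_SANDBOX_PORT = 1024;
const MAX_SANDBOX_PORT = 65535;

// Hypothetical guard: returns the port unchanged or throws if it is out of range.
function assertSandboxPort(port: number): number {
  if (
    !Number.isInteger(port) ||
    port < MIN_SANDBOX_PORT ||
    port > MAX_SANDBOX_PORT
  ) {
    throw new RangeError(
      `Sandbox port must be an integer from ${MIN_SANDBOX_PORT} to ${MAX_SANDBOX_PORT}, got ${port}`,
    );
  }
  return port;
}
```

It can be used inline, for example `network: { enabled: true, port: assertSandboxPort(3000) }`.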

Project Association

Associate sandboxes with a project for organization and filtering:

const sandbox = await ctx.sandbox.create({
  projectId: 'proj_abc123',  // Associate with project
  resources: { memory: '512Mi' },
});
 
// List sandboxes by project
const { sandboxes } = await ctx.sandbox.list({
  projectId: 'proj_abc123',
});

Reading Files

Read files from the sandbox as streams:

const stream = await sandbox.readFile('output/results.json');
const reader = stream.getReader();
const chunks: Uint8Array[] = [];
 
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  chunks.push(value);
}
 
const content = new TextDecoder().decode(Buffer.concat(chunks));
const data = JSON.parse(content);
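
The read loop above can be wrapped in a small helper. This sketch assumes readFile() resolves to a web-standard ReadableStream of Uint8Array chunks, as the getReader() usage suggests; `readStreamToString` is a hypothetical name.

```typescript
// Read a byte stream to the end and decode it as UTF-8.
async function readStreamToString(
  stream: ReadableStream<Uint8Array>,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = '';
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters that span two chunks intact
      text += decoder.decode(value, { stream: true });
    }
    // Flush any bytes still buffered in the decoder
    text += decoder.decode();
  } finally {
    reader.releaseLock();
  }
  return text;
}
```

With it, the example reduces to `JSON.parse(await readStreamToString(await sandbox.readFile('output/results.json')))`.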

Streaming Output

Access stdout and stderr as streams for real-time output:

const sandbox = await ctx.sandbox.create();
 
// Start a long-running command
await sandbox.execute({ command: ['npm', 'run', 'build'] });
 
// Stream stdout with error handling
try {
  const reader = sandbox.stdout.getReader();
  const decoder = new TextDecoder();
 
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
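    // Each read yields an arbitrary chunk, not a line; stream: true keeps
    // multi-byte characters that span two chunks intact
    ctx.logger.info('Build output', { chunk: decoder.decode(value, { stream: true }) });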
  }
} catch (error) {
  ctx.logger.error('Failed to read stream', { error });
}
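
Stream chunks do not align with line boundaries, so logging one entry per chunk can split a line in two. A minimal sketch of a line splitter follows; `createLineSplitter` is a hypothetical helper that buffers partial lines between chunks.

```typescript
// Hypothetical helper: turn arbitrary text chunks into complete lines.
function createLineSplitter(onLine: (line: string) => void) {
  let pending = '';
  return {
    // Feed one decoded chunk; emits every line that the chunk completes.
    push(chunk: string): void {
      pending += chunk;
      const lines = pending.split(/\r?\n/);
      // The last element is an unfinished line ('' if the chunk ended on a newline)
      pending = lines.pop() ?? '';
      for (const line of lines) onLine(line);
    },
    // Call once the stream is done to emit a final line that has no newline.
    flush(): void {
      if (pending !== '') onLine(pending);
      pending = '';
    },
  };
}
```

In the loop above, each decoded chunk would go to `splitter.push(...)`, with `splitter.flush()` called after the loop ends.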

Creating from Snapshot

Start sandboxes from pre-configured snapshots for faster cold starts:

const sandbox = await ctx.sandbox.create({
  snapshot: 'node-project-base',  // Use tag or snapshot ID
  resources: { memory: '512Mi' },
});
 
// Sandbox already has node_modules and dependencies installed
await sandbox.execute({ command: ['npm', 'run', 'build'] });

Environment Variables

Pass environment variables to sandboxes:

const result = await ctx.sandbox.run({
  command: { exec: ['node', '-e', 'console.log(process.env.API_KEY)'] },
  env: {
    API_KEY: 'secret-key',
    NODE_ENV: 'test',
    DEBUG: 'true',
  },
  resources: { memory: '256Mi' },
});

Automatic Environment Variables

Every sandbox automatically receives environment variables that provide context about its runtime environment, letting your code access the sandbox ID, public URL, and more:

| Variable | Description |
| --- | --- |
| AGENTUITY_SANDBOX_ID | The sandbox's unique identifier |
| AGENTUITY_SANDBOX_RUNTIME | Runtime name (e.g., python, bun) |
| AGENTUITY_SANDBOX_RUNTIME_ID | Runtime unique identifier |
| AGENTUITY_SANDBOX_ORG_ID | Organization ID |
| AGENTUITY_SANDBOX_PROJECT_ID | Project ID (only when project context exists) |
| AGENTUITY_SANDBOX_URL | Public URL (only when network is enabled) |

const result = await ctx.sandbox.run({
  command: {
    exec: ['node', '-e', `
      console.log('Sandbox:', process.env.AGENTUITY_SANDBOX_ID);
      console.log('Runtime:', process.env.AGENTUITY_SANDBOX_RUNTIME);
      if (process.env.AGENTUITY_SANDBOX_URL) {
        console.log('URL:', process.env.AGENTUITY_SANDBOX_URL);
      }
    `],
  },
  network: { enabled: true },
  resources: { memory: '256Mi' },
});

Note that the script passed to exec runs inside the isolated sandbox, not in the agent process, so ctx.logger is not available there. Use standard output (console.log, print, etc.) instead.
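
Code running inside the sandbox can gather these variables into one typed object. The sketch below assumes the four unconditional variables from the table are always set; `SandboxContext` and `readSandboxContext` are hypothetical names.

```typescript
// Hypothetical shape built from the variables listed in the table above.
interface SandboxContext {
  sandboxId: string;
  runtime: string;
  runtimeId: string;
  orgId: string;
  projectId?: string; // only when project context exists
  url?: string; // only when network is enabled
}

function readSandboxContext(
  env: Record<string, string | undefined>,
): SandboxContext {
  // Assumption: these variables are present in every sandbox.
  const required = (name: string): string => {
    const value = env[name];
    if (!value) {
      throw new Error(`Missing ${name}; is this code running inside a sandbox?`);
    }
    return value;
  };
  return {
    sandboxId: required('AGENTUITY_SANDBOX_ID'),
    runtime: required('AGENTUITY_SANDBOX_RUNTIME'),
    runtimeId: required('AGENTUITY_SANDBOX_RUNTIME_ID'),
    orgId: required('AGENTUITY_SANDBOX_ORG_ID'),
    projectId: env['AGENTUITY_SANDBOX_PROJECT_ID'] || undefined,
    url: env['AGENTUITY_SANDBOX_URL'] || undefined,
  };
}
```

In a Node or Bun script inside the sandbox, call it as `readSandboxContext(process.env)`.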

Cancelling Execution

Use an AbortSignal to cancel long-running commands:

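const controller = new AbortController();

// Abort after 5 seconds; keep the timer handle so it can be cleared
const timer = setTimeout(() => controller.abort(), 5000);

try {
  const result = await sandbox.execute({
    command: ['npm', 'run', 'long-task'],
    signal: controller.signal,
  });
  // execute() exposes output as stream URLs (see the Execution reference)
  return { output: result.stdoutStreamUrl, cancelled: false };
} catch (error) {
  // `error` is `unknown` in strict TypeScript, so narrow before reading `name`
  if (error instanceof Error && error.name === 'AbortError') {
    ctx.logger.warn('Execution cancelled');
    return { output: '', cancelled: true };
  }
  throw error;
} finally {
  // Avoid a stray abort() after the command has already finished
  clearTimeout(timer);
}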

Using in Routes

Routes access sandbox through c.var.sandbox:

import { createRouter } from '@agentuity/runtime';
 
const router = createRouter();
 
router.post('/execute', async (c) => {
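  const { language, code } = await c.req.json();

  // The request body is untrusted: check the types before building a command
  if (typeof language !== 'string' || typeof code !== 'string') {
    return c.json({ error: 'language and code must be strings' }, 400);
  }

  const commands: Record<string, string[]> = {
    python: ['python3', '-c', code],
    javascript: ['node', '-e', code],
    typescript: ['bun', '-e', code],
  };

  // Own-property check, so inherited keys such as "constructor" are rejected
  if (!Object.hasOwn(commands, language)) {
    return c.json({ error: 'Unsupported language' }, 400);
  }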
 
  const result = await c.var.sandbox.run({
    command: { exec: commands[language] },
    timeout: { execution: '10s' },
    resources: { memory: '128Mi', cpu: '250m' },
  });
 
  return c.json({
    success: result.exitCode === 0,
    output: result.stdout || result.stderr,
    exitCode: result.exitCode,
    durationMs: result.durationMs,
  });
});
 
export default router;

Sandbox Management

Listing Sandboxes

const { sandboxes, total } = await ctx.sandbox.list({
  status: 'idle',      // Filter by status
  projectId: 'proj_x', // Filter by project
  snapshotId: 'snp_y', // Filter by snapshot
  limit: 10,
  offset: 0,
});
 
for (const info of sandboxes) {
  ctx.logger.info('Sandbox', {
    id: info.sandboxId,
    status: info.status,
    executions: info.executions,
  });
}
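
To walk every result rather than a single page, limit and offset can be advanced in a loop. This sketch assumes total is the count of all matching sandboxes and offset is an item offset; `Page` and `listAllSandboxes` are hypothetical names.

```typescript
// Hypothetical page shape mirroring the { sandboxes, total } result above.
interface Page<T> {
  sandboxes: T[];
  total: number;
}

// Fetch page after page until every matching item has been collected.
async function listAllSandboxes<T>(
  listPage: (params: { limit: number; offset: number }) => Promise<Page<T>>,
  pageSize = 50,
): Promise<T[]> {
  const all: T[] = [];
  while (true) {
    const { sandboxes, total } = await listPage({
      limit: pageSize,
      offset: all.length,
    });
    all.push(...sandboxes);
    // Stop on the last page, or on an empty page to avoid looping forever
    if (sandboxes.length === 0 || all.length >= total) break;
  }
  return all;
}
```

A call might look like `listAllSandboxes((page) => ctx.sandbox.list({ status: 'idle', ...page }))`.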

Getting Sandbox Info

const info = await ctx.sandbox.get('sbx_abc123');
ctx.logger.info('Sandbox details', {
  status: info.status,
  createdAt: info.createdAt,
  snapshotId: info.snapshotId,
});

Destroying Sandboxes

// Via sandbox instance
await sandbox.destroy();
 
// Via service (by ID)
await ctx.sandbox.destroy('sbx_abc123');

Configuration Reference

SandboxCreateOptions

| Option | Type | Description |
| --- | --- | --- |
| runtime | string | Runtime environment: 'bun:1', 'python:3.14' |
| name | string | Optional sandbox name |
| description | string | Optional description |
| resources.memory | string | Memory limit: '256Mi', '1Gi' |
| resources.cpu | string | CPU in millicores: '500m', '1000m' |
| resources.disk | string | Disk limit: '512Mi', '2Gi' |
| network.enabled | boolean | Enable outbound network (default: false) |
| network.port | number | Port to expose to internet (1024-65535) |
| projectId | string | Associate sandbox with a project |
| timeout.idle | string | Auto-destroy after idle: '10m', '1h' |
| timeout.execution | string | Max command duration: '30s', '5m' |
| dependencies | string[] | Apt packages: ['python3', 'git'] |
| snapshot | string | Snapshot ID or tag to restore from |
| env | Record<string, string> | Environment variables |
| metadata | Record<string, unknown> | User-defined metadata for tracking |
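
Timeout values are strings such as '30s', '5m', or '1h'. When the same limit is needed in milliseconds on the client (for example to size an AbortController timer), a small parser helps. This sketch assumes only the s, m, and h suffixes seen in the examples; the platform may accept other formats, and `parseDuration` is a hypothetical helper.

```typescript
// Milliseconds per unit, for the suffixes that appear in the examples above.
const UNIT_MS: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000 };

// Convert '30s', '5m', or '1h' to milliseconds; throw on anything else.
function parseDuration(value: string): number {
  const match = /^(\d+)(s|m|h)$/.exec(value);
  if (!match) {
    throw new Error(`Unsupported duration: "${value}"`);
  }
  return Number(match[1]) * UNIT_MS[match[2]];
}
```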

ExecuteOptions

| Option | Type | Description |
| --- | --- | --- |
| command | string[] | Command and arguments |
| files | FileToWrite[] | Files to create before execution |
| timeout | string | Override execution timeout |
| signal | AbortSignal | Cancel the execution |

Execution

Returned by sandbox.execute():

| Field | Type | Description |
| --- | --- | --- |
| executionId | string | Unique execution ID for debugging |
| status | string | 'queued', 'running', 'completed', 'failed', 'timeout', 'cancelled' |
| exitCode | number | Process exit code (when completed) |
| durationMs | number | Execution duration in milliseconds |
| stdoutStreamUrl | string | URL to fetch stdout stream |
| stderrStreamUrl | string | URL to fetch stderr stream |
| cpuTimeMs | number | CPU time consumed in milliseconds |
| memoryByteSec | number | Memory usage in byte-seconds |
| networkEgressBytes | number | Outbound network traffic in bytes |
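
Since execute() reports output as URLs rather than text, the text has to be fetched separately. The sketch below assumes the stream URLs can be read with a plain GET and that the response ends once the execution has finished; neither is confirmed here. `ExecutionLike`, `isFinished`, and `fetchExecutionOutput` are hypothetical names.

```typescript
type ExecutionStatus =
  | 'queued' | 'running' | 'completed' | 'failed' | 'timeout' | 'cancelled';

// Hypothetical subset of the Execution fields listed above.
interface ExecutionLike {
  status: ExecutionStatus;
  stdoutStreamUrl?: string;
  stderrStreamUrl?: string;
}

// Assumption: every status other than 'queued' and 'running' is final.
function isFinished(status: ExecutionStatus): boolean {
  return status !== 'queued' && status !== 'running';
}

// Fetch both streams as text. `fetchFn` is injectable so it can be stubbed.
async function fetchExecutionOutput(
  execution: ExecutionLike,
  fetchFn: (url: string) => Promise<Response> = fetch,
): Promise<{ stdout: string; stderr: string }> {
  const read = async (url?: string): Promise<string> => {
    if (!url) return '';
    const response = await fetchFn(url);
    if (!response.ok) {
      throw new Error(`Failed to fetch ${url}: ${response.status}`);
    }
    return response.text();
  };
  const [stdout, stderr] = await Promise.all([
    read(execution.stdoutStreamUrl),
    read(execution.stderrStreamUrl),
  ]);
  return { stdout, stderr };
}
```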

SandboxRunResult

| Field | Type | Description |
| --- | --- | --- |
| sandboxId | string | Sandbox ID (for debugging) |
| exitCode | number | Process exit code |
| durationMs | number | Execution duration |
| stdout | string | Captured stdout (if available) |
| stderr | string | Captured stderr (if available) |
| cpuTimeMs | number | CPU time consumed in milliseconds |
| memoryByteSec | number | Memory usage in byte-seconds |
| networkEgressBytes | number | Outbound network traffic in bytes |

SandboxInfo

Returned by ctx.sandbox.get() and in list results:

| Field | Type | Description |
| --- | --- | --- |
| sandboxId | string | Unique sandbox identifier |
| status | SandboxStatus | 'creating', 'idle', 'running', 'terminated', 'failed', 'deleted' |
| createdAt | string | ISO timestamp |
| snapshotId | string | Source snapshot (if created from snapshot) |
| networkPort | number | Port exposed from sandbox (if configured) |
| url | string | Public URL (when port is configured) |
| user | SandboxUserInfo | User who created the sandbox |
| agent | SandboxAgentInfo | Agent that created the sandbox |
| project | SandboxProjectInfo | Associated project |
| org | SandboxOrgInfo | Organization (always present) |

Access context information from sandbox info:

const info = await ctx.sandbox.get('sbx_abc123');
 
// Organization is always present
ctx.logger.info('Organization', { id: info.org.id, name: info.org.name });
 
// User info (when created by a user)
if (info.user) {
  ctx.logger.info('Created by', {
    userId: info.user.id,
    name: `${info.user.firstName} ${info.user.lastName}`,
  });
}
 
// Agent info (when created by an agent)
if (info.agent) {
  ctx.logger.info('Agent', { id: info.agent.id, name: info.agent.name });
}
 
// Project info (when associated with a project)
if (info.project) {
  ctx.logger.info('Project', { id: info.project.id, name: info.project.name });
}

Snapshot Management API

Manage snapshots programmatically through ctx.sandbox.snapshot. This lets agents create, list, and manage snapshots without CLI access.

Creating Snapshots

Save the current state of a sandbox as a snapshot:

const sandbox = await ctx.sandbox.create({
  network: { enabled: true },
  resources: { memory: '1Gi' },
});
 
// Set up the environment
await sandbox.execute({ command: ['npm', 'init', '-y'] });
await sandbox.execute({ command: ['npm', 'install', 'typescript', 'zod'] });
 
// Save as snapshot
const snapshot = await ctx.sandbox.snapshot.create(sandbox.id, {
  name: 'typescript-zod-env',
  description: 'TypeScript environment with Zod validation',
  tag: 'latest',
  public: false,  // Keep private to your org
});
 
ctx.logger.info('Snapshot created', {
  snapshotId: snapshot.snapshotId,
  sizeBytes: snapshot.sizeBytes,
  fileCount: snapshot.fileCount,
});
 
await sandbox.destroy();

Listing Snapshots

const { snapshots, total } = await ctx.sandbox.snapshot.list({
  sandboxId: 'sbx_abc123',  // Filter by source sandbox
  limit: 50,
  offset: 0,
});
 
for (const snap of snapshots) {
  ctx.logger.info('Snapshot', {
    id: snap.snapshotId,
    name: snap.name,
    tag: snap.tag,
    createdAt: snap.createdAt,
  });
}

Getting Snapshot Details

const snapshot = await ctx.sandbox.snapshot.get('snp_xyz789');
 
ctx.logger.info('Snapshot details', {
  name: snapshot.name,
  sizeBytes: snapshot.sizeBytes,
  fileCount: snapshot.fileCount,
  files: snapshot.files,  // Array of file info
});

Tagging Snapshots

Update or remove a snapshot's tag:

// Add or update tag
await ctx.sandbox.snapshot.tag('snp_xyz789', 'v1.0');
 
// Point "latest" to a new snapshot
await ctx.sandbox.snapshot.tag('snp_newversion', 'latest');
 
// Remove tag
await ctx.sandbox.snapshot.tag('snp_xyz789', null);

Deleting Snapshots

await ctx.sandbox.snapshot.delete('snp_xyz789');

SnapshotCreateOptions

| Option | Type | Description |
| --- | --- | --- |
| name | string | Display name (URL-safe: letters, numbers, underscores, dashes) |
| description | string | Description of the snapshot |
| tag | string | Tag name (defaults to "latest") |
| public | boolean | Make snapshot publicly accessible (default: false) |
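
Snapshot names must be URL-safe, so a name derived from free text needs checking first. The sketch below encodes the rule from the table (letters, numbers, underscores, dashes); `isValidSnapshotName` and `toSnapshotName` are hypothetical helpers, and any additional server-side rules such as a length limit are not covered.

```typescript
// Letters, numbers, underscores, and dashes, per the name option above.
const SNAPSHOT_NAME_PATTERN = /^[A-Za-z0-9_-]+$/;

function isValidSnapshotName(name: string): boolean {
  return SNAPSHOT_NAME_PATTERN.test(name);
}

// Replace each run of disallowed characters with one dash and trim stray dashes.
function toSnapshotName(label: string): string {
  const name = label
    .trim()
    .replace(/[^A-Za-z0-9_-]+/g, '-')
    .replace(/^-+|-+$/g, '');
  if (!isValidSnapshotName(name)) {
    throw new Error(`Cannot derive a snapshot name from "${label}"`);
  }
  return name;
}
```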

SnapshotInfo

Returned by create, get, and list operations:

| Field | Type | Description |
| --- | --- | --- |
| snapshotId | string | Unique identifier |
| name | string | Display name |
| tag | string \| null | Current tag |
| sizeBytes | number | Total size in bytes |
| fileCount | number | Number of files |
| files | SnapshotFileInfo[] | File details (path, size, mime type) |
| createdAt | string | ISO timestamp |
| downloadUrl | string | URL to download snapshot archive |
| public | boolean | Whether publicly accessible |

Best Practices

  • Set resource limits: Control memory and CPU usage for predictable performance
  • Use timeouts: Always set execution timeouts for untrusted code
  • Enable network when needed: Required for package installation, API calls, and external requests
  • Clean up interactive sandboxes: Use try/finally to ensure destroy() is called
  • Use snapshots for common environments: Pre-install dependencies to reduce cold start time
  • Tag important snapshots: Use semantic versioning tags (v1.0, latest) for reproducibility
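
One way to make the first two practices hard to forget is to apply default limits to every run() call. This is a sketch only: `RunOptionsLike` and `withDefaultLimits` are hypothetical names, and the default values are simply the ones used in the first example on this page.

```typescript
// Hypothetical subset of the run() options used on this page.
interface RunOptionsLike {
  command: { exec: string[] };
  resources?: { memory?: string; cpu?: string };
  timeout?: { execution?: string };
}

// Defaults copied from the one-shot example; tune them for your workload.
const DEFAULT_LIMITS = {
  resources: { memory: '256Mi', cpu: '500m' },
  timeout: { execution: '30s' },
};

// Fill in any limit the caller left unset; explicit values always win.
function withDefaultLimits<T extends RunOptionsLike>(options: T) {
  return {
    ...options,
    resources: { ...DEFAULT_LIMITS.resources, ...options.resources },
    timeout: { ...DEFAULT_LIMITS.timeout, ...options.timeout },
  };
}
```

A call such as `ctx.sandbox.run(withDefaultLimits({ command: { exec: ['python3', '-c', code] } }))` then always carries memory, CPU, and execution limits.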

Next Steps

  • Snapshots: CLI commands and declarative snapshot definitions
  • CLI Commands: Debug sandboxes from the terminal