Coding agents in sandboxes

Run coding-agent runtimes in isolated sandboxes when you need direct control over the tool process.

Use a coding-agent sandbox when your app needs to run a specific agent CLI or server inside an isolated workspace that you control. Start with Coder when you want Agentuity to manage the session lifecycle, repo context, event history, reconnects, and human review.

npm install @agentuity/sandbox @agentuity/telemetry

Choose a Path

| Path | Use It When | What You Manage |
| --- | --- | --- |
| SandboxClient.run() | One bounded command, such as checking a runtime, validating generated code, or running a short CLI task | command, files, timeout, output parsing |
| SandboxClient.create() | Multi-step work with files, network, background jobs, snapshots, or a long-lived agent server | lifecycle, cleanup, readiness checks, credentials |
| Coder session | Repo-aware coding work that needs session history, skills, output contracts, reconnects, or human review | task prompt, workspace/repo choice, expected output paths |
| Coding-agent runtime | You need direct process control over a specific agent CLI or server image | provider config, permissions, agent events, failure detection |

A runtime appearing in the catalog proves only that the tool image is available. Model access, provider credentials, network policy, and tool permissions are separate prerequisites that you must verify on their own.

Verify a Runtime in the Catalog

List runtimes before hard-coding one into a workflow. Runtime names are catalog data for your org, not a TypeScript enum or a guaranteed product roster.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const runtimeName = 'opencode:latest';
const client = new SandboxClient();
const { runtimes } = await client.listRuntimes({ limit: 50 });
const runtime = runtimes.find((item) => item.name === runtimeName);
 
if (!runtime) {
  throw new Error(`${runtimeName} is not available in this org's runtime catalog.`);
}
 
logger.info('runtime available', {
  name: runtime.name,
  tags: runtime.tags,
  requirements: runtime.requirements,
});

Expected output excerpt:

{
  "name": "opencode:latest"
}

Run this catalog check for the runtime you intend to use. The examples below assume the catalog returned opencode:latest. If your org does not return that runtime, stop and pick a listed runtime or ask an org admin whether that image should be enabled.
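
When a workflow can run on more than one runtime, the lookup above generalizes to a preference list. The following is a minimal sketch rather than part of the SDK: pickRuntime and the RuntimeEntry type are hypothetical helpers, and example-agent:latest is a made-up runtime name used only for illustration. In practice the catalog array would come from client.listRuntimes().

```typescript
// Minimal shape of a catalog entry; only `name` is used here (assumption).
type RuntimeEntry = { name: string };

// Return the first preferred runtime name that the catalog contains,
// or undefined when none of the preferred names is listed.
function pickRuntime(
  catalog: readonly RuntimeEntry[],
  preferred: readonly string[],
): string | undefined {
  const available = new Set(catalog.map((item) => item.name));
  return preferred.find((name) => available.has(name));
}

// Hand-written sample data standing in for a real catalog response.
const catalog: RuntimeEntry[] = [
  { name: 'example-agent:latest' }, // hypothetical runtime name
  { name: 'opencode:latest' },
];

const chosen = pickRuntime(catalog, ['opencode:latest', 'example-agent:latest']);
console.log(chosen); // prints "opencode:latest"
```

Ordering the preference list explicitly keeps the fallback decision in your code instead of depending on the order in which the catalog returns entries.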

Verify OpenCode Is Installed

Use SandboxClient.run() for a fast runtime check before you build a longer workflow around a coding-agent image.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'opencode:latest',
  command: {
    exec: ['sh', '-lc', 'pwd && which opencode && opencode --version'],
  },
  timeout: { execution: '30s' },
});
 
logger.info('opencode runtime check', {
  exitCode: result.exitCode,
  stdout: result.stdout,
  stderr: result.stderr,
});
 
if (result.exitCode !== 0) {
  process.exitCode = 1;
}

Expected output excerpt:

<working-directory>
<path-to-opencode>
<version>

Do not assume the writable directory or binary path across runtimes. Use pwd, which, and relative paths unless the runtime catalog or your own validation proves a fixed path.
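
To reuse the discovered paths in later steps, the check's stdout can be turned into a small record. This is a sketch under one assumption: stdout holds the three values in the order printed by pwd, which, and the version command, one per line, with no extra shell output before them. parseRuntimeCheck and RuntimeCheck are hypothetical names, and the sample string contains made-up values.

```typescript
type RuntimeCheck = {
  workingDirectory: string;
  binaryPath: string;
  version: string;
};

// Parse the first three non-empty lines of the check output.
// Returns undefined when fewer than three lines are present.
function parseRuntimeCheck(stdout: string): RuntimeCheck | undefined {
  const lines = stdout
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const [workingDirectory, binaryPath, version] = lines;
  if (
    workingDirectory === undefined ||
    binaryPath === undefined ||
    version === undefined
  ) {
    return undefined;
  }

  return { workingDirectory, binaryPath, version };
}

// Made-up sample output; real values depend on the runtime image.
const sample = '/home/agent\n/usr/local/bin/opencode\n1.2.3\n';
console.log(parseRuntimeCheck(sample));
```

If a runtime prints a login banner or a multi-line version string, this positional parsing will misread the output, so validate the result before relying on it.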

Run Agent CLIs with Explicit Failure Checks

Coding-agent CLIs can emit structured error events even when the process exits successfully. When you run a headless agent command, parse the event stream and fail the workflow if any event reports an error.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
// Minimal event shape: only the fields this check needs.
type AgentEvent = {
  type: string;
  error: unknown; // undefined when the event carries no error payload
};
 
function parseAgentEvent(line: string): AgentEvent | undefined {
  try {
    const parsed: unknown = JSON.parse(line);
    if (
      typeof parsed === 'object' &&
      parsed !== null &&
      'type' in parsed &&
      typeof parsed.type === 'string'
    ) {
      return {
        type: parsed.type,
        error: 'error' in parsed ? parsed.error : undefined,
      };
    }
  } catch {
    return undefined;
  }
 
  return undefined;
}
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'opencode:latest',
  command: {
    exec: [
      'sh',
      '-lc',
      [
        'mkdir -p project',
        'cd project',
        'opencode run --format json "Inspect this empty project and return one sentence."',
      ].join(' && '),
    ],
  },
  timeout: { execution: '2m' },
});
 
const stdout = result.stdout ?? '';
 
const events = stdout
  .split('\n')
  .map((line) => line.trim())
  .filter((line) => line.startsWith('{'))
  .flatMap((line) => {
    const event = parseAgentEvent(line);
    return event ? [event] : [];
  });
 
const errorEvent = events.find((event) => event.type === 'error');
if (errorEvent) {
  logger.error('coding agent reported an error', { errorEvent });
  process.exitCode = 1;
}
 
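logger.info('coding agent command finished', {
  exitCode: result.exitCode,
  eventCount: events.length,
});
 
// A non-zero exit is also a failure, independent of the event stream.
if (result.exitCode !== 0) {
  process.exitCode = 1;
}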
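
The exit code and the event stream can be folded into a single verdict so callers handle one result instead of several flags. The sketch below is illustrative only: evaluateRun and RunVerdict are hypothetical names, and treating an empty event list as a failure is an assumption about what "no answer" looks like, since the exact event types a given agent CLI emits are not specified here.

```typescript
type AgentEvent = { type: string; error: unknown };

type RunVerdict =
  | { ok: true }
  | {
      ok: false;
      reason: 'nonzero-exit' | 'error-event' | 'no-events';
      detail?: unknown;
    };

// Combine process exit status and parsed events into one result.
function evaluateRun(
  exitCode: number,
  events: readonly AgentEvent[],
): RunVerdict {
  // 1. The process itself failed.
  if (exitCode !== 0) {
    return { ok: false, reason: 'nonzero-exit', detail: exitCode };
  }

  // 2. The process exited 0 but the agent reported an error event.
  const errorEvent = events.find((event) => event.type === 'error');
  if (errorEvent) {
    return { ok: false, reason: 'error-event', detail: errorEvent.error };
  }

  // 3. Assumption: a successful run emits at least one event.
  if (events.length === 0) {
    return { ok: false, reason: 'no-events' };
  }

  return { ok: true };
}

// Example: exit code 0, but the stream contained an error event.
const verdict = evaluateRun(0, [{ type: 'error', error: 'provider auth failed' }]);
console.log(verdict.ok); // prints "false"
```

Checking the exit code first keeps a crashed process from being misreported as an agent-level error when its partial output happens to contain parseable events.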

Long-Lived Agent Servers

Use SandboxClient.create() when the coding agent exposes a server or needs multiple requests against the same workspace.

The workflow has this shape:

  1. Create a sandbox with the agent runtime.
  2. Enable network access and expose the server port.
  3. Start the agent server as a background job or long-running command.
  4. Poll the health endpoint before sending prompts.
  5. Store the sandbox ID and server URL in your app state.
  6. Destroy the sandbox when the session ends, or pause it when you want to resume later.
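
Step 4 is the part most often skipped, so here is a minimal sketch of a readiness poll. It does not use the sandbox SDK at all: waitForHealthy, HealthProbe, and httpProbe are hypothetical helpers, the health URL is whatever your agent server exposes, and counting any 2xx response as healthy is an assumption. The probe is injectable so the retry logic can be exercised without a running server.

```typescript
// A probe resolves to true when the server is ready to accept prompts.
type HealthProbe = (url: string) => Promise<boolean>;

// Default probe: a GET request where any 2xx status counts as healthy.
const httpProbe: HealthProbe = async (url) => {
  try {
    const response = await fetch(url);
    return response.ok;
  } catch {
    // Connection errors are expected while the server is still starting.
    return false;
  }
};

// Poll until the probe succeeds or the attempt budget is used up.
async function waitForHealthy(
  url: string,
  options: { attempts: number; delayMs: number; probe?: HealthProbe },
): Promise<boolean> {
  const probe = options.probe ?? httpProbe;

  for (let attempt = 1; attempt <= options.attempts; attempt += 1) {
    if (await probe(url)) {
      return true;
    }
    // Wait between attempts, but not after the final one.
    if (attempt < options.attempts) {
      await new Promise((resolve) => setTimeout(resolve, options.delayMs));
    }
  }

  return false;
}

// Usage (placeholder URL; substitute the server URL and health path you exposed):
//   const ready = await waitForHealthy('<server-url>/<health-path>', {
//     attempts: 30,
//     delayMs: 1000,
//   });
//   if (!ready) { /* destroy the sandbox and report the failure */ }
```

A bounded attempt count matters here: if the server never becomes healthy, the caller gets a clear false and can clean up the sandbox instead of waiting indefinitely.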

This pattern fits assistant-style workflows where your app keeps a conversation open against one workspace. Use Coder instead when you want Agentuity to manage the shared session, event history, reconnects, and review surface.

If This Fails

| Symptom | Likely Cause | What to Check |
| --- | --- | --- |
| Runtime name is missing | The org runtime catalog does not include that image | Run client.listRuntimes({ limit: 50 }) and choose a returned name |
| mkdir /workspace fails | The runtime does not make /workspace writable | Use a relative directory or ~/project |
| CLI exits 0 but no answer appears | First-run setup or agent event output did not include a final answer | Inspect stdout/stderr and parse JSON events |
| JSON event has type: "error" | Provider, network, auth, or tool permission failure | Log the event payload and fix the provider/runtime config |
| Workflow needs repo history, review, or reconnect | Plain sandbox state is the wrong abstraction | Start a Coder session with a repo/workspace and output contract |

See Also