Agentuity Documentation

Use Sandbox when your app owns the command and wants a bounded result: write files, run an external coding tool, inspect stdout/stderr, and let the one-shot sandbox disappear after the command exits.

Use Coder instead when the work should be a managed session with Hub history, replay, skills, agents, reconnects, and human review.

npm install @agentuity/sandbox @agentuity/telemetry

The Pattern

This example runs Biome as an external coding tool. It writes a tiny TypeScript project into a one-shot sandbox, lets Biome rewrite the source, runs bun test, and prints a diff plus final source between result markers.

The script has three moving pieces:

command.files seeds the workspace before the command starts.
the shell command runs the tool and verification command.
result markers make the useful part of stdout easy for your app to extract.

typescriptscripts/sandbox-biome.ts

import { Buffer } from 'node:buffer';
import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const BIOME_VERSION = '2.4.16';
 
const command = [
  'set -eu',
  'printf "__WORKSPACE__\\n"',
  'pwd',
  'printf "__FILES_BEFORE__\\n"',
  'find . -maxdepth 2 -type f | sort',
  'cp src/math.ts /tmp/math.before.ts',
  `bunx --bun @biomejs/biome@${BIOME_VERSION} check --write src/math.ts src/math.test.ts`,
  'bun test',
  'printf "\\n__AGENTUITY_RESULT_START__\\n"',
  'printf "diff src/math.ts before/after:\\n"',
  'if diff -u /tmp/math.before.ts src/math.ts; then printf "no source changes\\n"; fi',
  'printf "\\nfinal src/math.ts:\\n"',
  'sed -n "1,120p" src/math.ts',
  'printf "\\n__AGENTUITY_RESULT_END__\\n"',
].join('\n');
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'bun:1',
  network: { enabled: true },
  timeout: { execution: '5m' },
  command: {
    exec: ['sh', '-lc', command],
    files: [
      {
        path: 'package.json',
        content: Buffer.from(
          JSON.stringify(
            {
              type: 'module',
              scripts: { test: 'bun test' },
              devDependencies: { '@types/bun': 'latest' },
            },
            null,
            2
          )
        ),
      },
      {
        path: 'biome.json',
        content: Buffer.from(
          JSON.stringify(
            {
              formatter: { indentStyle: 'tab' },
              javascript: {
                formatter: {
                  quoteStyle: 'single',
                  semicolons: 'always',
                },
              },
            },
            null,
            2
          )
        ),
      },
      {
        path: 'src/math.ts',
        content: Buffer.from(
          [
            'export function total(values: readonly number[]): number{',
            'return values.reduce((sum,value)=>sum+value,0)',
            '}',
            '',
          ].join('\n')
        ),
      },
      {
        path: 'src/math.test.ts',
        content: Buffer.from(
          [
            "import { expect, test } from 'bun:test';",
            "import { total } from './math.ts';",
            '',
            "test('totals values', () => {",
            '  expect(total([2, 3, 5])).toBe(10);',
            '});',
            '',
          ].join('\n')
        ),
      },
    ],
  },
});
 
const hasResultMarkers = result.stdout?.includes('__AGENTUITY_RESULT_START__') === true;
 
logger.info('sandbox coding tool result', {
  sandboxId: result.sandboxId,
  exitCode: result.exitCode,
  hasResultMarkers,
  stdout: result.stdout,
  stderr: result.stderr,
});
 
if (result.exitCode !== 0 || !hasResultMarkers) {
  throw new Error('Sandbox coding-tool workflow did not produce the expected result markers.');
}

Expected output excerpt. Exact tool timings and runtime versions can vary.

Checked 2 files in 6ms. Fixed 2 files.
bun test v1.3.14
 
src/math.test.ts:
(pass) totals values
 
 1 pass
 0 fail
 
__AGENTUITY_RESULT_START__
diff src/math.ts before/after:
--- /tmp/math.before.ts
+++ src/math.ts
@@ -1,3 +1,3 @@
-export function total(values: readonly number[]): number{
-return values.reduce((sum,value)=>sum+value,0)
+export function total(values: readonly number[]): number {
+	return values.reduce((sum, value) => sum + value, 0);
 }
__AGENTUITY_RESULT_END__

client.run() creates a one-shot sandbox, captures stdout/stderr, and destroys the sandbox after the command exits. Keep the sandboxId for logs and debugging, not for follow-up file reads.

Check a Coding-Agent Runtime First

If the external tool is a coding-agent CLI, validate the runtime and binary before sending real work. Runtime availability proves the image exists. Provider credentials and model access are separate checks.

import { SandboxClient } from '@agentuity/sandbox';
import { logger } from '@agentuity/telemetry';
 
const client = new SandboxClient();
 
const result = await client.run({
  runtime: 'opencode:latest',
  network: { enabled: true },
  timeout: { execution: '2m' },
  command: {
    exec: [
      'sh',
      '-lc',
      [
        'pwd',
        'which opencode',
        'opencode --version',
        'opencode run --help | sed -n "1,80p"',
      ].join(' && '),
    ],
  },
});
 
logger.info('opencode runtime check', {
  exitCode: result.exitCode,
  stdout: result.stdout,
  stderr: result.stderr,
});

When you run a headless coding-agent command, parse its event output and fail on structured error events. A process exit code alone is not always enough to prove success.

When to switch to `create()`

Use client.create() instead of client.run() when the workflow needs:

multiple commands against the same filesystem
a background server or agent daemon
file reads after the command finishes
pause/resume or checkpoints
snapshots for reuse

Wrap interactive sandboxes in try/finally and call sandbox.destroy() when the workflow is done.

Key Points

Use command.files for the files a one-shot command needs before it starts.
Set network.enabled: true when the tool needs to download packages or call a provider.
Print explicit result markers when stdout contains installer logs or tool chatter.
Check exitCode, stdout/stderr, and the app-level contract you asked the tool to satisfy.
Use Coder when the task needs session state, replay, skills, or human review.

Run External Coding Tools in Sandbox

The Pattern

Check a Coding-Agent Runtime First

When to switch to `create()`

Key Points

See Also

The Pattern

Check a Coding-Agent Runtime First

When to switch to create()

Key Points

See Also

When to switch to `create()`