Agent Streaming
How to use streaming in your agents
Streaming lets your users start reading the response before the AI finishes thinking. Nothing feels faster than output that is already happening.
Why Streaming?
- Latency hiding by showing results instantly instead of after the whole response is ready.
- Large inputs and outputs without hitting payload limits.
- Agent chains can forward chunks to the next agent as soon as they arrive.
- Snappier UX so users see progress in milliseconds instead of waiting for the full payload.
- Resource efficiency by not holding entire responses in memory; chunks flow straight through.
- Composable pipelines by allowing agents, functions, and external services to hand off work in a continuous stream.
A simple visualization of the difference between traditional request/response and streaming:
```
┌─────────────────────────── traditional request/response ───────────────────────────────────┐
| client waiting ...          ██████████████████████████████████████ full payload display    |
└────────────────────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────── streaming request/response ─────────────────────────────────────┐
| c l i e n t   r e a d s   c h u n k s   1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 …          |
└────────────────────────────────────────────────────────────────────────────────────────────┘
```
Real-World Use Cases
- Live chat / customer support. Stream the assistant's words as they are generated for a more natural feel.
- Speech-to-text. Pipe microphone audio into a transcription agent and forward captions to the UI in real time.
- Streaming search results. Show the first relevant hits immediately while the rest are still processing.
- Agent chains. One agent can translate, the next can summarize, the third can analyze – all in a single flowing stream.
How Streaming Works in Agentuity
- Outbound: `resp.stream(source)`, where `source` can be:
  - An async iterator (e.g. an OpenAI SDK stream)
  - A `ReadableStream`
  - Another agent's stream
- Inbound: `await request.data.stream()` consumes the client's incoming stream.
- Under the hood, Agentuity handles the details of the streaming input and output for you.
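The outbound contract is easy to reason about without the SDK: any async iterator already has the right shape for a stream source. A minimal, SDK-free sketch of the producer/consumer pattern (the names `producer` and `consume` are illustrative, not part of the Agentuity API):

```typescript
// A chunk producer: any async iterator works as a stream source.
async function* producer(): AsyncGenerator<string> {
  for (const word of ["streams", "arrive", "chunk", "by", "chunk"]) {
    yield word + " ";
  }
}

// A consumer that handles chunks as they arrive instead of
// buffering the whole payload -- the same shape resp.stream() accepts.
async function consume(source: AsyncIterable<string>): Promise<string> {
  let seen = "";
  for await (const chunk of source) {
    seen += chunk; // in an agent, each chunk would be forwarded immediately
  }
  return seen;
}

consume(producer()).then((text) => console.log(text.trim()));
```

Because `for await` pulls one chunk at a time, the consumer starts working on the first chunk while later chunks are still being produced.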
OpenAI Streaming Example
In this example, we stream the model's response from the OpenAI API back to the caller, using the Vercel AI SDK in TypeScript and the OpenAI SDK in Python.
TypeScript:

```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  const { textStream } = streamText({
    model: openai("gpt-4o"),
    prompt: "Invent a new holiday and describe its traditions.",
  });
  return resp.stream(textStream);
}
```
Python:

```python
from openai import OpenAI
from agentuity import AgentRequest, AgentResponse, AgentContext

client = OpenAI()

async def run(request: AgentRequest, response: AgentResponse, context: AgentContext):
    chat_completion = client.chat.completions.create(
        messages=[
            {"role": "system", "content": "You are a friendly assistant!"},
            {"role": "user", "content": request.data.text or "Why is the sky blue?"},
        ],
        model="gpt-4o",
        stream=True,
    )
    return response.stream(chat_completion, lambda chunk: chunk.choices[0].delta.content)
```
Structured Object Streaming with Vercel AI SDK
The `stream` method supports transformer functions that can filter and transform stream items. This is particularly useful when working with structured data from AI SDKs, such as the Vercel AI SDK's `streamObject`.
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";
import { openai } from '@ai-sdk/openai';
import { streamObject } from 'ai';
import { z } from 'zod';

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  const { elementStream } = streamObject({
    model: openai('gpt-4o'),
    output: 'array',
    schema: z.object({
      name: z.string(),
      class: z
        .string()
        .describe('Character class, e.g. warrior, mage, or thief.'),
      description: z.string(),
    }),
    prompt: 'Generate 3 hero descriptions for a fantasy role playing game.',
  });
  return resp.stream(elementStream);
}
```
The SDK automatically detects object streams and converts them to JSON newline format with the appropriate `application/json` content type.
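Newline-delimited JSON is a simple framing to reproduce by hand: each object is serialized on its own line so the client can parse objects as lines arrive. A sketch of the general format (the SDK's exact wire output is an assumption here; `toJsonLines` and `parseJsonLines` are illustrative helpers):

```typescript
// Serialize a sequence of objects into newline-delimited JSON.
function toJsonLines(items: object[]): string {
  return items.map((item) => JSON.stringify(item) + "\n").join("");
}

// A client can parse each completed line without waiting for the rest.
function parseJsonLines(payload: string): object[] {
  return payload
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

const wire = toJsonLines([
  { name: "Aria", class: "mage" },
  { name: "Bram", class: "warrior" },
]);
console.log(parseJsonLines(wire).length); // 2
```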
Stream Transformers
You can provide transformer functions to filter and transform stream data:
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  // Get a stream from another source
  const dataStream = getDataStream();

  // Transform and filter items
  const transformer = (item: any) => {
    // Filter out items (return null/undefined to skip)
    if (!item.active) return null;

    // Transform the item
    return {
      id: item.id,
      name: item.name.toUpperCase(),
      timestamp: Date.now(),
    };
  };

  return resp.stream(dataStream, undefined, {}, transformer);
}
```
You can also use generator functions for more complex transformations:
```typescript
// Generator transformer that can yield multiple items or filter
function* transformer(item: any) {
  if (item.type === 'batch') {
    // Yield multiple items from a batch
    for (const subItem of item.items) {
      yield { ...subItem, processed: true };
    }
  } else if (item.valid) {
    // Yield single transformed item
    yield { ...item, enhanced: true };
  }
  // Return nothing to filter out invalid items
}
```
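To see the expand-and-filter behavior concretely, you can drive a generator transformer like this by hand over sample items (the sample item shapes are illustrative):

```typescript
// Same expand/filter shape as the generator transformer above.
function* transformer(item: any) {
  if (item.type === "batch") {
    for (const subItem of item.items) {
      yield { ...subItem, processed: true }; // one batch expands to many items
    }
  } else if (item.valid) {
    yield { ...item, enhanced: true }; // single item passes through, transformed
  }
  // yielding nothing drops the item from the stream
}

const input = [
  { type: "batch", items: [{ id: 1 }, { id: 2 }] }, // expands to two items
  { id: 3, valid: true },                           // passes through
  { id: 4, valid: false },                          // filtered out
];

const output = input.flatMap((item) => [...transformer(item)]);
console.log(output.length); // 3
```

One input item can therefore become zero, one, or many output chunks, which is what makes generator transformers more flexible than a plain mapping function.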
Agent-to-Agent Streaming
In this example, we use the Agentuity SDK to stream the response from one agent to another.
```typescript
import type { AgentRequest, AgentResponse, AgentContext } from "@agentuity/sdk";

export default async function Agent(
  req: AgentRequest,
  resp: AgentResponse,
  ctx: AgentContext,
) {
  // [1] Call another agent
  const expert = await ctx.getAgent({ name: "HistoryExpert" });
  const expertResp = await expert.run({ prompt: "What engine did a P-51D Mustang use?" });

  // [2] Grab its stream
  const stream = await expertResp.data.stream();

  // [3] Pipe straight through
  return resp.stream(stream);
}
```
Chain as many agents as you like; each one can inspect, transform, or just relay the chunks.
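The relay pattern can be sketched with plain async generators, independent of the SDK: each stage consumes the previous stage's chunks and yields its own as soon as they are ready (`sourceAgent` and `upperAgent` are illustrative names, not Agentuity APIs):

```typescript
// Stage 1: a source emitting raw chunks.
async function* sourceAgent(): AsyncGenerator<string> {
  yield "hello";
  yield "world";
}

// Stage 2: a relay that transforms each chunk the moment it arrives.
async function* upperAgent(upstream: AsyncIterable<string>): AsyncGenerator<string> {
  for await (const chunk of upstream) {
    yield chunk.toUpperCase(); // forwarded immediately, no buffering
  }
}

async function main() {
  const chunks: string[] = [];
  // Chaining is just passing one iterator into the next stage.
  for await (const chunk of upperAgent(sourceAgent())) {
    chunks.push(chunk);
  }
  console.log(chunks.join(" ")); // HELLO WORLD
}
main();
```

Because nothing in the chain buffers the full payload, the first transformed chunk is available as soon as the first source chunk is produced.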
Further Reading
- Blog Post: Agents just want to have streams
- SDK Examples: JavaScript · Python
- Streaming Video Demo: Watch on YouTube
Need Help?
Join our Community for assistance or just to hang with other humans building agents.
Send us an email at hi@agentuity.com if you'd like to get in touch.
If you haven't already, please Signup for your free account now and start building your first agent!