This tutorial walks through building a RAG (Retrieval-Augmented Generation) agent that answers questions using your own knowledge base. The agent indexes documents into vector storage, retrieves the closest matches, and cites the sources it used.
What You'll Build
A question-answering agent that:
- Searches a vector database for relevant content
- Uses retrieved documents as context for the LLM
- Returns answers with source citations
- Handles cases where no relevant information is found
Prerequisites
- An Agentuity project (see the Quickstart if you need one)
- Basic familiarity with Vector Storage
Project Structure
src/agent/knowledge/
└── agent.ts # RAG agent logic
src/agent/indexer/
└── agent.ts # Document indexing logic
src/api/
└── index.ts # Query and indexing endpoints
Create the Agent
When a user asks a question, the agent needs to:
- Search the vector database for relevant documents
- Build context from the search results
- Generate an answer using the LLM with that context
- Return the answer with source citations
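The context-building and citation steps depend on one detail worth seeing in isolation: the number given to each document in the context must match its position in the returned sources. The sketch below is framework-free and illustrative only; `RetrievedDoc`, `buildContext`, and `buildSources` are hypothetical names, not part of the Agentuity SDK.

```typescript
// Hypothetical, framework-free shape used for illustration only.
interface RetrievedDoc {
  key: string;
  title: string;
  text: string;
}

// Number each document so the model can cite it as [1], [2], ...
function buildContext(docs: RetrievedDoc[]): string {
  return docs.map((d, i) => `[${i + 1}] ${d.text}`).join('\n\n');
}

// Keep sources in the same order so citation [n] maps to sources[n - 1].
function buildSources(docs: RetrievedDoc[]): { id: string; title: string }[] {
  return docs.map((d) => ({ id: d.key, title: d.title }));
}

const docs: RetrievedDoc[] = [
  { key: 'doc-1', title: 'Getting Started', text: 'Agents are written in TypeScript.' },
  { key: 'doc-2', title: 'Storage Options', text: 'Vector storage supports semantic search.' },
];

const context = buildContext(docs);
const sources = buildSources(docs);
console.log(context);
console.log(sources);
```

The agent's handler performs the same two steps inline on the results returned by ctx.vector.search().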
import { createAgent } from '@agentuity/runtime';
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

interface KnowledgeMetadata extends Record<string, unknown> {
  readonly title: string;
}

const agent = createAgent('Knowledge Agent', {
  description: 'Answers questions using a knowledge base',
  schema: {
    input: z.object({
      question: z.string().describe('The question to answer'),
    }),
    output: z.object({
      answer: z.string(),
      sources: z.array(z.object({
        id: z.string(),
        title: z.string(),
        relevance: z.number(),
      })),
      confidence: z.number().min(0).max(1),
    }),
  },
  handler: async (ctx, input) => {
    ctx.logger.info('Searching knowledge base', { question: input.question });

    // Search for relevant documents
    const results = await ctx.vector.search<KnowledgeMetadata>('knowledge-base', {
      query: input.question,
      limit: 5,
      similarity: 0.7,
    });

    // Handle no results
    if (results.length === 0) {
      ctx.logger.info('No relevant documents found');
      return {
        answer: "I couldn't find relevant information to answer your question.",
        sources: [],
        confidence: 0,
      };
    }

    const documents = await ctx.vector.getMany<KnowledgeMetadata>(
      'knowledge-base',
      ...results.map((result) => result.key)
    );

    // Build context from the matched documents
    const context = results
      .map((r, i) => `[${i + 1}] ${documents.get(r.key)?.document ?? ''}`)
      .join('\n\n');

    ctx.logger.debug('Built context from documents', {
      documentCount: results.length
    });

    // Generate answer with LLM
    const { text } = await generateText({
      model: openai('gpt-5-mini'),
      system: `You are a helpful assistant that answers questions based on provided context.
Only use information from the context. If the context doesn't contain the answer, say so.
Cite sources using [1], [2], etc. when referencing specific information.`,
      prompt: `Context:
${context}
Question: ${input.question}
Answer the question using only the provided context. Cite your sources.`,
    });

    // Calculate confidence from average similarity
    const avgSimilarity = results.reduce((sum, r) => sum + r.similarity, 0) / results.length;

    return {
      answer: text,
      sources: results.map((r, i) => ({
        id: r.key,
        title: r.metadata?.title ?? `Document ${i + 1}`,
        relevance: r.similarity,
      })),
      confidence: avgSimilarity,
    };
  },
});

export default agent;

Add an Indexing Agent
Before you can query your knowledge base, you need to populate it. A separate indexing agent handles this by:
- Accepting an array of documents
- Storing each document in the vector database with metadata
- Returning the count and IDs of indexed documents
import { createAgent } from '@agentuity/runtime';
import { z } from 'zod';

const DocumentSchema = z.object({
  id: z.string(),
  title: z.string(),
  content: z.string(),
  category: z.string().optional(),
});

const agent = createAgent('Document Indexer', {
  description: 'Indexes documents into the knowledge base',
  schema: {
    input: z.object({
      documents: z.array(DocumentSchema),
    }),
    output: z.object({
      indexed: z.number(),
      ids: z.array(z.string()),
    }),
  },
  handler: async (ctx, input) => {
    ctx.logger.info('Indexing documents', { count: input.documents.length });

    const ids: string[] = [];
    for (const doc of input.documents) {
      await ctx.vector.upsert('knowledge-base', {
        key: doc.id,
        document: doc.content,
        metadata: {
          title: doc.title,
          category: doc.category,
          indexedAt: new Date().toISOString(),
        },
      });
      ids.push(doc.id);
    }

    ctx.logger.info('Indexing complete', { indexed: ids.length });

    return {
      indexed: ids.length,
      ids,
    };
  },
});

export default agent;

Create the Route
The route exposes both agents over HTTP. Use agent.validator() on each endpoint to validate the request body against that agent's input schema with full type safety.
import { Hono } from 'hono';
import type { Env } from '@agentuity/runtime';
import knowledgeAgent from '@agent/knowledge/agent';
import indexerAgent from '@agent/indexer/agent';

const router = new Hono<Env>();

// Indexing endpoint - validates using the indexer agent's input schema
router.post('/knowledge/index', indexerAgent.validator(), async (c) => {
  const data = c.req.valid('json');
  const result = await indexerAgent.run(data);
  return c.json(result);
});

// Query endpoint - validates using agent's input schema
router.post('/knowledge', knowledgeAgent.validator(), async (c) => {
  const { question } = c.req.valid('json');
  const result = await knowledgeAgent.run({ question });
  return c.json(result);
});

// Health check
router.get('/health', (c) => c.text('OK'));

export default router;

Test Your Agent
With both agents created, you can test the full flow: index some documents, then query them.
Start the dev server:
agentuity dev

The examples below assume your API router is mounted at /api in app.ts, as shown in the quickstart.
Index some test documents:
curl -X POST http://localhost:3500/api/knowledge/index \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {
        "id": "doc-1",
        "title": "Getting Started",
        "content": "Agentuity is a full-stack platform for building AI agents. You can create agents using TypeScript and deploy them with a single command."
      },
      {
        "id": "doc-2",
        "title": "Storage Options",
        "content": "Agentuity provides three storage options: key-value for simple data, vector for semantic search, and object storage for files."
      }
    ]
  }'

Query the knowledge base:
curl -X POST http://localhost:3500/api/knowledge \
  -H "Content-Type: application/json" \
  -d '{"question": "What storage options does Agentuity provide?"}'

Expected response:
{
  "answer": "Agentuity provides three storage options [2]: key-value storage for simple data, vector storage for semantic search, and object storage for files.",
  "sources": [
    { "id": "doc-2", "title": "Storage Options", "relevance": 0.89 }
  ],
  "confidence": 0.89
}

Frontend
Build a search interface for your knowledge base:
import { useState } from 'react';

interface Source {
  id: string;
  title: string;
  relevance: number;
}

interface KnowledgeResult {
  answer: string;
  sources: Source[];
  confidence: number;
}

export function App() {
  const [question, setQuestion] = useState('');
  const [data, setData] = useState<KnowledgeResult | null>(null);
  const [error, setError] = useState<string | null>(null);
  const [isLoading, setIsLoading] = useState(false);

  const handleSearch = async () => {
    if (!question.trim()) return;
    setIsLoading(true);
    setError(null);
    try {
      const res = await fetch('/api/knowledge', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ question }),
      });
      if (!res.ok) {
        setData(null);
        setError(`Search failed (${res.status})`);
        return;
      }
      setData(await res.json());
    } catch {
      setData(null);
      setError('Network error');
    } finally {
      setIsLoading(false);
    }
  };

  return (
    <div style={{ padding: '2rem', maxWidth: '700px' }}>
      <h1>Knowledge Search</h1>
      <div style={{ display: 'flex', gap: '0.5rem', marginBottom: '1.5rem' }}>
        <input
          type="text"
          value={question}
          onChange={(e) => setQuestion(e.target.value)}
          onKeyDown={(e) => e.key === 'Enter' && handleSearch()}
          placeholder="Ask a question..."
          disabled={isLoading}
          style={{ flex: 1, padding: '0.75rem' }}
        />
        <button
          onClick={handleSearch}
          disabled={isLoading || !question.trim()}
        >
          {isLoading ? 'Searching...' : 'Search'}
        </button>
      </div>
      {error && <p style={{ color: 'red' }}>{error}</p>}
      {data && (
        <div>
          <div style={{ marginBottom: '1rem' }}>
            <p style={{ fontSize: '1.1rem', lineHeight: 1.6 }}>{data.answer}</p>
          </div>
          <div style={{ marginBottom: '1rem', color: '#666' }}>
            Confidence: {Math.round(data.confidence * 100)}%
          </div>
          {data.sources.length > 0 && (
            <div>
              <h3>Sources</h3>
              <ul>
                {data.sources.map((source) => (
                  <li key={source.id}>
                    <strong>{source.title}</strong>
                    <span style={{ color: '#666', marginLeft: '0.5rem' }}>
                      ({Math.round(source.relevance * 100)}% relevant)
                    </span>
                  </li>
                ))}
              </ul>
            </div>
          )}
        </div>
      )}
    </div>
  );
}

Render the app directly:
import { StrictMode } from 'react';
import { createRoot } from 'react-dom/client';
import { App } from './App';

const root = document.getElementById('root');
if (!root) {
  throw new Error('Root element not found');
}

createRoot(root).render(
  <StrictMode>
    <App />
  </StrictMode>
);

Summary
| Concept | Description |
|---|---|
| Vector Search | Find semantically similar documents using ctx.vector.search() |
| Context Building | Format search results into LLM-readable context with citations |
| Similarity Threshold | Filter results by minimum similarity score (e.g., 0.7) |
| Confidence Score | Calculate from average similarity of retrieved documents |
| Indexing Agent | Separate agent to populate the vector database with documents |
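The similarity threshold and confidence score rows can be illustrated without any SDK. The sketch below is a minimal, framework-free version; `Match`, `filterByThreshold`, and `averageConfidence` are hypothetical names, and the inclusive `>=` comparison is an assumption made for illustration. In the tutorial's agent, the threshold is applied by the `similarity` option passed to ctx.vector.search().

```typescript
// Hypothetical shape for a search hit; mirrors the `key` and `similarity`
// fields the tutorial's agent reads from each result.
interface Match {
  key: string;
  similarity: number; // assumed to lie between 0 and 1
}

// Keep only matches at or above the minimum similarity score.
function filterByThreshold(matches: Match[], minSimilarity: number): Match[] {
  return matches.filter((m) => m.similarity >= minSimilarity);
}

// Average similarity of the retained matches, or 0 when nothing matched.
function averageConfidence(matches: Match[]): number {
  if (matches.length === 0) return 0;
  const total = matches.reduce((sum, m) => sum + m.similarity, 0);
  return total / matches.length;
}

const hits: Match[] = [
  { key: 'doc-1', similarity: 0.5 },
  { key: 'doc-2', similarity: 0.75 },
  { key: 'doc-3', similarity: 1 },
];

const relevant = filterByThreshold(hits, 0.7);
const confidence = averageConfidence(relevant);
console.log(relevant.map((m) => m.key)); // doc-2 and doc-3 pass the threshold
console.log(confidence); // 0.875
```

Averaging similarity is a rough proxy: it measures how close the retrieved documents are to the question, not whether the generated answer is correct.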
For RAG apps, use the LLM as a Judge pattern to check whether generated answers are grounded in the retrieved sources.
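Alongside a model-based check, a cheap deterministic check can catch one narrow class of grounding problem: citation markers that point at no returned source. The sketch below is not the LLM as a Judge pattern itself, only a complementary pre-check. `findInvalidCitations` is a hypothetical helper, and it assumes answers cite sources as [1], [2], and so on, in the order of the `sources` array, as the agent's system prompt requests.

```typescript
// Hypothetical helper: returns the citation numbers that appear in the answer
// but do not correspond to any entry in the sources array (1-based).
function findInvalidCitations(answer: string, sourceCount: number): number[] {
  const invalid: number[] = [];
  const pattern = /\[(\d+)\]/g; // matches markers such as [1] or [12]
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const n = Number(match[1]);
    const outOfRange = n < 1 || n > sourceCount;
    if (outOfRange && !invalid.includes(n)) invalid.push(n);
  }
  return invalid.sort((a, b) => a - b);
}

const answer = 'Agentuity provides three storage options [2]. See also [4].';
const invalidCitations = findInvalidCitations(answer, 2);
console.log(invalidCitations); // [4]
```

An empty result only means every marker points at a real source; whether the cited source actually supports the claim still needs a model-based or human review.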
Next Steps
- LLM as a Judge: Review answer quality with model-based checks
- Streaming Responses: Stream longer answers as they are generated
- Vector Storage: Add metadata filters and advanced search options