Using Vector DB
Using the Vector DB for search and retrieval
Vector storage enables semantic search for your agents, allowing them to find information by meaning rather than by keywords. It is well suited to knowledge bases, RAG systems, and persistent agent memory.
Understanding Vector Storage
Vector storage works by converting text into high-dimensional numerical representations (embeddings) that capture semantic meaning. When you search, the system finds documents with similar meanings rather than just keyword matches.
Key use cases:
- Knowledge bases and documentation search
- Long-term memory across agent sessions
- RAG systems combining retrieval with AI generation
- Semantic similarity search
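To make "similar meanings" concrete, here is a minimal sketch of how similarity between embeddings can be scored. It assumes cosine similarity and uses hand-made three-dimensional vectors purely for illustration; real embeddings have far more dimensions, and the metric the service uses internally is not stated in this guide.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made up for this example, not produced by a real model).
embeddings = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "Steps to recover account access": [0.8, 0.3, 0.1],
    "Quarterly revenue report": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the query "forgot my login".
query_vector = [0.85, 0.2, 0.05]

# Rank documents from most to least similar to the query.
ranked = sorted(
    embeddings.items(),
    key=lambda item: cosine_similarity(query_vector, item[1]),
    reverse=True,
)
for text, vector in ranked:
    print(f"{cosine_similarity(query_vector, vector):.3f}  {text}")
```

The two account-related sentences score close to each other and well above the revenue report, even though they share no keywords with the query.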
Managing Vector Instances
Viewing Vector Storage in the Cloud Console
Navigate to Services > Vector in the Agentuity Cloud Console to view all your vector storage instances. The interface shows:
- Database Name: The identifier for your vector storage
- Projects: Which projects are using this storage
- Agents: Which agents have access
- Size: Storage utilization
You can filter instances by name using the search box and create new vector storage instances with the Create Storage button.
Creating Vector Storage
You can create vector storage either through the Cloud Console or programmatically in your agent code.
Via Cloud Console
Navigate to Services > Vector and click Create Storage. Choose a descriptive name that reflects the storage's purpose (e.g., knowledge-base, agent-memory, product-catalog).
Via SDK
Vector storage is created automatically when your agent first calls context.vector.upsert() with an instance name.
Vector Storage API
For complete API documentation, see the Vector API reference for your SDK.
Upserting Documents
The upsert operation inserts new documents or updates existing ones. You can provide either text (which is automatically converted to embeddings) or pre-computed embeddings.
SDK Requirements:
- Both SDKs: Require a key field for each document
Idempotent Behavior: The upsert operation is idempotent - upserting with an existing key updates the existing vector rather than creating a duplicate. The same internal vector ID is reused, ensuring your vector storage remains clean and efficient.
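The key-based behavior described above can be illustrated with a small in-memory stand-in. This is a sketch only: InMemoryVectorStore is a hypothetical class, not the Agentuity SDK, and the document field name (document) is an assumption made for the example. It shows an instance being created on first upsert and an existing key being updated in place with the same internal ID.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryVectorStore:
    """Hypothetical stand-in for a vector service; NOT the Agentuity SDK."""

    instances: dict = field(default_factory=dict)
    _next_id: int = 1

    def upsert(self, name: str, documents: list) -> list:
        # The named instance is created on first use.
        instance = self.instances.setdefault(name, {})
        ids = []
        for doc in documents:
            key = doc["key"]  # every document must carry a key
            existing = instance.get(key)
            if existing is not None:
                vector_id = existing["id"]  # same key: reuse the internal ID
            else:
                vector_id = f"vec_{self._next_id}"
                self._next_id += 1
            instance[key] = {**doc, "id": vector_id}
            ids.append(vector_id)
        return ids


store = InMemoryVectorStore()
first = store.upsert(
    "knowledge-base",
    [{"key": "doc-1", "document": "Reset your password from Settings."}],
)
second = store.upsert(
    "knowledge-base",
    [{"key": "doc-1", "document": "Reset your password from Settings > Security."}],
)
print(first, second, len(store.instances["knowledge-base"]))
```

Upserting the same key twice leaves a single stored entry whose text reflects the latest call.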
Searching Vector Storage
Search operations find semantically similar documents based on a text query. You can control the number of results, similarity threshold, and filter by metadata.
Search Parameters:
- query (required): Text query to search for
- limit (optional): Maximum number of results to return
- similarity (optional): Minimum similarity threshold (0.0-1.0)
- metadata (optional): Filter results by metadata key-value pairs
Search Results:
- Both SDKs: Return results with a similarity field (1.0 = perfect match, 0.0 = no match)
- Note: The JavaScript SDK also returns a distance field for backward compatibility; prefer similarity
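As a rough mental model of how these parameters interact, the sketch below applies them to candidates that already carry a similarity score. The function apply_search_params is hypothetical and runs locally; whether the service treats the threshold as inclusive, and how it orders ties, are assumptions. Metadata filters are modeled as exact-match AND conditions.

```python
from typing import Optional


def apply_search_params(
    candidates: list,
    limit: Optional[int] = None,
    similarity: float = 0.0,
    metadata: Optional[dict] = None,
) -> list:
    """Filter pre-scored candidates by threshold and metadata, then rank and cap."""
    filters = metadata or {}
    matches = [
        c
        for c in candidates
        if c["similarity"] >= similarity  # assumed inclusive threshold
        and all((c.get("metadata") or {}).get(k) == v for k, v in filters.items())
    ]
    matches.sort(key=lambda c: c["similarity"], reverse=True)
    return matches if limit is None else matches[:limit]


candidates = [
    {"key": "a", "similarity": 0.91, "metadata": {"category": "shoes", "inStock": True}},
    {"key": "b", "similarity": 0.72, "metadata": {"category": "shoes", "inStock": False}},
    {"key": "c", "similarity": 0.88, "metadata": {"category": "hats", "inStock": True}},
    {"key": "d", "similarity": 0.40, "metadata": {"category": "shoes", "inStock": True}},
]

shoes = apply_search_params(
    candidates, limit=5, similarity=0.7, metadata={"category": "shoes"}
)
print([r["key"] for r in shoes])
```

Adding a second metadata entry narrows the results further, because every key-value pair must match.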
Deleting Vectors
Remove specific vectors from storage using their keys.
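Conceptually, deletion is a lookup by key. The sketch below models it against a plain dictionary; delete_keys is a hypothetical helper, and the SDK's own delete call and return value may differ.

```python
def delete_keys(instance: dict, *keys: str) -> int:
    """Remove the given keys from an in-memory instance; return how many existed."""
    removed = 0
    for key in keys:
        if instance.pop(key, None) is not None:
            removed += 1
    return removed


instance = {
    "doc-1": {"document": "Reset your password from Settings."},
    "doc-2": {"document": "Contact support by email."},
}
removed_count = delete_keys(instance, "doc-1", "missing-key")
print(removed_count, sorted(instance))
```

Keys that do not exist are skipped rather than treated as errors in this sketch.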
Practical Examples
Building a Simple RAG System
This example demonstrates a complete Retrieval-Augmented Generation (RAG) pattern - searching for relevant context and using it to generate informed responses.
// JavaScript/TypeScript - Simple RAG implementation
import { AgentHandler } from '@agentuity/sdk';

const handler: AgentHandler = async (request, response, context) => {
  const { question } = await request.data.json();

  try {
    // 1. Search for relevant context (top 5 results)
    const searchResults = await context.vector.search('knowledge-base', {
      query: question,
      limit: 5,
      similarity: 0.7
    });

    // 2. Handle no results gracefully
    if (searchResults.length === 0) {
      return response.json({
        answer: "I couldn't find relevant information to answer your question.",
        sources: []
      });
    }

    // 3. Assemble context from search results (defensive handling)
    const contextTexts = searchResults.map(result =>
      result.metadata?.content ?? result.metadata?.text ?? ''
    );
    const assembledContext = contextTexts.join('\n\n');

    // 4. Generate response using context (example with AI Gateway)
    const prompt = `Answer the question based on the following context:

Context: ${assembledContext}

Question: ${question}

Answer:`;

    // Use your preferred LLM here (OpenAI, Anthropic, etc.).
    // generateAnswer is a placeholder that you implement yourself.
    const llmResponse = await generateAnswer(prompt);

    // 5. Return answer with sources
    return response.json({
      answer: llmResponse,
      sources: searchResults.map(r => ({
        id: r.id,
        key: r.key,
        title: r.metadata?.title,
        similarity: r.similarity
      }))
    });
  } catch (error) {
    context.logger.error('RAG query failed:', error);
    return response.json({
      error: 'Failed to process your question',
      // error is not guaranteed to be an Error instance
      details: error instanceof Error ? error.message : String(error)
    });
  }
};

export default handler;
# Python - Simple RAG implementation
from agentuity import AgentRequest, AgentResponse, AgentContext

async def run(request: AgentRequest, response: AgentResponse, context: AgentContext):
    data = await request.data.json()
    question = data.get("question")

    try:
        # 1. Search for relevant context (top 5 results)
        search_results = await context.vector.search(
            "knowledge-base",
            query=question,
            limit=5,
            similarity=0.7
        )

        # 2. Handle no results gracefully
        if not search_results:
            return response.json({
                "answer": "I couldn't find relevant information to answer your question.",
                "sources": []
            })

        # 3. Assemble context from search results
        context_texts = [
            result.metadata.get("content", result.metadata.get("text", ""))
            for result in search_results
        ]
        assembled_context = "\n\n".join(context_texts)

        # 4. Generate response using context (example with AI Gateway)
        prompt = f"""Answer the question based on the following context:

Context: {assembled_context}

Question: {question}

Answer:"""

        # Use your preferred LLM here (OpenAI, Anthropic, etc.)
        llm_response = await generate_answer(prompt)

        # 5. Return answer with sources
        return response.json({
            "answer": llm_response,
            "sources": [
                {
                    "id": result.id,
                    "key": result.key,
                    "title": result.metadata.get("title"),
                    "similarity": result.similarity
                }
                for result in search_results
            ]
        })
    except Exception as e:
        context.logger.error(f"RAG query failed: {e}")
        return response.json({
            "error": "Failed to process your question",
            "details": str(e)
        })
Key Points:
- Semantic search finds relevant documents based on meaning, not keywords
- Similarity threshold of 0.7 balances relevance with recall
- Context assembly combines multiple sources for comprehensive answers
- Error handling ensures graceful failures with helpful messages
- Source attribution provides transparency about where information came from
Semantic Search with Metadata Filtering
This example shows how to combine semantic similarity with metadata filters for precise results - like finding products that match both meaning and business criteria.
// JavaScript/TypeScript - Product search with filters
import { AgentHandler } from '@agentuity/sdk';

const handler: AgentHandler = async (request, response, context) => {
  const { query, maxPrice, category, inStock } = await request.data.json();

  try {
    // Build metadata filters based on criteria
    // (typed loosely so properties can be added under TypeScript)
    const metadataFilters: Record<string, any> = {};
    if (category) metadataFilters.category = category;
    if (inStock !== undefined) metadataFilters.inStock = inStock;

    // Search with semantic similarity + metadata filters
    const searchResults = await context.vector.search('products', {
      query,
      limit: 10,
      similarity: 0.65, // Lower threshold for broader results
      metadata: metadataFilters
    });

    // Post-process: Apply price filter and sort by relevance
    const filteredResults = searchResults
      .filter(result => !maxPrice || result.metadata?.price <= maxPrice)
      .map(result => ({
        ...result.metadata,
        similarity: result.similarity
      }))
      .sort((a, b) => b.similarity - a.similarity);

    return response.json({
      query,
      filters: { maxPrice, category, inStock },
      resultCount: filteredResults.length,
      products: filteredResults.slice(0, 5) // Top 5 results
    });
  } catch (error) {
    context.logger.error('Product search failed:', error);
    return response.json({
      error: 'Search failed',
      products: []
    });
  }
};

export default handler;
# Python - Product search with filters
from agentuity import AgentRequest, AgentResponse, AgentContext

async def run(request: AgentRequest, response: AgentResponse, context: AgentContext):
    data = await request.data.json()
    query = data.get("query")
    max_price = data.get("maxPrice")
    category = data.get("category")
    in_stock = data.get("inStock")

    try:
        # Build metadata filters based on criteria
        metadata_filters = {}
        if category:
            metadata_filters["category"] = category
        if in_stock is not None:
            metadata_filters["inStock"] = in_stock

        # Search with semantic similarity + metadata filters
        search_results = await context.vector.search(
            "products",
            query=query,
            limit=10,
            similarity=0.65,  # Lower threshold for broader results
            metadata=metadata_filters
        )

        # Post-process: Apply price filter and sort by relevance
        filtered_results = []
        for result in search_results:
            # Apply price filter
            if max_price and result.metadata.get("price", 0) > max_price:
                continue
            product = dict(result.metadata)
            product["similarity"] = result.similarity
            filtered_results.append(product)

        # Sort by similarity score
        filtered_results.sort(key=lambda x: x["similarity"], reverse=True)

        return response.json({
            "query": query,
            "filters": {"maxPrice": max_price, "category": category, "inStock": in_stock},
            "resultCount": len(filtered_results),
            "products": filtered_results[:5]  # Top 5 results
        })
    except Exception as e:
        context.logger.error(f"Product search failed: {e}")
        return response.json({
            "error": "Search failed",
            "products": []
        })
Key Techniques:
- Metadata filters are applied at the vector search level for efficiency
- Post-processing handles filters that can't be done at search time (like price ranges)
- Lower similarity threshold (0.65) catches more potential matches when using strict filters
Common Pitfalls & Solutions
Empty Search Results
Problem: Your search returns empty results even though relevant data exists.
Solutions:
- Lower the similarity threshold: Start at 0.5 and increase gradually
- Check your metadata filters: They use exact matching, not fuzzy matching
- Verify document format: Ensure documents were upserted with text content
// Adaptive threshold example
let results = await context.vector.search('kb', {
  query,
  similarity: 0.8
});

if (results.length === 0) {
  // Try again with lower threshold
  results = await context.vector.search('kb', {
    query,
    similarity: 0.5
  });
}
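The same retry idea extends to a ladder of thresholds. The Python sketch below is one possible shape, not an SDK feature: search_with_fallback is a hypothetical helper that takes any callable mapping a threshold to a list of results, so it can be tried locally with a fake search before wiring in a real one.

```python
from typing import Callable, Optional, Sequence, Tuple


def search_with_fallback(
    search: Callable[[float], list],
    thresholds: Sequence[float] = (0.8, 0.65, 0.5),
) -> Tuple[list, Optional[float]]:
    """Try each threshold from strictest to loosest; stop at the first hit."""
    for threshold in thresholds:
        results = search(threshold)
        if results:
            return results, threshold
    return [], None  # nothing matched even at the loosest threshold


# Fake search for demonstration: keeps scores at or above the threshold.
scores = [0.60, 0.55]


def fake_search(threshold: float) -> list:
    return [s for s in scores if s >= threshold]


results, used_threshold = search_with_fallback(fake_search)
print(results, used_threshold)
```

Returning the threshold that produced the hits makes it easy to log how often the strict setting comes back empty.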
Duplicate Documents
Problem: Same content appears multiple times in search results.
Solution: Vector upsert is idempotent when using the same key:
- Always use consistent key values for your documents
- Upserting with an existing key updates the vector rather than creating a duplicate
- The same internal vector ID is reused, keeping your storage clean
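One way to keep keys consistent is to derive them from something stable about the document, such as its source path, rather than generating a random ID on each run. The helper below is a sketch; make_key and its key format are illustrative choices, not an SDK convention.

```python
import hashlib


def make_key(source: str, chunk_index: int = 0) -> str:
    """Derive a stable key from a document's source and chunk position."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
    return f"{digest}-{chunk_index}"


key_first_run = make_key("docs/faq.md", 0)
key_second_run = make_key("docs/faq.md", 0)
key_next_chunk = make_key("docs/faq.md", 1)
print(key_first_run == key_second_run, key_first_run != key_next_chunk)
```

Because the key depends only on the source and chunk index, re-ingesting the same file updates the existing vectors instead of adding duplicates.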
Performance Issues
Problem: Vector operations take too long.
Solutions:
- Batch operations: Upsert 100-500 documents at once, not one by one
- Limit search results: Use limit: 10 instead of retrieving all matches
- Optimize metadata: Keep metadata objects small and focused
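Batching can be handled with a small generator that slices any iterable of documents into fixed-size groups. This is a sketch: batched is a local helper, the default size of 200 is an arbitrary value inside the 100-500 range suggested above, and the upsert call shape in the closing comment is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator


def batched(items: Iterable[dict], size: int = 200) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


documents = [{"key": f"doc-{i}", "document": f"Document {i}"} for i in range(450)]
batch_sizes = [len(batch) for batch in batched(documents, size=200)]
print(batch_sizes)

# Inside an agent, each batch would go to a single upsert call, for example
# (assumed call shape): await context.vector.upsert("knowledge-base", batch)
```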
Irrelevant Search Results
Problem: Search returns irrelevant or unexpected documents.
Solutions:
- Check similarity scores: Results with similarity < 0.7 may be poor matches
- Review metadata filters: Remember they're AND conditions, not OR
- Verify embeddings: Ensure consistent text preprocessing before upserting
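Consistent preprocessing means the text you upsert and the text you query with go through the same cleanup. The sketch below shows one conservative option, Unicode normalization plus whitespace collapsing; normalize_text is a hypothetical helper, and which steps are appropriate depends on your content.

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Apply the same light cleanup to documents and queries."""
    text = unicodedata.normalize("NFKC", text)  # fold compatibility characters
    text = re.sub(r"\s+", " ", text)  # collapse runs of whitespace
    return text.strip()


raw = "  Reset\u00a0your   password\n\nfrom the Settings page.  "
cleaned = normalize_text(raw)
print(repr(cleaned))
```

The function deliberately leaves case and punctuation alone, since those can carry meaning for an embedding model.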
Best Practices
Document Structure
- Include context in documents: Store enough context so documents are meaningful when retrieved
- Use descriptive metadata: Include relevant metadata for filtering and identification
- Consistent formatting: Use consistent document formatting for better embeddings
Search Optimization
- Adjust similarity thresholds: Start with 0.7 and adjust based on result quality
- Use metadata filtering: Combine semantic search with metadata filters for precise results
- Limit result sets: Use appropriate limits to balance performance and relevance
Performance Considerations
- Batch upsert operations: Use bulk upsert instead of individual calls
- Monitor storage usage: Track vector storage size in the Cloud Console
- Consider document chunking: Break large documents into smaller, focused chunks
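Chunking can be as simple as a sliding window over words with some overlap, so that a sentence split at a boundary still appears whole in one chunk. The sketch below is one basic approach; chunk_words and its default sizes are illustrative, and splitting on headings or sentences may suit your documents better.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list:
    """Split text into word windows of `chunk_size` that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
sample_chunks = chunk_words(sample, chunk_size=4, overlap=1)
for chunk in sample_chunks:
    print(chunk)
```

Each chunk can then be upserted as its own document, with metadata pointing back to the source and chunk position.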
Integration with Agent Memory
Vector storage serves as long-term memory for agents, enabling them to:
- Remember past conversations and context across sessions
- Access organizational knowledge bases
- Retrieve relevant examples for few-shot learning
- Build and maintain agent-specific knowledge repositories
For more information on memory patterns, see the Key-Value Storage guide for short-term memory or explore Agent Communication for sharing knowledge between agents.