Using Vector DB

Using the Vector DB for search and retrieval

Vector storage enables semantic search for your agents, allowing them to find information by meaning rather than keywords. Ideal for knowledge bases, RAG systems, and persistent agent memory.

Understanding Vector Storage

Vector storage works by converting text into high-dimensional numerical representations (embeddings) that capture semantic meaning. When you search, the system finds documents with similar meanings rather than just keyword matches.

Key use cases:

  • Knowledge bases and documentation search
  • Long-term memory across agent sessions
  • RAG systems combining retrieval with AI generation
  • Semantic similarity search

Managing Vector Instances

Viewing Vector Storage in the Cloud Console

Navigate to Services > Vector in the Agentuity Cloud Console to view all your vector storage instances. The interface shows:

  • Database Name: The identifier for your vector storage
  • Projects: Which projects are using this storage
  • Agents: Which agents have access
  • Size: Storage utilization

You can filter instances by name using the search box and create new vector storage instances with the Create Storage button.

Creating Vector Storage

You can create vector storage either through the Cloud Console or programmatically in your agent code.

Via Cloud Console

Navigate to Services > Vector and click Create Storage. Choose a descriptive name that reflects the storage purpose (e.g., knowledge-base, agent-memory, product-catalog).

Via SDK

Vector storage is created automatically when your agent first calls context.vector.upsert() with an instance name:
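
A minimal sketch of this pattern is shown below. The instance name knowledge-base and the document fields (key, document, metadata) are illustrative assumptions based on the examples later in this guide; check the SDK reference for the exact parameter names.

// JavaScript/TypeScript - the first upsert creates the instance if it does not exist (sketch)
import { AgentHandler } from '@agentuity/sdk';

const handler: AgentHandler = async (request, response, context) => {
  // 'knowledge-base' is created on first use; no separate create step is required
  await context.vector.upsert('knowledge-base', {
    key: 'doc-001',
    document: 'Agentuity agents can store long-term memory in vector storage.',
    metadata: { title: 'Vector storage intro' }
  });

  return response.json({ status: 'stored' });
};

export default handler;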

Vector Storage API

For complete API documentation, see the JavaScript and Python SDK references.

Upserting Documents

The upsert operation inserts new documents or updates existing ones. You can provide either text (which gets automatically converted to embeddings) or pre-computed embeddings.

SDK Requirements: Both the JavaScript and Python SDKs require a key field for each document.

Idempotent Behavior: The upsert operation is idempotent - upserting with an existing key updates the existing vector rather than creating a duplicate. The same internal vector ID is reused, ensuring your vector storage remains clean and efficient.
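
A sketch of both variants follows. The field names document (raw text) and embeddings (a pre-computed vector) are assumptions; the examples below also copy the text into metadata.content because that is how the RAG example later in this guide reads it back out. Verify the exact fields against the API reference.

// JavaScript/TypeScript - upsert sketch (field names are assumptions; see the API reference)
// Text document: the platform computes the embedding for you
await context.vector.upsert('knowledge-base', {
  key: 'refund-policy',
  document: 'Refunds are available within 30 days of purchase.',
  metadata: { title: 'Refund policy', content: 'Refunds are available within 30 days of purchase.' }
});

// Pre-computed embedding: supply the vector yourself instead of text
await context.vector.upsert('knowledge-base', {
  key: 'refund-policy-embedded',
  embeddings: [0.12, -0.03, 0.57], // normally produced by your own embedding model
  metadata: { title: 'Refund policy (pre-computed)' }
});

// Upserting the same key again updates the existing vector instead of creating a duplicate
await context.vector.upsert('knowledge-base', {
  key: 'refund-policy',
  document: 'Refunds are available within 45 days of purchase.',
  metadata: { title: 'Refund policy (updated)' }
});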

Searching Vector Storage

Search operations find semantically similar documents based on a text query. You can control the number of results, similarity threshold, and filter by metadata.

Search Parameters:

  • query (required): Text query to search for
  • limit (optional): Maximum number of results to return
  • similarity (optional): Minimum similarity threshold (0.0-1.0)
  • metadata (optional): Filter results by metadata key-value pairs

Search Results:

  • Both SDKs: Return results with similarity field (1.0 = perfect match, 0.0 = no match)
  • Note: The JavaScript SDK also returns a distance field for backward compatibility; prefer similarity
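
Putting these parameters together, a minimal search sketch might look like the following; the instance name and metadata filter values are illustrative.

// JavaScript/TypeScript - search sketch using the parameters described above
const results = await context.vector.search('knowledge-base', {
  query: 'how long do I have to request a refund?',
  limit: 3,                           // return at most 3 matches
  similarity: 0.7,                    // drop matches scoring below 0.7
  metadata: { category: 'policies' }  // exact-match metadata filter
});

for (const result of results) {
  // similarity: 1.0 = perfect match, 0.0 = no match
  context.logger.info(`key=${result.key} similarity=${result.similarity}`);
}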

Deleting Vectors

Remove specific vectors from storage using their keys.
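
A short sketch is below; it assumes delete takes the instance name followed by one or more keys, so confirm the exact signature in the API reference.

// JavaScript/TypeScript - delete sketch (exact signature may differ; see the API reference)
// Remove a single vector by key
await context.vector.delete('knowledge-base', 'refund-policy');

// Remove several vectors in one call, assuming multiple keys are accepted
await context.vector.delete('knowledge-base', 'doc-001', 'doc-002');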

Practical Examples

The examples below walk through common patterns; for more code samples, see the JavaScript and Python SDK documentation.

Building a Simple RAG System

This example demonstrates a complete Retrieval-Augmented Generation (RAG) pattern - searching for relevant context and using it to generate informed responses.

// JavaScript/TypeScript - Simple RAG implementation
import { AgentHandler } from '@agentuity/sdk';

const handler: AgentHandler = async (request, response, context) => {
  const { question } = await request.data.json();

  try {
    // 1. Search for relevant context (top 5 results)
    const searchResults = await context.vector.search('knowledge-base', {
      query: question,
      limit: 5,
      similarity: 0.7
    });

    // 2. Handle no results gracefully
    if (searchResults.length === 0) {
      return response.json({
        answer: "I couldn't find relevant information to answer your question.",
        sources: []
      });
    }

    // 3. Assemble context from search results (defensive handling)
    const contextTexts = searchResults.map(result =>
      result.metadata?.content ?? result.metadata?.text ?? ''
    );
    const assembledContext = contextTexts.join('\n\n');

    // 4. Generate response using context (example with AI Gateway)
    const prompt = `Answer the question based on the following context:

Context: ${assembledContext}

Question: ${question}

Answer:`;

    // Use your preferred LLM here (OpenAI, Anthropic, etc.)
    const llmResponse = await generateAnswer(prompt);

    // 5. Return answer with sources
    return response.json({
      answer: llmResponse,
      sources: searchResults.map(r => ({
        id: r.id,
        key: r.key,
        title: r.metadata?.title,
        similarity: r.similarity
      }))
    });

  } catch (error) {
    context.logger.error('RAG query failed:', error);
    return response.json({
      error: 'Failed to process your question',
      details: error instanceof Error ? error.message : String(error)
    });
  }
};

export default handler;
# Python - Simple RAG implementation
from agentuity import AgentRequest, AgentResponse, AgentContext

async def run(request: AgentRequest, response: AgentResponse, context: AgentContext):
  data = await request.data.json()
  question = data.get("question")
  
  try:
      # 1. Search for relevant context (top 5 results)
      search_results = await context.vector.search(
          "knowledge-base",
          query=question,
          limit=5,
          similarity=0.7
      )
      
      # 2. Handle no results gracefully
      if not search_results:
          return response.json({
              "answer": "I couldn't find relevant information to answer your question.",
              "sources": []
          })
      
      # 3. Assemble context from search results
      context_texts = [
          result.metadata.get("content", result.metadata.get("text", ""))
          for result in search_results
      ]
      assembled_context = "\n\n".join(context_texts)
      
      # 4. Generate response using context (example with AI Gateway)
      prompt = f"""Answer the question based on the following context:
      
Context: {assembled_context}

Question: {question}

Answer:"""
      
      # Use your preferred LLM here (OpenAI, Anthropic, etc.)
      llm_response = await generate_answer(prompt)
      
      # 5. Return answer with sources
      return response.json({
          "answer": llm_response,
          "sources": [
              {
                  "id": result.id,
                  "key": result.key,
                  "title": result.metadata.get("title"),
                  "similarity": result.similarity
              }
              for result in search_results
          ]
      })
      
  except Exception as e:
      context.logger.error(f"RAG query failed: {e}")
      return response.json({
          "error": "Failed to process your question",
          "details": str(e)
      })

Key Points:

  • Semantic search finds relevant documents based on meaning, not keywords
  • Similarity threshold of 0.7 balances relevance with recall
  • Context assembly combines multiple sources for comprehensive answers
  • Error handling ensures graceful failures with helpful messages
  • Source attribution provides transparency about where information came from

Semantic Search with Metadata Filtering

This example shows how to combine semantic similarity with metadata filters for precise results - like finding products that match both meaning and business criteria.

// JavaScript/TypeScript - Product search with filters
import { AgentHandler } from '@agentuity/sdk';

const handler: AgentHandler = async (request, response, context) => {
  const { query, maxPrice, category, inStock } = await request.data.json();

  try {
    // Build metadata filters based on criteria
    const metadataFilters: Record<string, any> = {};
    if (category) metadataFilters.category = category;
    if (inStock !== undefined) metadataFilters.inStock = inStock;

    // Search with semantic similarity + metadata filters
    const searchResults = await context.vector.search('products', {
      query,
      limit: 10,
      similarity: 0.65, // Lower threshold for broader results
      metadata: metadataFilters
    });

    // Post-process: Apply price filter and sort by relevance
    const filteredResults = searchResults
      .filter(result => !maxPrice || result.metadata.price <= maxPrice)
      .map(result => ({
        ...result.metadata,
        similarity: result.similarity
      }))
      .sort((a, b) => b.similarity - a.similarity);

    return response.json({
      query,
      filters: { maxPrice, category, inStock },
      resultCount: filteredResults.length,
      products: filteredResults.slice(0, 5) // Top 5 results
    });

  } catch (error) {
    context.logger.error('Product search failed:', error);
    return response.json({
      error: 'Search failed',
      products: []
    });
  }
};

export default handler;
# Python - Product search with filters
from agentuity import AgentRequest, AgentResponse, AgentContext

async def run(request: AgentRequest, response: AgentResponse, context: AgentContext):
  data = await request.data.json()
  query = data.get("query")
  max_price = data.get("maxPrice")
  category = data.get("category")
  in_stock = data.get("inStock")
  
  try:
      # Build metadata filters based on criteria
      metadata_filters = {}
      if category:
          metadata_filters["category"] = category
      if in_stock is not None:
          metadata_filters["inStock"] = in_stock
      
      # Search with semantic similarity + metadata filters
      search_results = await context.vector.search(
          "products",
          query=query,
          limit=10,
          similarity=0.65,  # Lower threshold for broader results
          metadata=metadata_filters
      )
      
      # Post-process: Apply price filter and sort by relevance
      filtered_results = []
      for result in search_results:
          # Apply price filter
          if max_price and result.metadata.get("price", 0) > max_price:
              continue
              
          product = dict(result.metadata)
          product["similarity"] = result.similarity
          filtered_results.append(product)
      
      # Sort by similarity score
      filtered_results.sort(key=lambda x: x["similarity"], reverse=True)
      
      return response.json({
          "query": query,
          "filters": {"maxPrice": max_price, "category": category, "inStock": in_stock},
          "resultCount": len(filtered_results),
          "products": filtered_results[:5]  # Top 5 results
      })
      
  except Exception as e:
      context.logger.error(f"Product search failed: {e}")
      return response.json({
          "error": "Search failed",
          "products": []
      })

Key Techniques:

  • Metadata filters are applied at the vector search level for efficiency
  • Post-processing handles filters that can't be done at search time (like price ranges)
  • Lower similarity threshold (0.65) catches more potential matches when using strict filters

Common Pitfalls & Solutions

Empty Search Results

Problem: Your search returns empty results even though relevant data exists.

Solutions:

  • Lower the similarity threshold: Start at 0.5 and increase gradually
  • Check your metadata filters: They use exact matching, not fuzzy matching
  • Verify document format: Ensure documents were upserted with text content

// Adaptive threshold example
let results = await context.vector.search('kb', { 
  query, 
  similarity: 0.8 
});
 
if (results.length === 0) {
  // Try again with lower threshold
  results = await context.vector.search('kb', { 
    query, 
    similarity: 0.5 
  });
}

Duplicate Documents

Problem: Same content appears multiple times in search results.

Solution: Vector upsert is idempotent when using the same key:

  • Always use consistent key values for your documents (see the key-derivation sketch below)
  • Upserting with an existing key updates the vector rather than creating a duplicate
  • The same internal vector ID is reused, keeping your storage clean
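
One simple way to keep keys consistent is to derive them from a stable source identifier, as in this sketch; the vectorKey helper and chunk suffix are illustrative, and the document field follows the assumed upsert shape used earlier in this guide.

// JavaScript/TypeScript - deterministic keys keep re-ingestion idempotent (sketch)
function vectorKey(sourceId: string, chunkIndex = 0): string {
  // Re-running ingestion yields the same key, so upsert updates rather than duplicates
  return `${sourceId}-chunk-${chunkIndex}`;
}

await context.vector.upsert('knowledge-base', {
  key: vectorKey('refund-policy'),
  document: 'Refunds are available within 30 days of purchase.',
  metadata: { sourceId: 'refund-policy' }
});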

Performance Issues

Problem: Vector operations take too long.

Solutions:

  • Batch operations: Upsert 100-500 documents at once, not one by one (see the batching sketch below)
  • Limit search results: Use limit: 10 instead of retrieving all matches
  • Optimize metadata: Keep metadata objects small and focused
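
A batching sketch follows. Here documents is assumed to be an array of prepared { key, document, metadata } objects, and the call assumes upsert accepts multiple documents at once; check the API reference for the exact batch signature.

// JavaScript/TypeScript - batched upsert sketch (batch signature is an assumption)
const BATCH_SIZE = 200; // within the 100-500 range suggested above

for (let i = 0; i < documents.length; i += BATCH_SIZE) {
  const batch = documents.slice(i, i + BATCH_SIZE);
  // One call per batch instead of one per document
  await context.vector.upsert('knowledge-base', ...batch);
}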

Irrelevant Search Results

Problem: Search returns irrelevant or unexpected documents.

Solutions:

  • Check similarity scores: Results with similarity < 0.7 may be poor matches (a quick way to inspect scores is sketched below)
  • Review metadata filters: Remember they're AND conditions, not OR
  • Verify embeddings: Ensure consistent text preprocessing before upserting
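
Logging the scores of returned results is a quick way to see whether a threshold needs adjusting; a small sketch:

// JavaScript/TypeScript - inspect similarity scores before trusting results (sketch)
const results = await context.vector.search('knowledge-base', { query, limit: 10 });

for (const r of results) {
  context.logger.info(`key=${r.key} similarity=${r.similarity.toFixed(2)}`);
}

// Keep only confident matches
const confident = results.filter(r => r.similarity >= 0.7);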

Best Practices

Document Structure

  • Include context in documents: Store enough context so documents are meaningful when retrieved
  • Use descriptive metadata: Include relevant metadata for filtering and identification
  • Consistent formatting: Use consistent document formatting for better embeddings

Search Optimization

  • Adjust similarity thresholds: Start with 0.7 and adjust based on result quality
  • Use metadata filtering: Combine semantic search with metadata filters for precise results
  • Limit result sets: Use appropriate limits to balance performance and relevance

Performance Considerations

  • Batch upsert operations: Use bulk upsert instead of individual calls
  • Monitor storage usage: Track vector storage size in the Cloud Console
  • Consider document chunking: Break large documents into smaller, focused chunks (see the chunking sketch below)
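
A simple chunking sketch is shown below. The paragraph-based splitting and 1,000-character target are illustrative choices rather than platform requirements, and largeDocument stands in for your source text.

// JavaScript/TypeScript - chunk a large document before upserting (sketch)
function chunkText(text: string, maxChars = 1000): string[] {
  const chunks: string[] = [];
  let current = '';
  for (const paragraph of text.split('\n\n')) {
    // Start a new chunk when adding this paragraph would exceed the target size
    if (current && current.length + paragraph.length > maxChars) {
      chunks.push(current.trim());
      current = '';
    }
    current += paragraph + '\n\n';
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Upsert each chunk with a deterministic key so re-ingestion stays idempotent
const chunks = chunkText(largeDocument);
for (let i = 0; i < chunks.length; i++) {
  await context.vector.upsert('knowledge-base', {
    key: `user-guide-chunk-${i}`,
    document: chunks[i],
    metadata: { sourceId: 'user-guide', chunkIndex: i, content: chunks[i] }
  });
}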

Integration with Agent Memory

Vector storage serves as long-term memory for agents, enabling them to:

  • Remember past conversations and context across sessions
  • Access organizational knowledge bases
  • Retrieve relevant examples for few-shot learning
  • Build and maintain agent-specific knowledge repositories

For more information on memory patterns, see the Key-Value Storage guide for short-term memory or explore Agent Communication for sharing knowledge between agents.

Need Help?

Join our Discord community for assistance or just to hang with other humans building agents.

Send us an email at hi@agentuity.com if you'd like to get in touch.

If you haven't already, please sign up for your free account now and start building your first agent!