How to Configure Different Embedding Providers in agentmemory: 7 Provider Setup Guide
agentmemory supports seven embedding providers—including OpenAI, Cohere, Voyage, and local Xenova transformers—selected at runtime via the AGENTMEMORY_EMBEDDING_PROVIDER environment variable and instantiated through the createEmbeddingProvider() factory without code changes.
The open-source rohitg00/agentmemory repository implements a pluggable embedding architecture that decouples vector generation from storage logic. By setting environment variables parsed by loadEmbeddingConfig() in src/config.ts, you can target cloud APIs or run entirely local embeddings while maintaining a consistent interface defined in src/types.ts.
Provider Architecture and Entry Points
The Factory Pattern in src/providers/index.ts
At runtime, agentmemory discovers and instantiates your chosen provider through the createEmbeddingProvider() function exported from src/providers/index.ts. This factory reads the global configuration object—populated by loadEmbeddingConfig()—and returns a concrete implementation matching the EmbeddingProvider interface.
The factory pattern centralizes provider selection logic, ensuring the rest of the codebase interacts with a uniform API regardless of whether you are using OpenAI or a local model. During startup in src/index.ts, the system initializes the provider and logs its name and output dimensions to confirm successful configuration.
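The selection logic described above can be sketched roughly as follows. This is an illustrative sketch, not the code in src/providers/index.ts: createProviderSketch, stubProvider, describeProvider, and the EmbeddingConfig shape are hypothetical names, the stubs return zero vectors instead of calling a model, and returning null when nothing is selected is an assumption.

```typescript
interface EmbeddingProvider {
  name: string;
  dimensions: number;
  embed(text: string | string[]): Promise<Float32Array[]>;
}

// Hypothetical config shape; the real EmbeddingConfig may carry more fields.
interface EmbeddingConfig {
  provider?: string;
}

// Stand-in for a real provider class: returns zero vectors of the right size.
function stubProvider(name: string, dimensions: number): EmbeddingProvider {
  return {
    name,
    dimensions,
    embed: async (text) => {
      const batch = Array.isArray(text) ? text : [text];
      return batch.map(() => new Float32Array(dimensions));
    },
  };
}

// One switch decides which implementation the rest of the code receives.
function createProviderSketch(config: EmbeddingConfig): EmbeddingProvider | null {
  switch (config.provider) {
    case "openai":
      return stubProvider("OpenAI", 1536);
    case "local":
      // 768 is a placeholder; the actual size depends on the local model.
      return stubProvider("Local", 768);
    default:
      return null; // assumption: no provider means no vector search
  }
}

// Formats the kind of name/dimensions summary logged at startup.
function describeProvider(provider: EmbeddingProvider): string {
  return `Embedding provider: ${provider.name} (${provider.dimensions} dims)`;
}
```

Callers only ever see the EmbeddingProvider type, so adding another case to the switch does not affect code that consumes the provider.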
EmbeddingProvider Interface Specifications
All providers must implement the EmbeddingProvider interface defined in src/types.ts. This contract requires three properties:
- name: string – Human-readable identifier (e.g., "OpenAI", "Local").
- dimensions: number – Vector size output (e.g., 1536 for OpenAI small models, 768 for some local models).
- embed(text: string | string[]): Promise<Float32Array[]> – Async method accepting single strings or batches and returning typed float arrays.
This standardization allows agentmemory to perform hybrid search operations, combining BM25 keyword scores with vector similarity without knowing which specific model generated the embeddings.
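Written out in TypeScript, the contract might look like the sketch below. The interface mirrors the three members listed above, while toyProvider and TOY_DIMENSIONS are hypothetical stand-ins that only illustrate the single-string and batch calling conventions; they do not run a real model.

```typescript
// Sketch of the contract; the authoritative definition is in src/types.ts.
interface EmbeddingProvider {
  name: string;
  dimensions: number;
  embed(text: string | string[]): Promise<Float32Array[]>;
}

const TOY_DIMENSIONS = 4;

// Hypothetical provider: stores each text's length in the first component.
const toyProvider: EmbeddingProvider = {
  name: "Toy",
  dimensions: TOY_DIMENSIONS,
  async embed(text) {
    // Treat a single string as a batch of one so callers always get an array back.
    const batch = Array.isArray(text) ? text : [text];
    return batch.map((t) => {
      const vector = new Float32Array(TOY_DIMENSIONS);
      vector[0] = t.length;
      return vector;
    });
  },
};
```

Calling toyProvider.embed("hello") resolves to an array with one vector, and toyProvider.embed(["a", "bcd"]) resolves to two, each of length TOY_DIMENSIONS.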
Supported Embedding Providers
agentmemory ships with seven first-class provider implementations located in src/providers/embedding/. Each encapsulates the HTTP client logic or local inference runtime required to convert text into vectors.
Cloud API Providers
Configure these providers by setting their respective API keys and optional model names:
- OpenAI (src/providers/embedding/openai.ts): Requires OPENAI_EMBEDDING_API_KEY. Optionally set OPENAI_EMBEDDING_MODEL to override the default text-embedding-3-small (e.g., use text-embedding-3-large for 3072 dimensions).
- OpenRouter (src/providers/embedding/openrouter.ts): Requires OPENROUTER_API_KEY. Defaults to openai/text-embedding-3-small unless OPENROUTER_EMBEDDING_MODEL is specified.
- Voyage AI (src/providers/embedding/voyage.ts): Requires VOYAGE_API_KEY. No default model is assumed; you must configure the specific Voyage model via environment variables.
- Cohere (src/providers/embedding/cohere.ts): Requires COHERE_API_KEY. Optionally set COHERE_EMBEDDING_MODEL (e.g., embed-english-v3.0).
- Google Gemini (src/providers/embedding/gemini.ts): Requires GEMINI_API_KEY.
Local Inference Providers
These providers run entirely within your infrastructure without external API calls:
- Local Transformers (src/providers/embedding/local.ts): Uses the @xenova/transformers library to run embedding models locally. No API key is required, but you must install the peer dependency with npm install @xenova/transformers.
- CLIP Image Embeddings (src/providers/embedding/clip.ts): Specialized for image vectorization using CLIP models via @xenova/transformers. Useful for multimodal search when combined with text providers.
Configuration via Environment Variables
Primary Selection Variable
The AGENTMEMORY_EMBEDDING_PROVIDER environment variable controls which factory implementation is instantiated. Valid values include: openai, openrouter, voyage, cohere, gemini, local, and clip.
If this variable is omitted or set to an unsupported value, agentmemory may fall back to BM25-only mode or throw a configuration error depending on your version.
Provider-Specific Requirements
Set these variables before starting your application:
# Option 1: OpenAI configuration
export AGENTMEMORY_EMBEDDING_PROVIDER=openai
export OPENAI_EMBEDDING_API_KEY=sk-proj-...
export OPENAI_EMBEDDING_MODEL=text-embedding-3-large # Optional

# Option 2: Cohere configuration
export AGENTMEMORY_EMBEDDING_PROVIDER=cohere
export COHERE_API_KEY=...
export COHERE_EMBEDDING_MODEL=embed-english-v3.0

# Option 3: Local transformer (no external key)
export AGENTMEMORY_EMBEDDING_PROVIDER=local
# Install the peer dependency once
npm install @xenova/transformers
Loading Configuration with loadEmbeddingConfig
The loadEmbeddingConfig() function in src/config.ts parses these environment variables at startup and returns a typed EmbeddingConfig object. This configuration is stored globally and consumed by createEmbeddingProvider() to parameterize the chosen embedding class.
If you are running agentmemory programmatically, ensure environment variables are set before importing the main entry point, as the provider instantiation occurs immediately in src/index.ts.
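To make the parsing step concrete, here is a minimal sketch of how environment variables could be turned into a typed configuration. The variable names and default models come from the provider list in this guide; loadEmbeddingConfigSketch, EmbeddingConfigSketch, ENV_KEYS, exact-match validation, and the null result for unknown values are assumptions, and the real loadEmbeddingConfig() in src/config.ts may behave differently.

```typescript
type ProviderName =
  | "openai" | "openrouter" | "voyage" | "cohere" | "gemini" | "local" | "clip";

// Hypothetical config shape, for illustration only.
interface EmbeddingConfigSketch {
  provider: ProviderName | null;
  apiKey: string | undefined;
  model: string | undefined;
}

type Env = Record<string, string | undefined>;

// Which environment variables each provider reads, per the provider list above.
const ENV_KEYS: Record<ProviderName, { key?: string; model?: string; defaultModel?: string }> = {
  openai: { key: "OPENAI_EMBEDDING_API_KEY", model: "OPENAI_EMBEDDING_MODEL", defaultModel: "text-embedding-3-small" },
  openrouter: { key: "OPENROUTER_API_KEY", model: "OPENROUTER_EMBEDDING_MODEL", defaultModel: "openai/text-embedding-3-small" },
  voyage: { key: "VOYAGE_API_KEY" },
  cohere: { key: "COHERE_API_KEY", model: "COHERE_EMBEDDING_MODEL" },
  gemini: { key: "GEMINI_API_KEY" },
  local: {},
  clip: {},
};

function loadEmbeddingConfigSketch(env: Env): EmbeddingConfigSketch {
  const raw = env["AGENTMEMORY_EMBEDDING_PROVIDER"];
  // Accept only exact matches against the known provider names.
  const provider = (Object.keys(ENV_KEYS) as ProviderName[]).find((n) => n === raw) ?? null;
  if (provider === null) {
    return { provider: null, apiKey: undefined, model: undefined };
  }
  const spec = ENV_KEYS[provider];
  return {
    provider,
    apiKey: spec.key ? env[spec.key] : undefined,
    // An explicit model wins; otherwise fall back to the documented default, if any.
    model: (spec.model ? env[spec.model] : undefined) ?? spec.defaultModel,
  };
}
```

Passing the environment in as an argument (for example, process.env in Node.js) keeps the function easy to test and makes the load order explicit: whatever is in the object at call time is what the provider sees.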
Implementation Examples
Switching Providers Without Code Changes
Because provider selection happens via environment variables, you can swap between OpenAI and local embeddings by restarting the process with different flags:
# Run with OpenAI (cloud)
AGENTMEMORY_EMBEDDING_PROVIDER=openai \
OPENAI_EMBEDDING_API_KEY=sk-... \
npm start
# Run with local embeddings (offline)
AGENTMEMORY_EMBEDDING_PROVIDER=local \
npm start
The startup log will confirm the active provider:
[agentmemory] Embedding provider: OpenAI (1536 dims)
Programmatic Provider Access
If you need to access the provider instance directly for custom embedding operations, import the factory:
import { createEmbeddingProvider } from "./src/providers/index.js";

async function embedQuery(query: string) {
  const provider = createEmbeddingProvider();
  if (!provider) {
    throw new Error("Embedding provider not configured");
  }
  // Returns Float32Array[] with length matching input array length
  const [vector] = await provider.embed(query);
  return vector; // Float32Array of size provider.dimensions
}
Hybrid Search Configuration
When configuring agentmemory for hybrid search (BM25 + vector), the system uses the dimensions property from your active provider to initialize the VectorIndex. Ensure your selected provider's output dimensions match the index configuration to avoid runtime shape mismatches.
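The sketch below shows why a dimension mismatch matters and one simple way keyword and vector scores could be blended. This guide does not say how agentmemory fuses scores, so the weighted sum here is a swapped-in illustration: cosineSimilarity, hybridScore, the alpha weight, and the assumption that the BM25 score is already normalized to the range 0 to 1 are all hypothetical.

```typescript
// Cosine similarity between two embeddings; fails fast on a shape mismatch.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Hypothetical blend: alpha weights the keyword score, the remainder the vector score.
// Assumes bm25Score has already been normalized to [0, 1].
function hybridScore(bm25Score: number, vectorScore: number, alpha = 0.5): number {
  return alpha * bm25Score + (1 - alpha) * vectorScore;
}
```

If an index was built with 1536-dimension vectors and the provider is later switched to one that emits a different size, a check like the one in cosineSimilarity surfaces the problem immediately instead of producing meaningless scores.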
Summary
- Factory Location: createEmbeddingProvider() in src/providers/index.ts instantiates providers based on the AGENTMEMORY_EMBEDDING_PROVIDER environment variable.
- Configuration Source: loadEmbeddingConfig() in src/config.ts parses API keys and model names into a typed EmbeddingConfig.
- Interface Contract: All providers implement EmbeddingProvider from src/types.ts, exposing name, dimensions, and an embed() method returning Promise<Float32Array[]>.
- Available Providers: OpenAI, OpenRouter, Voyage, Cohere, Gemini, Local (Xenova), and CLIP image embeddings, each located in src/providers/embedding/*.ts.
- Zero-Code Swapping: Change providers by modifying environment variables and restarting; no source file edits are required to switch from cloud to local inference.
Frequently Asked Questions
Which embedding provider should I use for production workloads?
OpenAI's text-embedding-3-large or Voyage AI generally provide the highest retrieval accuracy for general text tasks, but require valid API keys and incur usage costs. For air-gapped environments or cost-sensitive applications, the Local provider using @xenova/transformers offers comparable functionality without network dependencies, though with higher latency on CPU-only machines.
Can I run agentmemory without any external API keys?
Yes. With AGENTMEMORY_EMBEDDING_PROVIDER=local, agentmemory uses the Xenova Transformers library to generate embeddings locally. You must install @xenova/transformers as a dependency, and the first run downloads the default model files to your local cache, so it needs network access once. After the files are cached, this configuration can run fully offline.
How do I implement a custom embedding provider?
Create a new file in src/providers/embedding/ that implements the EmbeddingProvider interface defined in src/types.ts. Your class must provide a name, dimensions number, and an embed() method. Then modify createEmbeddingProvider() in src/providers/index.ts to instantiate your class when a specific environment variable value is detected (e.g., AGENTMEMORY_EMBEDDING_PROVIDER=custom).
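As a rough sketch of those two steps, the class below satisfies the interface by delegating to any function that maps a batch of texts to numeric rows. CustomEmbeddingProvider, createWithCustom, the embedBatch callback, and the two-component demo vectors are all hypothetical; a real provider would call your own model or service there.

```typescript
interface EmbeddingProvider {
  name: string;
  dimensions: number;
  embed(text: string | string[]): Promise<Float32Array[]>;
}

type EmbedBatchFn = (texts: string[]) => Promise<number[][]>;

// Hypothetical custom provider that wraps a user-supplied batch function.
class CustomEmbeddingProvider implements EmbeddingProvider {
  readonly name = "Custom";
  readonly dimensions: number;
  private readonly embedBatch: EmbedBatchFn;

  constructor(dimensions: number, embedBatch: EmbedBatchFn) {
    this.dimensions = dimensions;
    this.embedBatch = embedBatch;
  }

  async embed(text: string | string[]): Promise<Float32Array[]> {
    const batch = Array.isArray(text) ? text : [text];
    const rows = await this.embedBatch(batch);
    // Convert plain number arrays into the typed arrays the contract requires.
    return rows.map((row) => Float32Array.from(row));
  }
}

// Sketch of the extra factory branch; all other providers are elided here.
function createWithCustom(providerName: string | undefined): EmbeddingProvider | null {
  if (providerName === "custom") {
    // Demo embedding: [text length, 1]. Replace with a real model call.
    return new CustomEmbeddingProvider(2, async (texts) => texts.map((t) => [t.length, 1]));
  }
  return null;
}
```

Keeping the model call behind a plain function makes the class easy to unit test without network access.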
What vector dimensions does each provider output?
Dimensions vary by provider and model: OpenAI's text-embedding-3-small outputs 1536 dimensions, while text-embedding-3-large outputs 3072. Cohere and Voyage dimensions depend on the specific model selected (e.g., 1024 or 768). The Local provider's dimensions depend on the specific Xenova model loaded. Check provider.dimensions at runtime or consult the provider source file in src/providers/embedding/ to confirm the expected vector size for your configuration.
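One way to confirm the size at runtime is to embed a short probe string and compare the result with the declared dimensions. The helper below is a sketch under that assumption; checkDimensions is a hypothetical name and is not part of agentmemory.

```typescript
interface EmbeddingProvider {
  name: string;
  dimensions: number;
  embed(text: string | string[]): Promise<Float32Array[]>;
}

// Embeds a probe string and verifies the vector length matches provider.dimensions.
async function checkDimensions(provider: EmbeddingProvider): Promise<number> {
  const [probe] = await provider.embed("dimension probe");
  const actual = probe ? probe.length : 0;
  if (actual !== provider.dimensions) {
    throw new Error(
      `${provider.name} declares ${provider.dimensions} dims but returned ${actual}`,
    );
  }
  return actual;
}
```

Running a check like this once at startup, before building or loading a vector index, catches a misconfigured model name early. Note that with a cloud provider the probe is a real API call.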