# How to Use Rerankers in Mem0 to Improve Memory Retrieval Accuracy

> Boost memory retrieval accuracy with Mem0 rerankers. Learn to integrate Cohere, Sentence-Transformers, and LLM options to reorder search results and find relevant memories faster.

- Repository: [mem0ai/mem0](https://github.com/mem0ai/mem0)
- Tags: how-to-guide
- Published: 2026-03-07

---

**Mem0 supports pluggable rerankers—including Cohere, Sentence-Transformers, and LLM-based options—that reorder raw vector search results to surface the most relevant memories.**

Mem0 is an open-source memory layer for AI applications that stores and retrieves contextual information using vector stores and optional graph structures. Vector similarity search retrieves candidate memories efficiently, but its ordering does not always reflect true semantic relevance to the query. By configuring **rerankers in Mem0**, you add a secondary scoring pass that can improve retrieval precision without modifying the underlying search implementation.

## Architecture of Rerankers in Mem0

The reranking system follows a factory pattern with clear separation between configuration, instantiation, and execution.

### Core Components

- **`MemoryConfig`** – Located in [`mem0/configs/base.py`](https://github.com/mem0ai/mem0/blob/main/mem0/configs/base.py), this Pydantic model holds an optional `reranker` field that specifies which provider to use and its configuration.
- **`RerankerConfig`** – Defined in [`mem0/configs/rerankers/config.py`](https://github.com/mem0ai/mem0/blob/main/mem0/configs/rerankers/config.py), this generic container normalizes provider-specific settings into a standard format consumed by the factory.
- **`RerankerFactory`** – Implemented in [`mem0/utils/factory.py`](https://github.com/mem0ai/mem0/blob/main/mem0/utils/factory.py) (lines 240–283), this utility maps provider names (e.g., `"cohere"`, `"sentence_transformer"`) to concrete class imports and instantiates them with the appropriate configuration.
- **`BaseReranker`** – The abstract interface in [`mem0/reranker/base.py`](https://github.com/mem0ai/mem0/blob/main/mem0/reranker/base.py) guarantees that every implementation exposes a uniform `rerank(query, documents, top_k)` method.
- **Concrete Implementations** – Provider-specific classes such as `CohereReranker` ([`mem0/reranker/cohere_reranker.py`](https://github.com/mem0ai/mem0/blob/main/mem0/reranker/cohere_reranker.py)), `SentenceTransformerReranker` ([`mem0/reranker/sentence_transformer_reranker.py`](https://github.com/mem0ai/mem0/blob/main/mem0/reranker/sentence_transformer_reranker.py)), and `LLMReranker` ([`mem0/reranker/llm_reranker.py`](https://github.com/mem0ai/mem0/blob/main/mem0/reranker/llm_reranker.py)) handle the actual scoring logic.
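
The relationship between the abstract interface, a concrete implementation, and the factory can be sketched in a few lines. This is a simplified illustration, not Mem0's actual code: `KeywordOverlapReranker`, `ToyRerankerFactory`, and the `"keyword_overlap"` provider name are hypothetical, and the word-overlap scoring stands in for a real model.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Type

Doc = Dict[str, Any]


class BaseReranker(ABC):
    """Illustrative stand-in for the uniform rerank() interface."""

    @abstractmethod
    def rerank(self, query: str, documents: List[Doc], top_k: Optional[int] = None) -> List[Doc]:
        ...


class KeywordOverlapReranker(BaseReranker):
    """Hypothetical toy provider: scores documents by words shared with the query."""

    def __init__(self, config: Optional[Dict[str, Any]] = None):
        self.config = config or {}

    def rerank(self, query: str, documents: List[Doc], top_k: Optional[int] = None) -> List[Doc]:
        query_words = set(query.lower().split())
        scored = []
        for doc in documents:
            doc_words = set(doc.get("memory", "").lower().split())
            score = len(query_words & doc_words) / max(len(query_words), 1)
            # Copy the document and attach the new score
            scored.append({**doc, "rerank_score": score})
        scored.sort(key=lambda d: d["rerank_score"], reverse=True)
        return scored[:top_k] if top_k else scored


class ToyRerankerFactory:
    """Maps provider names to classes (assumed structure, for illustration only)."""

    _providers: Dict[str, Type[BaseReranker]] = {"keyword_overlap": KeywordOverlapReranker}

    @classmethod
    def create(cls, provider: str, config: Optional[Dict[str, Any]] = None) -> BaseReranker:
        if provider not in cls._providers:
            raise ValueError(f"Unsupported reranker provider: {provider}")
        return cls._providers[provider](config)


reranker = ToyRerankerFactory.create("keyword_overlap", {})
docs = [
    {"memory": "My favorite food is Italian pasta"},
    {"memory": "I love playing soccer on weekends"},
]
ranked = reranker.rerank("playing soccer", docs, top_k=1)
print(ranked[0]["memory"])  # I love playing soccer on weekends
```

Because callers depend only on `rerank(query, documents, top_k)`, swapping providers becomes a configuration change rather than a code change.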

### Retrieval Flow

When you call `Memory.search()`, the process unfolds as follows:

1. **Initial Retrieval** – The system queries the configured vector store (and optional graph store) to fetch an initial candidate list.
2. **Conditional Reranking** – In [`mem0/memory/main.py`](https://github.com/mem0ai/mem0/blob/main/mem0/memory/main.py) (lines 45–52), the code checks if `rerank=True` (the default) and if `self.reranker` exists. If both conditions pass, it invokes `self.reranker.rerank(query, original_memories, limit)`.
3. **Result Enhancement** – The reranker returns the same document list augmented with a `rerank_score` field, sorted by this new relevance metric. If the reranking call fails due to network issues or missing credentials, the system catches the exception and returns the original results with a default score of `0.0`, ensuring the search never crashes.
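
The three steps above can be approximated with a small standalone function. This is a sketch under stated assumptions, not the code in `mem0/memory/main.py`: `search_with_optional_rerank` and `FailingReranker` are hypothetical names, and `candidates` stands in for whatever the vector store returned.

```python
import logging
from typing import Any, Dict, List, Optional

logger = logging.getLogger(__name__)

Doc = Dict[str, Any]


def search_with_optional_rerank(
    query: str,
    candidates: List[Doc],
    reranker: Optional[Any] = None,
    rerank: bool = True,
    limit: int = 10,
) -> List[Doc]:
    """Hypothetical helper mirroring the retrieve / rerank / fall back flow."""
    # Step 1 is assumed done: `candidates` are the initial retrieval results.
    if not (rerank and reranker is not None and candidates):
        return candidates[:limit]
    try:
        # Step 2: delegate ordering to the configured reranker.
        return reranker.rerank(query, candidates, limit)
    except Exception as exc:
        # Step 3 fallback: keep the original order and attach a default score.
        logger.warning("Reranking failed, using original order: %s", exc)
        return [{**doc, "rerank_score": 0.0} for doc in candidates[:limit]]


class FailingReranker:
    """Simulates a provider whose backend is unreachable."""

    def rerank(self, query: str, documents: List[Doc], top_k: int) -> List[Doc]:
        raise ConnectionError("simulated network failure")


candidates = [{"memory": "I love playing soccer"}, {"memory": "I enjoy pasta"}]
fallback = search_with_optional_rerank("hobbies", candidates, reranker=FailingReranker())
skipped = search_with_optional_rerank("hobbies", candidates, reranker=FailingReranker(), rerank=False)
print([d["rerank_score"] for d in fallback])  # [0.0, 0.0]
```

Note that with `rerank=False` the reranker is never called, so the failing backend has no effect on the result.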

## Supported Reranker Providers

Mem0 ships with built-in support for five distinct reranking strategies, each suited to different deployment environments and latency requirements.

| Provider | Required Dependencies | Key Configuration Parameters |
|----------|----------------------|------------------------------|
| **cohere** | `cohere` Python package + `COHERE_API_KEY` environment variable | `model`, `top_k`, `return_documents`, `max_chunks_per_doc` |
| **sentence_transformer** | `sentence-transformers` | `model_name`, `top_k` |
| **huggingface** | `transformers` + `torch` | `model`, `top_k` |
| **zero_entropy** | `zeroentropy` | `model`, `top_k` |
| **llm_reranker** | Configured LLM from `mem0.llms` | `temperature`, `max_tokens`, `top_k` |

Provider-specific configuration classes reside in `mem0/configs/rerankers/<provider>.py`, allowing you to fine-tune API timeouts, model versions, and batch sizes.

## Configuring Rerankers

To enable reranking, pass a `reranker` dictionary to `MemoryConfig` during client initialization. The dictionary requires a `provider` string and a `config` object containing provider-specific settings.

### Cohere Reranker Setup

The Cohere provider is ideal for production deployments requiring state-of-the-art neural reranking without local GPU resources.

```python
from mem0 import Memory
from mem0.configs.base import MemoryConfig
from mem0.configs.rerankers.cohere import CohereRerankerConfig

# Configure the reranker
cohere_config = CohereRerankerConfig(
    model="rerank-english-v2.0",
    top_k=5,
    return_documents=True,
    max_chunks_per_doc=10,
    api_key="YOUR_COHERE_API_KEY",  # Optional: falls back to COHERE_API_KEY env var
)

# Initialize Memory with the reranker enabled
config = MemoryConfig(
    reranker={
        "provider": "cohere",
        "config": cohere_config.model_dump(),
    }
)

mem = Memory(config)
```

### Local Sentence-Transformer Reranker

For privacy-sensitive applications or offline environments, use the Sentence-Transformer provider to run cross-encoder models locally.

```python
from mem0 import Memory
from mem0.configs.base import MemoryConfig
from mem0.configs.rerankers.sentence_transformer import SentenceTransformerRerankerConfig

# Configure a local cross-encoder model
st_config = SentenceTransformerRerankerConfig(
    model_name="cross-encoder/ms-marco-MiniLM-L-12-v2",
    top_k=5,
)

# Initialize Memory with the local reranker enabled
config = MemoryConfig(
    reranker={
        "provider": "sentence_transformer",
        "config": st_config.model_dump(),
    }
)

mem = Memory(config)
```

## Performing Reranked Searches

Once configured, reranking operates transparently during search operations. The `rerank` parameter defaults to `True`, but you can disable it for latency-sensitive queries where approximate vector similarity is sufficient.

```python
# Add memories to the store, scoped to the same user that is searched below
mem.add([{"role": "user", "content": "I love playing soccer on weekends"}], user_id="user123")
mem.add([{"role": "user", "content": "My favorite food is Italian pasta"}], user_id="user123")

# Search with reranking enabled (default behavior)
results = mem.search(
    query="What are the user's hobbies?",
    user_id="user123",
    rerank=True,  # Explicitly enable; omit to use the default
    limit=10,
)

# Access reranked scores
for item in results["results"]:
    print(f"Memory: {item['memory']}")
    print(f"Rerank Score: {item.get('rerank_score', 'N/A')}")
    print("---")
```

The `limit` parameter applies to both the initial vector search and the reranking stage. The reranker receives the full candidate set up to `limit`, then returns the top-k most relevant items based on the provider's scoring model.
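
One practical consequence is that a reranker can only promote memories that survived the initial retrieval. The sketch below illustrates this with a hypothetical `two_stage` function and made-up relevance scores; it keeps `limit` and `top_k` as separate arguments purely for illustration, whereas the retrieval flow described earlier passes `limit` as the reranker's third argument.

```python
from typing import Dict, List


def two_stage(
    candidates_by_similarity: List[str],
    relevance: Dict[str, float],
    limit: int,
    top_k: int,
) -> List[str]:
    """Hypothetical two-stage retrieval: shortlist by position, then reorder by relevance."""
    # Stage 1: the vector search keeps only the first `limit` candidates.
    shortlist = candidates_by_similarity[:limit]
    # Stage 2: the reranker reorders the shortlist and keeps the best `top_k`.
    reranked = sorted(shortlist, key=lambda m: relevance.get(m, 0.0), reverse=True)
    return reranked[:top_k]


# Memories in vector-similarity order, with illustrative (made-up) reranker scores.
memories = ["pasta", "soccer", "weekend hikes", "chess club"]
relevance = {"pasta": 0.1, "soccer": 0.9, "weekend hikes": 0.8, "chess club": 0.95}

narrow = two_stage(memories, relevance, limit=3, top_k=2)
wide = two_stage(memories, relevance, limit=4, top_k=2)
print(narrow)  # ['soccer', 'weekend hikes']
print(wide)    # ['chess club', 'soccer']
```

With `limit=3`, "chess club" never reaches the reranker despite having the highest relevance score, so a larger `limit` trades latency for a better chance of surfacing such items.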

## Error Handling and Resilience

Rerankers in Mem0 implement graceful degradation. If the external API times out, the API key is invalid, or the local model fails to load, the reranker catches the exception and returns the original unranked results with a `rerank_score` of `0.0` for each document. This design ensures that memory retrieval remains functional even when auxiliary reranking services are unavailable.
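
Because the fallback is silent from the caller's point of view, it can be useful to detect it in application code. The helper below is a heuristic sketch, assuming the fallback behaves as described (every item carries a `rerank_score` of exactly `0.0`); `reranking_looks_degraded` is a hypothetical name, and a healthy reranker could in principle also return all-zero scores.

```python
from typing import Any, Dict, List


def reranking_looks_degraded(results: List[Dict[str, Any]]) -> bool:
    """Return True when every result carries the fallback score of 0.0.

    Items with no `rerank_score` at all are treated as "reranking not applied"
    rather than degraded, so they make this function return False.
    """
    if not results:
        return False
    return all(item.get("rerank_score") == 0.0 for item in results)


fallback_results = [
    {"memory": "I love playing soccer on weekends", "rerank_score": 0.0},
    {"memory": "My favorite food is Italian pasta", "rerank_score": 0.0},
]
healthy_results = [
    {"memory": "I love playing soccer on weekends", "rerank_score": 0.92},
    {"memory": "My favorite food is Italian pasta", "rerank_score": 0.0},
]

print(reranking_looks_degraded(fallback_results))  # True
print(reranking_looks_degraded(healthy_results))   # False
```

A check like this could feed a metric or alert so that an expired key or an unreachable service does not go unnoticed.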

## Summary

- **Rerankers in Mem0** are configured via `MemoryConfig` using provider-specific config classes imported from `mem0.configs.rerankers`.
- The `RerankerFactory` in [`mem0/utils/factory.py`](https://github.com/mem0ai/mem0/blob/main/mem0/utils/factory.py) instantiates the correct implementation based on the `provider` string.
- Reranking occurs automatically in `Memory.search()` (defined in [`mem0/memory/main.py`](https://github.com/mem0ai/mem0/blob/main/mem0/memory/main.py)) when `rerank=True` and a reranker is configured.
- Supported providers include **Cohere**, **Sentence-Transformers**, **HuggingFace**, **Zero-Entropy**, and **LLM-based** rerankers, each with its own dependencies and, for hosted services, API credentials.
- The system includes built-in fallback logic that preserves search functionality if reranking services fail.

## Frequently Asked Questions

### How do I disable reranking for specific queries?

Pass `rerank=False` to the `Memory.search()` method. By default, Mem0 attempts to rerank results whenever a reranker is configured, but you can override this per-query to reduce latency for non-critical retrievals.

### Can I use multiple rerankers simultaneously?

No, the current architecture in [`mem0/configs/base.py`](https://github.com/mem0ai/mem0/blob/main/mem0/configs/base.py) supports a single `reranker` configuration per `Memory` instance. To compare different providers, initialize separate `Memory` clients with distinct configurations and run A/B tests on your dataset.
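
When comparing two providers this way, one simple starting metric is how much their top results agree. The sketch below computes a Jaccard overlap on the `memory` text of the top-k items; `top_k_overlap` is a hypothetical helper, and the result lists are hand-written stand-ins for the `results["results"]` lists returned by two separately configured clients.

```python
from typing import Any, Dict, List


def top_k_overlap(results_a: List[Dict[str, Any]], results_b: List[Dict[str, Any]], k: int) -> float:
    """Jaccard overlap between the top-k memories of two result lists (1.0 = identical sets)."""
    top_a = {item["memory"] for item in results_a[:k]}
    top_b = {item["memory"] for item in results_b[:k]}
    union = top_a | top_b
    # Two empty result sets are treated as fully agreeing.
    return len(top_a & top_b) / len(union) if union else 1.0


# Stand-in outputs from two hypothetical Memory clients with different rerankers.
provider_a = [{"memory": "soccer"}, {"memory": "hiking"}, {"memory": "pasta"}]
provider_b = [{"memory": "hiking"}, {"memory": "soccer"}, {"memory": "chess"}]

print(top_k_overlap(provider_a, provider_b, k=2))  # 1.0
print(top_k_overlap(provider_a, provider_b, k=3))  # 0.5
```

Overlap only measures agreement, not correctness, so a labeled evaluation set is still needed to decide which provider ranks better.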

### What happens if my Cohere API key expires during a search?

The `CohereReranker` implementation catches authentication and network errors during the `rerank()` call. It logs the failure and returns the original vector search results with a default score of `0.0`, ensuring your application continues operating without interruption.

### Do I need a GPU for local rerankers?

Not necessarily. The **sentence_transformer** and **huggingface** providers can run on CPU, though GPU acceleration significantly improves latency for large document sets. Configure the `device` parameter in your provider-specific config (where supported) to control hardware utilization.