Mem0 Self-Hosted SDK vs Hosted Platform API: Architecture and Implementation Guide
The Mem0 SDK connects to either a fully managed SaaS at api.mem0.ai (hosted) or your own OpenMemory FastAPI server (self-hosted), with identical code interfaces but different operational responsibilities around infrastructure, authentication, and data control.
The mem0ai/mem0 repository provides a unified Python SDK that abstracts memory operations for AI applications. Whether you choose the self-hosted SDK approach using the open-source OpenMemory server or the hosted Mem0 platform API, the client interface in mem0/client/main.py remains consistent—only the deployment model and operational overhead differ significantly.
Architectural Overview
| Aspect | Hosted Mem0 Platform API | Self-Hosted SDK + OpenMemory Server |
|---|---|---|
| Service Location | Fully managed SaaS (https://api.mem0.ai). The service runs in Mem0’s cloud, is multi-tenant and globally distributed. | Runs on your own infrastructure (local machine, VM, container, Kubernetes, etc.) via the OpenMemory FastAPI server (openmemory/api/main.py). |
| Deployment & Maintenance | No infrastructure to provision. Mem0 handles scaling, upgrades, security patches, backups and monitoring automatically. | You must provision the server, keep the code up to date, manage the database (SQLite by default, configurable), and handle backups, TLS, scaling, and any required OS-level updates. |
| Authentication | Requires a MEM0_API_KEY that is validated against the hosted service. The key also ties usage to your organization for billing and telemetry. | The OpenMemory server can be started without authentication (or with a simple token you configure). The SDK can be pointed at the server by passing host="http://localhost:8000" and any dummy api_key. |
| Telemetry & Analytics | Built-in usage dashboards, request logging, error reporting, and usage-based billing are provided out of the box. | No built-in telemetry; you would need to instrument the server yourself if you want similar analytics. |
| Reliability & SLA | Mem0 guarantees uptime, latency SLAs and redundancy across regions. | Availability depends on your own deployment and monitoring strategy. |
| Feature Parity | All latest Mem0 features (e.g., vector-store plugins, advanced rerankers, multi-app ACLs) are released first on the hosted service. | The open-source repository includes the same core SDK logic, but some enterprise-only features (e.g., advanced access-control, usage-based throttling) may be omitted or require manual implementation. |
| Cost | Pay-as-you-go based on API calls and stored vectors (billing handled by Mem0). | No usage fees from Mem0, but you incur infrastructure costs (compute, storage, network). |
| Data Residency & Compliance | Data lives in Mem0’s cloud regions; compliance is covered by Mem0’s certifications. | You have full control over where data is stored (e.g., on-premise, specific cloud region), which can simplify compliance requirements. |
Authentication and Configuration
The MemoryClient class in mem0/client/main.py serves as a thin wrapper around HTTP calls. It builds request payloads, adds the Authorization: Token <api_key> and Mem0-User-ID headers, and forwards the request to the configured host.
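To make the wrapper's job concrete, here is a minimal sketch of the request it would assemble for an add call. It uses only the standard library and does not send anything. The header names come from the description above and the /v1/memories/ path from the endpoint list later in this guide; the function name build_add_request and the exact JSON payload shape are assumptions made for illustration, not the SDK's actual internals.

```python
import json


def build_add_request(host, api_key, user_id, messages):
    """Assemble the URL, headers, and JSON body for an add-memory request.

    Hypothetical helper: it mirrors the behavior described in the text,
    not the real code in mem0/client/main.py.
    """
    url = host.rstrip("/") + "/v1/memories/"
    headers = {
        "Authorization": f"Token {api_key}",
        "Mem0-User-ID": user_id,
        "Content-Type": "application/json",
    }
    # Payload shape is assumed from the arguments passed to client.add().
    body = json.dumps({"messages": messages, "user_id": user_id})
    return url, headers, body


url, headers, body = build_add_request(
    "http://localhost:8000",
    "dummy-key",
    "demo-user",
    [{"role": "user", "content": "I love hiking in the Alps."}],
)
print(url)
print(headers["Authorization"])
```

Because only the host prefix and the token value differ between deployments, the same request shape can be sent to either backend.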
By default, the SDK targets the hosted platform:
# From mem0/client/main.py - MemoryClient.__init__
self.host = "https://api.mem0.ai"
self.api_key = os.environ.get("MEM0_API_KEY")
To switch to a self-hosted instance, override the host parameter and provide any value for api_key (the local OpenMemory server may not validate the token unless configured to do so):
from mem0 import MemoryClient

client = MemoryClient(
    host="http://localhost:8000",
    api_key="dummy-key",  # Satisfies header requirement without validation
)
Code Examples
Using the Hosted Mem0 Platform
When using the managed service, the SDK automatically reads your API key from the environment and connects to Mem0’s cloud infrastructure:
from mem0 import MemoryClient

# Automatically picks up MEM0_API_KEY from environment
client = MemoryClient()

# Add a memory to the hosted service
response = client.add(
    messages=[
        {"role": "user", "content": "I love hiking in the Alps."}
    ],
    user_id="user@example.com",  # Placeholder user identifier
)
print("Memory ID:", response["id"])
Key implementation details:
- MemoryClient.__init__ sets self.host = "https://api.mem0.ai" and reads MEM0_API_KEY from the environment.
- Source: mem0/client/main.py (lines 39-66).
Running the Self-Hosted OpenMemory Server
The self-hosted option requires running the OpenMemory FastAPI server locally or on your infrastructure. The server exposes the same REST endpoints as the hosted platform.
Start the server using Docker (as documented in the repository):
docker run -p 8000:8000 mem0ai/openmemory:latest
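The container may take a moment to start accepting connections, so it can help to wait for the port before creating a client. The following is a small standard-library sketch; wait_for_port is a hypothetical helper, and it only checks that the TCP port is open, not that the API itself is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet; pause briefly before retrying.
            time.sleep(0.5)
    return False


# Port 8000 matches the mapping in the docker run command above.
print("OpenMemory port open:", wait_for_port("localhost", 8000, timeout=1.0))
```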
Now configure the SDK to point to your local instance:
from mem0 import MemoryClient

# Point to local server; dummy API key satisfies header requirements
client = MemoryClient(
    host="http://localhost:8000",
    api_key="local-dev-key",
)

# Identical call signature to hosted version
response = client.add(
    messages=[{"role": "user", "content": "My favorite coffee is espresso."}],
    user_id="user@example.com",  # Placeholder user identifier
)
print("Local memory ID:", response["id"])
Key implementation details:
- OpenMemory entry point: openmemory/api/main.py (FastAPI app initialization).
- Default database: SQLite (sqlite:///./openmemory.db) configured in openmemory/api/app/database.py.
- Memory routes: openmemory/api/app/routers/memories.py implements the same endpoints as the hosted service.
Switching Between Environments at Runtime
You can abstract the client initialization to switch between hosted and self-hosted modes without changing your application logic:
from mem0 import MemoryClient

def get_memory_client(mode: str = "hosted"):
    if mode == "hosted":
        # Uses MEM0_API_KEY env var and https://api.mem0.ai
        return MemoryClient()
    else:
        # Connects to local OpenMemory server
        return MemoryClient(
            host="http://localhost:8000",
            api_key="dev-key",
        )

# Usage
client = get_memory_client(mode="self-hosted")
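A variation on this idea is to pick the mode from the environment rather than a function argument, so the same build can run against either backend. The sketch below only computes the keyword arguments and does not import the SDK. MEM0_HOST is a hypothetical variable name introduced here, while MEM0_API_KEY is the variable the SDK reads, as described earlier.

```python
import os


def resolve_client_kwargs(env=None):
    """Return keyword arguments for MemoryClient based on environment variables.

    MEM0_HOST is an assumed, illustrative variable; it is not defined by the SDK.
    """
    env = os.environ if env is None else env
    host = env.get("MEM0_HOST")
    if host:
        # Self-hosted: a placeholder key satisfies the header requirement.
        return {"host": host, "api_key": env.get("MEM0_API_KEY", "dev-key")}
    api_key = env.get("MEM0_API_KEY")
    if not api_key:
        raise ValueError("MEM0_API_KEY is required for the hosted platform")
    # Hosted: leave host unset so the SDK default applies.
    return {"api_key": api_key}


print(resolve_client_kwargs({"MEM0_HOST": "http://localhost:8000"}))
# → {'host': 'http://localhost:8000', 'api_key': 'dev-key'}
```

With the SDK installed, the result could be passed straight through as MemoryClient(**resolve_client_kwargs()). Failing fast on a missing key in hosted mode avoids a confusing authentication error on the first request.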
Key Files and Implementation Details
Understanding the source structure helps clarify why the SDK works identically across both deployment models:
| File | Purpose | Direct Link |
|---|---|---|
| mem0/client/main.py | Core SDK – builds HTTP requests, handles Authorization headers, exposes MemoryClient methods (add, get, search, delete). | https://github.com/mem0ai/mem0/blob/main/mem0/client/main.py |
| openmemory/api/main.py | FastAPI entry point for the self-hosted OpenMemory service. Initializes the database and registers API routers. | https://github.com/mem0ai/mem0/blob/main/openmemory/api/main.py |
| openmemory/api/app/database.py | Database configuration. Defaults to SQLite (sqlite:///./openmemory.db) but supports PostgreSQL via environment variables. | https://github.com/mem0ai/mem0/blob/main/openmemory/api/app/database.py |
| openmemory/api/app/routers/memories.py | Implements the REST endpoints (POST /v1/memories/, GET /v1/memories/, etc.) that the SDK consumes. | https://github.com/mem0ai/mem0/blob/main/openmemory/api/app/routers/memories.py |
Summary
- The hosted Mem0 platform provides a fully managed, multi-tenant SaaS with automatic scaling, built-in analytics, usage-based billing, and enterprise SLAs, requiring only a MEM0_API_KEY to use.
- The self-hosted OpenMemory server runs on your infrastructure via Docker or direct deployment, giving you complete data residency control and zero SaaS fees, but requiring you to manage the database (SQLite by default), TLS, backups, and scaling.
- The SDK interface is identical in both scenarios; only the host parameter and API key requirements change when initializing MemoryClient from mem0/client/main.py.
Frequently Asked Questions
Can I switch between hosted and self-hosted without changing my application code?
Yes. The MemoryClient class accepts a host parameter that defaults to https://api.mem0.ai. By parameterizing this value—passing http://localhost:8000 for self-hosted deployments—you can use the same add(), get(), and search() methods without modifying your application logic. Only the initialization configuration changes.
What database does the self-hosted version use?
By default, the OpenMemory server uses SQLite (sqlite:///./openmemory.db) as configured in openmemory/api/app/database.py. However, you can override this via environment variables to use PostgreSQL or other SQL databases supported by SQLAlchemy. The hosted platform uses Mem0’s proprietary vector store infrastructure.
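As a rough illustration of that override pattern, the sketch below resolves a database URL from the environment and falls back to the SQLite default quoted above. DATABASE_URL is an assumed variable name and both helper functions are hypothetical; check openmemory/api/app/database.py for the name the server actually reads.

```python
import os

DEFAULT_DATABASE_URL = "sqlite:///./openmemory.db"  # default quoted in the text


def resolve_database_url(env=None):
    """Pick the database URL, falling back to the SQLite default."""
    env = os.environ if env is None else env
    # DATABASE_URL is an assumption for this sketch, not a confirmed setting.
    return env.get("DATABASE_URL") or DEFAULT_DATABASE_URL


def backend_name(url):
    """Return the dialect part of a SQLAlchemy-style URL, e.g. 'postgresql'."""
    # URLs may be written as dialect+driver://..., so drop the driver suffix.
    return url.partition("://")[0].partition("+")[0]


url = resolve_database_url({"DATABASE_URL": "postgresql+psycopg2://user:pw@db:5432/mem"})
print(backend_name(url))                         # → postgresql
print(backend_name(resolve_database_url({})))    # → sqlite
```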
Is there feature parity between the hosted and self-hosted versions?
The core memory operations—adding, retrieving, searching, and deleting memories—are identical in both environments because they share the same SDK and endpoint specifications. However, enterprise features such as advanced access-control lists (ACLs), usage-based throttling, managed vector-store plugins, and built-in analytics dashboards are typically available only on the hosted platform or require manual implementation in self-hosted deployments.
Do I need an API key for the self-hosted version?
Not necessarily. The OpenMemory server can run without authentication, though you may configure a simple token for basic security. The SDK requires an api_key parameter to satisfy the Authorization: Token <api_key> header format, but you can pass any dummy value (e.g., "local-dev-key") when connecting to a local instance without validation enabled.