Which LightningStore Backend to Choose: MongoDB, SQLite, or In-Memory
Choose MongoDB for production multi-node training requiring durable persistence, select In-Memory for fast unit tests and prototyping, and avoid SQLite as it remains unimplemented with only a TODO placeholder in the current codebase.
LightningStore serves as the persistence layer in the microsoft/agent-lightning repository, synchronizing rollouts, attempts, spans, and resources across distributed trainers and runners. The framework provides three backend options—MongoDB, SQLite, and In-Memory—but only two are currently functional according to the source code in agentlightning/store/.
LightningStore Backend Overview
The LightningStore abstract base class in agentlightning/store/base.py defines the core CRUD API, including methods like enqueue_rollout(), dequeue_rollout(), and add_resources(). Concrete backends inherit from this base and build on the CollectionBasedLightningStore helper from agentlightning/store/collection_based.py, which handles collection-centric logic for rollouts, attempts, and workers.
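To make the layering concrete, here is a minimal sketch of the abstract-base-plus-concrete-backend pattern. ToyStore and ToyMemoryStore are hypothetical names invented for this illustration. The method names mirror enqueue_rollout() and dequeue_rollout() from the text, but the signatures and return types are simplified assumptions, not the library's actual API.

```python
import asyncio
import itertools
from abc import ABC, abstractmethod
from collections import deque
from typing import Any, Deque, Dict, Optional


class ToyStore(ABC):
    """Hypothetical stand-in for an abstract store interface."""

    @abstractmethod
    async def enqueue_rollout(self, input: Any) -> str:
        """Queue a rollout and return its id."""

    @abstractmethod
    async def dequeue_rollout(self) -> Optional[Dict[str, Any]]:
        """Return the oldest queued rollout, or None if the queue is empty."""


class ToyMemoryStore(ToyStore):
    """Deque-backed backend; data lives only in this process."""

    def __init__(self) -> None:
        self._queue: Deque[Dict[str, Any]] = deque()
        self._ids = itertools.count(1)

    async def enqueue_rollout(self, input: Any) -> str:
        rollout_id = f"rollout-{next(self._ids)}"
        self._queue.append({"rollout_id": rollout_id, "input": input})
        return rollout_id

    async def dequeue_rollout(self) -> Optional[Dict[str, Any]]:
        return self._queue.popleft() if self._queue else None


async def demo() -> list:
    store: ToyStore = ToyMemoryStore()  # callers depend only on the interface
    first_id = await store.enqueue_rollout({"prompt": "Hello"})
    claimed = await store.dequeue_rollout()
    empty = await store.dequeue_rollout()  # queue is drained by now
    return [first_id, claimed["rollout_id"], empty]


result = asyncio.run(demo())
print(result)
```

Because callers are typed against the abstract class, swapping one backend for another changes only the constructor call.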
The three available backends differ in persistence guarantees and concurrency support:
- MongoDB: Durable disk storage with full thread and async safety, designed for production clusters.
- SQLite: Local file persistence intended for lightweight single-process use, but currently only a stub.
- In-Memory: Volatile Python dictionary storage with configurable thread safety, ideal for testing.
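The comparison above can be condensed into a small decision helper. This is only a sketch of the article's recommendation: Workload and choose_backend are hypothetical names that do not exist in agent-lightning, and the helper returns a backend name rather than constructing a store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a deployment, for illustration only."""
    multi_process: bool      # trainers and runners in separate processes or machines
    needs_durability: bool   # data must survive a process restart


def choose_backend(workload: Workload) -> str:
    """Return the backend name suggested by the comparison above."""
    if workload.multi_process or workload.needs_durability:
        # SQLite is a stub, so MongoDB is the only durable option listed.
        return "MongoDB"
    # Single process with volatile data: tests, notebooks, prototypes.
    return "In-Memory"


print(choose_backend(Workload(multi_process=True, needs_durability=True)))
print(choose_backend(Workload(multi_process=False, needs_durability=False)))
```

Note that a single-process script that needs durability still lands on MongoDB, since the SQLite slot is not usable yet.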
MongoDB Backend for Production Workloads
The MongoLightningStore class in agentlightning/store/mongo.py provides the only production-ready persistent backend. It connects via AsyncMongoClient to a MongoDB instance or replica set, storing data in collections that survive process restarts and remain accessible across multiple machines.
When to Choose MongoDB
Select this backend when your workload requires:
- Multi-node training coordination across distributed processes.
- Strong consistency guarantees for concurrent readers and writers.
- Built-in indexing and query capabilities for filtering rollouts by status or pagination.
- Data durability that persists beyond Python process lifetimes.
Configuration Example
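import asyncio
from agentlightning.store import MongoLightningStore

async def main() -> None:
    store = MongoLightningStore(
        mongo_uri="mongodb://my-mongo-host:27017/?replicaSet=rs0",
        database_name="my_agentlightning_db",
        partition_id="my-partition",
    )
    # Queue one rollout, then claim it the way a runner would.
    await store.enqueue_rollout(input={"prompt": "Hello"}, mode="train")
    attempt = await store.dequeue_rollout()
    if attempt is not None:  # nothing is returned when the queue is empty
        print(attempt.rollout_id)

asyncio.run(main())  # store methods are coroutines, so they need an event loop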
In-Memory Backend for Testing and Prototyping
The InMemoryLightningStore class in agentlightning/store/memory.py stores all data in Python dictionaries, providing maximum speed for unit tests and interactive development. Data exists only for the duration of the process, and the store offers configurable memory thresholds to prevent excessive RAM consumption.
When to Choose In-Memory
This backend suits:
- Fast unit tests that require a fresh store for each test case.
- Prototyping in Jupyter notebooks where persistence between sessions is unnecessary.
- Workloads that need explicit control over memory usage through the eviction_memory_threshold and safe_memory_threshold parameters.
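The article does not spell out how the two threshold parameters interact, so the following is one plausible reading, not the behavior of memory.py: once usage rises above an eviction threshold, the oldest entries are dropped until usage falls back to a lower safe threshold. ToyEvictingStore is a hypothetical class, and it measures thresholds in payload bytes purely as an assumption for the sketch.

```python
from collections import OrderedDict


class ToyEvictingStore:
    """Hypothetical two-threshold eviction; not the library's implementation."""

    def __init__(self, eviction_threshold: int, safe_threshold: int) -> None:
        if safe_threshold > eviction_threshold:
            raise ValueError("safe_threshold must not exceed eviction_threshold")
        self.eviction_threshold = eviction_threshold  # evict once usage exceeds this
        self.safe_threshold = safe_threshold          # stop once usage is at or below this
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def put(self, key: str, payload: bytes) -> None:
        if key in self._items:
            # Replacing an entry: release its old size first.
            self._used -= len(self._items.pop(key))
        self._items[key] = payload
        self._used += len(payload)
        if self._used > self.eviction_threshold:
            self._evict()

    def _evict(self) -> None:
        # Drop oldest entries first until usage falls back to the safe level.
        while self._items and self._used > self.safe_threshold:
            _, old = self._items.popitem(last=False)
            self._used -= len(old)

    def keys(self) -> list:
        return list(self._items)


store = ToyEvictingStore(eviction_threshold=30, safe_threshold=10)
for i in range(4):
    store.put(f"span-{i}", b"x" * 10)
print(store.keys())
```

Keeping a gap between the two thresholds means a single eviction pass frees a batch of entries, instead of evicting one entry on every write near the limit.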
Thread Safety Configuration
The In-Memory store can be made safe to share across threads via the thread_safe parameter. Unlike MongoDB, it cannot share state across processes.
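import asyncio
from agentlightning.store import InMemoryLightningStore
from agentlightning.types import LLM  # import path assumed; the original snippet omitted it

async def main() -> None:
    store = InMemoryLightningStore(thread_safe=True)
    # "my-model" is a placeholder model name, added on the assumption that LLM requires one.
    await store.add_resources({"llm": LLM(resource_type="llm", endpoint="http://localhost:8000", model="my-model")})
    rollout = await store.start_rollout(input={"task": "demo"})
    print(rollout.rollout_id)

asyncio.run(main())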
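To show what a thread_safe flag of this kind buys you, here is a standard-library sketch of a dictionary guarded by an optional lock. ToyCounterStore and hammer are hypothetical names, and the code is synchronous for brevity. It illustrates the general locking idea only and makes no claim about how InMemoryLightningStore implements it.

```python
import threading
from contextlib import nullcontext


class ToyCounterStore:
    """Hypothetical store that optionally guards its dict with a lock."""

    def __init__(self, thread_safe: bool = False) -> None:
        self._data = {"rollouts": 0}
        # nullcontext() is a no-op stand-in when locking is disabled.
        self._guard = threading.Lock() if thread_safe else nullcontext()

    def record_rollout(self) -> None:
        with self._guard:
            # Read-modify-write: the step that needs protection.
            current = self._data["rollouts"]
            self._data["rollouts"] = current + 1

    def count(self) -> int:
        with self._guard:
            return self._data["rollouts"]


def hammer(store: ToyCounterStore, workers: int = 8, per_worker: int = 5_000) -> int:
    """Increment the counter from several threads and return the final value."""
    def work() -> None:
        for _ in range(per_worker):
            store.record_rollout()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.count()


total = hammer(ToyCounterStore(thread_safe=True))
print(total)
```

With the lock enabled, every increment is applied, so the total equals workers times per_worker. A lock of this kind lives inside one process, which is why it cannot help separate processes coordinate.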
SQLite Backend Current Limitations
The SQLiteLightningStore class in agentlightning/store/sqlite.py exists only as a stub containing a TODO comment. Attempting to instantiate this backend raises NotImplementedError or similar errors until the implementation is completed. While SQLite would theoretically suit single-process scripts requiring lightweight disk persistence without a MongoDB server, the current repository offers no functional SQLite support.
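If your code selects a backend by name, a defensive lookup avoids crashing on a backend that is not there. The sketch below uses only importlib. load_backend and first_available are hypothetical helpers, and standard-library names stand in for real store classes so the example runs without agent-lightning installed.

```python
import importlib
from typing import Optional


def load_backend(module_name: str, class_name: str) -> Optional[type]:
    """Return the named class, or None if the module or class is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    candidate = getattr(module, class_name, None)
    return candidate if isinstance(candidate, type) else None


def first_available(candidates: list) -> Optional[type]:
    """Pick the first (module, class) pair that resolves."""
    for module_name, class_name in candidates:
        backend = load_backend(module_name, class_name)
        if backend is not None:
            return backend
    return None


# Standard-library names stand in for real backends so the sketch runs anywhere.
chosen = first_available([
    ("no_such_store_module", "SQLiteLightningStore"),  # missing module
    ("collections", "NoSuchClass"),                     # missing class
    ("collections", "OrderedDict"),                     # resolves
])
print(chosen.__name__)
```

This check only detects a missing module or class. A stub class that exists but raises on construction would still need a try/except around the constructor call.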
Summary
- MongoDB is the only production-grade backend currently implemented, in agentlightning/store/mongo.py, supporting distributed training and durable persistence.
- In-Memory provides fast, volatile storage for testing and notebooks via agentlightning/store/memory.py, with optional thread safety.
- SQLite remains unimplemented as of the current codebase, containing only a placeholder file at agentlightning/store/sqlite.py.
- All backends inherit from the abstract LightningStore class and share collection-based logic through CollectionBasedLightningStore.
Frequently Asked Questions
Is the SQLite backend functional in Agent-Lightning?
No. The file agentlightning/store/sqlite.py contains only a TODO placeholder comment and no implementation. Any attempt to import or instantiate SQLiteLightningStore will fail, either with NotImplementedError or because the class is not fully defined.
Can the In-Memory backend handle concurrent access?
Yes, but only within a single process. Setting thread_safe=True when constructing InMemoryLightningStore enables locking that makes the store safe to share across threads. Unlike MongoDB, however, it cannot share state between separate processes or machines.
How do I configure MongoDB for a distributed training cluster?
Pass a MongoDB connection URI specifying your replica set or cluster nodes to the mongo_uri parameter of MongoLightningStore. For example: mongodb://host1:27017,host2:27017/?replicaSet=rs0. The backend handles async connection pooling via AsyncMongoClient automatically.
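A multi-host URI of that shape can be assembled from a host list. The helper below is a hypothetical convenience, not part of agent-lightning. It relies only on the standard mongodb:// connection string layout of comma-separated seeds followed by a replicaSet query option, and it assumes plain hostnames or IPv4 addresses, since IPv6 literals would need bracket handling.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts: list, replica_set: str, default_port: int = 27017) -> str:
    """Join host entries into one mongodb:// URI with a replicaSet option."""
    if not hosts:
        raise ValueError("at least one host is required")
    # Append the default port only where the caller did not give one.
    seeds = [h if ":" in h else f"{h}:{default_port}" for h in hosts]
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{','.join(seeds)}/?{query}"


uri = build_mongo_uri(["host1", "host2:27018"], replica_set="rs0")
print(uri)  # mongodb://host1:27017,host2:27018/?replicaSet=rs0
```

The resulting string is what you would pass as mongo_uri in the configuration example shown earlier.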
What data persists when using the In-Memory LightningStore?
Nothing persists beyond the Python process lifetime. All rollouts, attempts, and resources stored in an InMemoryLightningStore instance are held in Python dictionaries and are lost when the process exits, making this backend unsuitable for production training that must resume after restarts.