How Open Notebook Handles Async Podcast Generation with SurrealDB Job Queues

Open Notebook uses the surreal-commands library to queue podcast generation tasks in SurrealDB, returning a job ID immediately while background workers process the LLM and TTS operations asynchronously.

The lfnovo/open-notebook repository implements an async podcast generation pipeline that moves heavy text-to-speech and language model operations out of the API request path. With SurrealDB as the persistence layer and the surreal-commands library for task orchestration, HTTP handlers return without waiting for audio synthesis, which runs in background workers.

Job Submission Flow

When a client initiates podcast creation, the system immediately persists the request as a command record rather than processing it synchronously.

The REST endpoint POST /podcasts/generate receives the request and delegates to PodcastService.submit_generation_job in api/podcast_service.py (lines 45-99). This method validates the episode profile and speaker profile, resolves the source content from either a notebook or direct text input, and then registers a new command via submit_command from the surreal-commands library.

The SurrealDB record ID returned by submit_command becomes the job ID (e.g., command:001abcdef) that clients poll for status updates. Because the endpoint only validates input and writes a command record, the HTTP response returns quickly even when the underlying podcast generation takes minutes.

POST /podcasts/generate
Content-Type: application/json

{
  "episode_profile": "TechTalk",
  "speaker_profile": "DefaultSpeaker",
  "episode_name": "AI Trends 2024",
  "notebook_id": "notebook:12345"
}

The immediate response includes the job identifier:

{
  "job_id": "command:001abcdef",
  "status": "submitted",
  "message": "Podcast generation started for episode 'AI Trends 2024'",
  "episode_profile": "TechTalk",
  "episode_name": "AI Trends 2024"
}
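
The submit-and-return pattern described above can be illustrated with a minimal, self-contained sketch. It does not call surreal-commands or SurrealDB: InMemoryCommandStore is a hypothetical stand-in for the command table, and the validation rules and argument names are assumptions chosen to mirror the request and response shown above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class InMemoryCommandStore:
    """Hypothetical stand-in for the SurrealDB command table."""

    records: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def submit(self, app: str, name: str, args: Dict[str, Any]) -> str:
        # Persist the request only; no generation work happens here.
        job_id = f"command:{uuid.uuid4().hex[:9]}"
        self.records[job_id] = {
            "app": app,
            "name": name,
            "args": args,
            "status": "submitted",
        }
        return job_id


def submit_generation_job(
    store: InMemoryCommandStore,
    episode_profile: str,
    speaker_profile: str,
    episode_name: str,
    notebook_id: Optional[str] = None,
    content: Optional[str] = None,
) -> Dict[str, Any]:
    # Assumed validation: both profiles are required, plus one content source.
    if not episode_profile or not speaker_profile:
        raise ValueError("episode_profile and speaker_profile are required")
    if notebook_id is None and content is None:
        raise ValueError("provide either notebook_id or content")

    args = {
        "episode_profile": episode_profile,
        "speaker_profile": speaker_profile,
        "episode_name": episode_name,
        "notebook_id": notebook_id,
        "content": content,
    }
    job_id = store.submit("open_notebook", "generate_podcast", args)

    # Return immediately; a worker picks the record up later.
    return {
        "job_id": job_id,
        "status": "submitted",
        "message": f"Podcast generation started for episode '{episode_name}'",
        "episode_profile": episode_profile,
        "episode_name": episode_name,
    }
```

The handler does no LLM or TTS work itself, which is why its latency is bounded by validation plus one write.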

Background Command Execution

The actual audio synthesis work is performed by a decorated function that the surreal-commands worker picks up from the queue.

In commands/podcast_commands.py (lines 69-85), the function generate_podcast_command is decorated with @command("generate_podcast", app="open_notebook"). When the job is dequeued, this command receives a PodcastGenerationInput model containing the validated parameters. The function loads the relevant episode and speaker profiles, resolves all language-model configurations, and creates a UUID-based output directory for the generated assets.

Finally, the command invokes the third-party podcast-creator library to synthesize the audio, transcript, and outline. Because this runs inside the surreal-commands worker process, the main Open Notebook API remains available to handle other requests during generation.
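
To make the worker side concrete, here is a rough sketch of a decorator-based registry and a dequeue loop. This is not how surreal-commands is implemented: the command decorator, REGISTRY, and run_worker below are hypothetical stand-ins, the handler takes a plain dict instead of PodcastGenerationInput, and the call to podcast-creator is replaced by writing a placeholder transcript file.

```python
import tempfile
import uuid
from collections import deque
from pathlib import Path
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry mapping (app, command name) to a handler function.
REGISTRY: Dict[Tuple[str, str], Callable[[Dict[str, Any]], Dict[str, Any]]] = {}


def command(name: str, app: str):
    """Stand-in decorator that registers a function under (app, name)."""

    def decorator(func):
        REGISTRY[(app, name)] = func
        return func

    return decorator


@command("generate_podcast", app="open_notebook")
def generate_podcast_command(args: Dict[str, Any]) -> Dict[str, Any]:
    # Create a UUID-based output directory for this run's assets.
    base_dir = Path(args.get("base_dir", tempfile.gettempdir()))
    output_dir = base_dir / str(uuid.uuid4())
    output_dir.mkdir(parents=True)

    # Placeholder for the real synthesis step (podcast-creator in the article).
    transcript = output_dir / "transcript.txt"
    transcript.write_text(f"Transcript for {args['episode_name']}")
    return {"output_dir": str(output_dir)}


def run_worker(queue: deque, jobs: Dict[str, Dict[str, Any]]) -> None:
    """Drain the queue, recording status transitions on each job record."""
    while queue:
        job_id = queue.popleft()
        job = jobs[job_id]
        job["status"] = "running"
        try:
            handler = REGISTRY[(job["app"], job["name"])]
            job["result"] = handler(job["args"])
            job["status"] = "completed"
        except Exception as exc:
            # Any failure, including an unknown command, marks the job failed.
            job["error_message"] = str(exc)
            job["status"] = "failed"
```

Because the loop runs in its own process in the real system, a slow handler delays other queued jobs but not API requests.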

Job Status Tracking and Polling

Clients monitor long-running operations through a dedicated status endpoint that queries the SurrealDB command record.

The REST endpoint GET /podcasts/jobs/{job_id} handles polling requests by calling PodcastService.get_job_status in api/podcast_service.py (lines 15-33). This service method invokes get_command_status from surreal-commands to retrieve the current state from SurrealDB, which tracks statuses including pending, running, completed, and failed.

GET /podcasts/jobs/command:001abcdef

A typical running response includes progress metadata:

{
  "job_id": "command:001abcdef",
  "status": "running",
  "result": null,
  "error_message": null,
  "created": "2026-06-05T12:34:56Z",
  "updated": "2026-06-05T12:35:10Z",
  "progress": 0.45
}

When the status transitions to completed, the result field contains the generated episode_id, which clients can then fetch via GET /podcasts/episodes/{episode_id}.
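
On the client side, polling might look like the sketch below. The fetch_status callable is an assumption: it stands for whatever HTTP client issues GET /podcasts/jobs/{job_id} and returns the parsed JSON body. The interval and timeout defaults are illustrative, not values taken from Open Notebook.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(job_id)
        if status["status"] in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)
```

Passing sleep as a parameter keeps the loop testable without real delays. A caller would inspect the returned dictionary's status and then read result or error_message accordingly.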

Result Persistence and Retrieval

After successful generation, the system links the output files to the original job record for auditability and retry logic.

The generate_podcast_command function creates a PodcastEpisode record in SurrealDB, storing the audio file path, transcript, and outline. It uses ensure_record_id to link this episode to the original command ID, enabling the system to retrieve the episode alongside its job status in list endpoints.

If a job fails, the POST /podcasts/episodes/{episode_id}/retry endpoint deletes the broken episode record, removes partial audio files, and submits a fresh job using the same profiles and content, so the retry goes through the same async queue as the original request.
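
The retry sequence (delete the record, clean up files, resubmit) can be sketched as follows. Everything here is illustrative: the episodes dictionary stands in for the PodcastEpisode table, the audio_file and args keys are assumed field names, and submit is any callable that enqueues a job and returns its ID.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def retry_episode(
    episodes: Dict[str, Dict[str, Any]],
    episode_id: str,
    submit: Callable[[Dict[str, Any]], str],
) -> str:
    """Remove a failed episode and enqueue a fresh job with the same inputs."""
    # Delete the broken episode record (raises KeyError if it does not exist).
    episode = episodes.pop(episode_id)

    # Remove any partial audio left behind by the failed run.
    audio_file = episode.get("audio_file")
    if audio_file:
        Path(audio_file).unlink(missing_ok=True)

    # Resubmit with the original parameters; the new job gets a new ID.
    return submit(episode["args"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        partial = Path(tmp) / "partial.mp3"
        partial.write_bytes(b"\x00")
        table = {
            "episode:1": {
                "audio_file": str(partial),
                "args": {"episode_profile": "TechTalk", "episode_name": "AI Trends 2024"},
            }
        }
        new_job = retry_episode(table, "episode:1", lambda args: "command:retry001")
        print(new_job, partial.exists(), table)
```

Cleaning up before resubmitting means a second failure cannot leave two sets of partial files for the same episode.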

Summary

  • Immediate Response: The POST /podcasts/generate endpoint uses PodcastService.submit_generation_job to create a SurrealDB command record instantly, returning a job_id while processing continues in the background.
  • Worker Processing: The @command("generate_podcast") decorator in commands/podcast_commands.py marks the function that surreal-commands workers execute, handling LLM calls and TTS via the podcast-creator library.
  • Status Monitoring: Clients poll GET /podcasts/jobs/{job_id}, which queries SurrealDB through get_command_status and returns the current state (such as running or completed) with optional progress indicators.
  • Data Integrity: The system persists results as PodcastEpisode records linked to their originating command IDs, supporting retrieval and retry operations without blocking the main API thread.

Frequently Asked Questions

How does Open Notebook queue podcast generation jobs?

When the POST /podcasts/generate endpoint receives a request, it calls PodcastService.submit_generation_job in api/podcast_service.py. This method validates the input and invokes submit_command from the surreal-commands library, which creates a command record in SurrealDB. The resulting record ID serves as the job identifier for asynchronous tracking.

What is the role of surreal-commands in the podcast pipeline?

The surreal-commands library provides the infrastructure for async job queue management in Open Notebook. It offers submit_command for enqueueing work, get_command_status for polling, and the @command decorator that registers Python functions as executable units for background workers. This allows the podcast generation logic to run independently of the FastAPI request handlers.

How can I check the status of a running podcast generation job?

Query the GET /podcasts/jobs/{job_id} endpoint, which internally calls PodcastService.get_job_status. This method uses get_command_status to fetch the current state from SurrealDB, returning JSON with fields like status, progress, created, and updated timestamps. When status reaches completed, the result field contains the episode ID.

What happens when a podcast generation job fails?

Failed jobs retain their failed status in SurrealDB with error details in the error_message field. The system supports recovery through the POST /podcasts/episodes/{episode_id}/retry endpoint, which removes the orphaned episode record, deletes partial audio files, and submits a new command using the original parameters. The retry attempt receives a fresh job ID.
