How to Configure Episode and Speaker Profiles for Multi-Speaker Podcasts in Open Notebook
Open Notebook stores reusable podcast templates in EpisodeProfile and voice definitions in SpeakerProfile, resolving both at generation time through open_notebook/podcasts/models.py to automate multi-speaker synthesis.
Configuring episode and speaker profiles for multi-speaker podcasts in the lfnovo/open-notebook repository lets you build reusable templates instead of hard-coding details into every generation request. The system relies on two Pydantic-based data models defined in open_notebook/podcasts/models.py that enforce validation rules and resolve model references at runtime.
Overview: Configure Episode and Speaker Profiles for Multi-Speaker Podcasts
EpisodeProfile
The EpisodeProfile class represents a reusable podcast template. It stores a human-readable name, an optional description, the target number of segments, and references to both an associated SpeakerProfile (speaker_config) and the LLM records used for outline and transcript generation (outline_llm and transcript_llm). Validation in open_notebook/podcasts/models.py enforces that num_segments must be between 3 and 20 (source lines 82–86).
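The segment-count rule can also be checked on the client before a request is sent. The sketch below is a minimal, standalone pre-check that mirrors the 3 to 20 range described above. The function name check_num_segments and its error messages are hypothetical and are not taken from open_notebook/podcasts/models.py; the server-side validator remains authoritative.

```python
MIN_SEGMENTS = 3   # lower bound described for EpisodeProfile.num_segments
MAX_SEGMENTS = 20  # upper bound described for EpisodeProfile.num_segments


def check_num_segments(num_segments: int) -> int:
    """Return num_segments unchanged if it lies in 3..20, else raise ValueError.

    Hypothetical client-side helper, not part of Open Notebook.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(num_segments, bool) or not isinstance(num_segments, int):
        raise ValueError("num_segments must be an integer")
    if not MIN_SEGMENTS <= num_segments <= MAX_SEGMENTS:
        raise ValueError(
            f"num_segments must be between {MIN_SEGMENTS} and {MAX_SEGMENTS}, "
            f"got {num_segments}"
        )
    return num_segments
```

Calling check_num_segments(5) returns 5, while check_num_segments(2) raises ValueError before any HTTP request is made.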
SpeakerProfile
The SpeakerProfile class describes the voices used inside an episode. Each profile can hold 1 to 4 speakers, and every speaker dictionary must contain the mandatory keys name, voice_id, backstory, and personality. The model also supports an optional shared TTS model via voice_model, plus legacy fields (tts_provider, tts_model) for backward compatibility. A dedicated validator in the source file enforces both the speaker count and required fields (source lines 60–68).
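The speaker rules can be illustrated the same way. The sketch below collects every problem in a speakers list instead of stopping at the first one, which makes it easier to fix a payload in a single pass. The helper validate_speakers and its message wording are assumptions for illustration; they do not reproduce the validator in open_notebook/podcasts/models.py.

```python
REQUIRED_SPEAKER_KEYS = ("name", "voice_id", "backstory", "personality")
MIN_SPEAKERS, MAX_SPEAKERS = 1, 4  # range described for SpeakerProfile


def validate_speakers(speakers: list) -> list:
    """Return a list of problems; an empty list means the speakers look valid.

    Hypothetical client-side helper, not part of Open Notebook.
    """
    problems = []
    if not MIN_SPEAKERS <= len(speakers) <= MAX_SPEAKERS:
        problems.append(f"expected 1 to 4 speakers, got {len(speakers)}")
    for index, speaker in enumerate(speakers):
        # Report every missing mandatory key for this speaker at once.
        missing = [key for key in REQUIRED_SPEAKER_KEYS if key not in speaker]
        if missing:
            problems.append(f"speaker {index} is missing: {', '.join(missing)}")
    return problems
```

For example, validate_speakers([{"name": "Host", "voice_id": "voice_01"}]) returns ["speaker 0 is missing: backstory, personality"].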
Create a Multi-Speaker Voice Profile
Use the speaker profile REST API defined in api/routers/speaker_profiles.py to register a group of voices.
import requests

# Speaker profile with four voices (the maximum a SpeakerProfile allows).
profile = {
    "name": "TechTalk",
    "description": "Four-person tech round-table",
    # Placeholder ID: substitute the ID of a text-to-speech model registered in your instance.
    "voice_model": "model://tts/openai/tts-1",
    "speakers": [
        {
            "name": "Host",
            "voice_id": "voice_01",
            "backstory": "Seasoned tech journalist",
            "personality": "Curious, friendly",
        },
        {
            "name": "Engineer",
            "voice_id": "voice_02",
            "backstory": "Senior software engineer",
            "personality": "Analytical, dry",
        },
        {
            "name": "Designer",
            "voice_id": "voice_03",
            "backstory": "UX designer with a flair for storytelling",
            "personality": "Creative, enthusiastic",
        },
        {
            "name": "Investor",
            "voice_id": "voice_04",
            "backstory": "Venture capitalist",
            "personality": "Strategic, confident",
        },
    ],
}

resp = requests.post("http://localhost:5055/speaker-profiles", json=profile)
resp.raise_for_status()  # surface validation or server errors instead of printing them
print(resp.json())
Each entry in the speakers list requires name, voice_id, backstory, and personality exactly as validated in open_notebook/podcasts/models.py.
Create a Reusable Episode Template
The episode profile endpoints are defined in api/routers/episode_profiles.py. Link your speaker profile by its name in speaker_config, and declare the LLM model IDs for the outline and transcript stages.
import requests

# Episode template that references the "TechTalk" speaker profile by name.
episode = {
    "name": "AI Futures",
    "description": "Exploring upcoming AI trends",
    "speaker_config": "TechTalk",
    "outline_llm": "model://llm/openai/gpt-4o",
    "transcript_llm": "model://llm/openai/gpt-4o-mini",
    "language": "en-US",
    "default_briefing": "Discuss the impact of AI on society in 5 minutes.",
    "num_segments": 5,  # must be between 3 and 20
}

resp = requests.post("http://localhost:5055/episode-profiles", json=episode)
resp.raise_for_status()  # surface validation or server errors instead of printing them
print(resp.json())
The num_segments value must fit within the 3–20 range enforced by the model validator.
Trigger Podcast Generation from Configured Profiles
Once your profiles are stored, call the generation endpoint in api/routers/podcasts.py to start multi-speaker synthesis.
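import requests

# Reference the stored episode profile by name; it carries the speaker and LLM settings.
payload = {
    "episode_name": "AI Futures - Episode 1",
    "episode_profile": "AI Futures",
}

resp = requests.post("http://localhost:5055/podcasts/generate", json=payload)
resp.raise_for_status()  # surface validation or server errors instead of printing them
print(resp.json())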
Behind the scenes, the backend performs three resolution steps before queuing the job:
- Profile lookup: EpisodeProfile.get_by_name fetches the template, and SpeakerProfile.get_by_name loads the linked voice configuration.
- LLM resolution: resolve_outline_config and resolve_transcript_config on the episode profile convert stored model IDs into concrete provider-model-config tuples through _resolve_model_config.
- TTS resolution: SpeakerProfile.resolve_tts_config returns the correct TTS provider, model, and credentials for each speaker, including any per-speaker overrides.
commands/podcast_commands.py then orchestrates the workflow by building safe output directories, loading resolved profiles, and creating PodcastEpisode records before handing control to the podcast-creator library.
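The resolution steps can be pictured as a lookup from a stored model ID to a concrete configuration, with a per-speaker override taking precedence over the profile-wide voice_model. The sketch below uses an in-memory dictionary in place of the database. MODEL_REGISTRY, ResolvedModel, resolve_model, resolve_speaker_tts, the record IDs, and the per-speaker voice_model key are all hypothetical stand-ins; they do not reproduce the signatures or behavior of _resolve_model_config or SpeakerProfile.resolve_tts_config.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ResolvedModel:
    """Concrete provider-model-config tuple produced by resolution."""
    provider: str
    model: str
    config: dict = field(default_factory=dict)


# Hypothetical registry standing in for stored model records.
MODEL_REGISTRY = {
    "model:tts_default": ResolvedModel("example-provider", "example-tts-a"),
    "model:tts_alt": ResolvedModel("other-provider", "example-tts-b"),
}


def resolve_model(model_id: str) -> ResolvedModel:
    """Turn a stored model ID into a concrete configuration."""
    try:
        return MODEL_REGISTRY[model_id]
    except KeyError:
        raise ValueError(f"unknown model ID: {model_id}") from None


def resolve_speaker_tts(profile: dict) -> dict:
    """Map each speaker name to its resolved TTS model.

    A speaker-level "voice_model" (assumed key) overrides the profile-level one.
    """
    resolved = {}
    for speaker in profile["speakers"]:
        model_id = speaker.get("voice_model") or profile.get("voice_model")
        if model_id is None:
            raise ValueError(f"no TTS model configured for {speaker['name']!r}")
        resolved[speaker["name"]] = resolve_model(model_id)
    return resolved
```

Failing fast on an unknown ID, as resolve_model does here, keeps a misconfigured profile from reaching the synthesis stage.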
Migrate from Legacy String-Based Configs
If you are upgrading an existing installation, use open_notebook/podcasts/migration.py to convert old string-based model and provider references into the current record-ID schema. The migration script ensures that historical tts_provider and tts_model strings are safely remapped so that legacy speaker profiles remain compatible with the new resolution pipeline (source line 66).
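One way to picture the conversion is a lookup from each legacy (tts_provider, tts_model) pair to a model record ID. The sketch below is a standalone illustration only: the LEGACY_TO_RECORD_ID table, the migrate_speaker_profile function, and the record ID are hypothetical placeholders, and the code does not reproduce the logic in open_notebook/podcasts/migration.py.

```python
# Hypothetical mapping from legacy string pairs to model record IDs.
LEGACY_TO_RECORD_ID = {
    ("example-provider", "example-tts"): "model:example_tts",
}


def migrate_speaker_profile(profile: dict) -> dict:
    """Return a copy of profile with voice_model filled in from legacy fields.

    Profiles that already have voice_model, or whose legacy pair is unknown,
    are returned unchanged. Legacy fields are kept for backward compatibility.
    """
    migrated = dict(profile)  # shallow copy so the caller's dict is untouched
    if migrated.get("voice_model"):
        return migrated  # already uses the record-ID schema
    key = (migrated.get("tts_provider"), migrated.get("tts_model"))
    record_id = LEGACY_TO_RECORD_ID.get(key)
    if record_id is not None:
        migrated["voice_model"] = record_id
    return migrated
```

Because the function returns a copy and never deletes tts_provider or tts_model, running it more than once yields the same result.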
Summary
- EpisodeProfile stores reusable episode templates with num_segments constrained between 3 and 20.
- SpeakerProfile defines 1 to 4 speakers, each requiring name, voice_id, backstory, and personality.
- The backend resolves abstract model IDs into concrete configs via resolve_outline_config, resolve_transcript_config, and resolve_tts_config inside open_notebook/podcasts/models.py.
- REST endpoints in api/routers/speaker_profiles.py, api/routers/episode_profiles.py, and api/routers/podcasts.py expose the full CRUD and generation lifecycle.
- commands/podcast_commands.py orchestrates the final synthesis, while open_notebook/podcasts/migration.py handles legacy upgrades.
Frequently Asked Questions
How many speakers can a single SpeakerProfile contain?
A SpeakerProfile can define 1 to 4 speakers. The Pydantic validator in open_notebook/podcasts/models.py (source lines 60–68) rejects any profile that falls outside this range or omits the mandatory fields name, voice_id, backstory, or personality.
What validation rules apply to the number of segments in an episode?
The num_segments field on EpisodeProfile must be between 3 and 20 inclusive. This constraint is enforced directly in open_notebook/podcasts/models.py (source lines 82–86) during model validation.
How does Open Notebook resolve stored model IDs into usable configurations at runtime?
When a generation request arrives, the backend calls resolve_outline_config and resolve_transcript_config on the episode profile, which internally use _resolve_model_config to transform model record IDs into provider-model-config tuples. For voices, SpeakerProfile.resolve_tts_config performs the same resolution for each speaker’s TTS settings.
What happens to legacy string-based TTS configurations after an upgrade?
The migration script in open_notebook/podcasts/migration.py converts legacy string-based tts_provider and tts_model references into the modern record-ID schema. This ensures existing speaker profiles remain functional without manual intervention (source line 66).