How speaker_profiles.py Enables Multi-Speaker Podcast Generation in Open Notebook
The speaker_profiles.py router exposes REST endpoints for managing SpeakerProfile records. Together with the SpeakerProfile model, it validates speaker configurations, serializes TTS model references, and resolves voice settings so that each speaker gets a distinct voice in the podcast synthesis pipeline.
Open Notebook transforms text content into AI-generated audio narratives. The api/routers/speaker_profiles.py module serves as the primary gateway for defining multi-speaker scenarios, working in tandem with the underlying SpeakerProfile data model to orchestrate voices, personalities, and text-to-speech (TTS) provider configurations that power dynamic, multi-character podcasts.
What Is speaker_profiles.py?
speaker_profiles.py is a FastAPI router located at api/routers/speaker_profiles.py that provides RESTful CRUD operations for speaker configuration records. While the router handles HTTP requests and responses, it relies on the SpeakerProfile class defined in open_notebook/podcasts/models.py (lines 26-34) to enforce data integrity and business logic. This separation allows the API layer to remain thin while the model layer handles complex validation, SurrealDB serialization, and TTS resolution.
When you create a new profile through the router, the endpoint instantiates a SpeakerProfile object, which validates that every speaker entry contains the mandatory fields: name, voice_id, backstory, and personality (validation logic at lines 58-68).
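The required-field check described above can be sketched in plain Python. This is a minimal illustration, not the project's actual validator: the function name validate_speakers and the error wording are assumptions, and only the four field names come from the article.

```python
# Field names taken from the article; everything else here is illustrative.
REQUIRED_SPEAKER_FIELDS = ("name", "voice_id", "backstory", "personality")


def validate_speakers(speakers: list[dict]) -> list[dict]:
    """Raise ValueError if any speaker lacks a required, non-empty field."""
    for index, speaker in enumerate(speakers):
        for field in REQUIRED_SPEAKER_FIELDS:
            # Treat a missing key and an empty value the same way.
            if not speaker.get(field):
                raise ValueError(
                    f"Speaker {index} is missing required field: {field}"
                )
    return speakers
```

Rejecting the payload at model construction time means an incomplete speaker entry fails before anything is written to the database.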
The SpeakerProfile Data Model
The SpeakerProfile class acts as the central schema for multi-speaker definitions. It stores:
- Profile metadata: A unique name and optional description
- Default TTS model: A voice_model field referencing a TTS provider record (e.g., model:tts/openai/tts-1)
- Speaker array: A list of speaker objects, each with distinct voice characteristics and personality traits
Before persisting to SurrealDB, the _prepare_save_data method (lines 71-80) converts any voice_model references into proper RecordID objects. This serialization step ensures that database relationships remain intact and queryable, handling both the profile-level default model and any per-speaker overrides.
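As a rough sketch of that conversion step, the snippet below turns string references into a small record type at both the profile and speaker level. RecordRef is a hypothetical stand-in for the SurrealDB RecordID class, and prepare_save_data is an illustrative free function, not the project's _prepare_save_data method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordRef:
    """Hypothetical stand-in for a SurrealDB RecordID (table + identifier)."""
    table: str
    identifier: str


def to_record_ref(value):
    """Convert 'table:id' strings to RecordRef; pass anything else through."""
    if isinstance(value, str) and ":" in value:
        table, identifier = value.split(":", 1)
        return RecordRef(table, identifier)
    return value


def prepare_save_data(profile: dict) -> dict:
    """Return a copy of the profile with voice_model references converted."""
    data = dict(profile)
    if data.get("voice_model"):
        data["voice_model"] = to_record_ref(data["voice_model"])
    speakers = []
    for speaker in profile.get("speakers", []):
        entry = dict(speaker)
        # Per-speaker overrides are optional, so convert only when present.
        if entry.get("voice_model"):
            entry["voice_model"] = to_record_ref(entry["voice_model"])
        speakers.append(entry)
    data["speakers"] = speakers
    return data
```

Working on copies keeps the in-memory profile unchanged, so the caller can keep using plain string references after a save.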
Resolving TTS Configurations at Runtime
When a podcast generation command executes, the system must translate stored profile references into concrete TTS provider credentials. The SpeakerProfile.resolve_tts_config method (lines 82-89) performs this resolution by:
- Loading the referenced TTS model record from the database
- Extracting the provider name (e.g., "openai", "elevenlabs")
- Retrieving credential configuration for authentication
- Returning a tuple of (provider, model_name, config)
This resolution occurs both at the profile level (for default settings) and per-speaker when individual voice overrides exist, ensuring each character can utilize a distinct TTS engine if desired.
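The resolution steps listed above might look roughly like the following. The MODELS and CREDENTIALS dictionaries are hypothetical in-memory stand-ins for database tables, and the record layout (provider, name, credential keys) is assumed for illustration only.

```python
import asyncio

# Hypothetical in-memory stand-ins for the model and credential tables.
MODELS = {
    "model:tts/openai/tts-1": {
        "provider": "openai",
        "name": "tts-1",
        "credential": "cred:openai",
    },
}
CREDENTIALS = {"cred:openai": {"api_key": "sk-placeholder"}}


async def resolve_tts_config(voice_model_id: str) -> tuple[str, str, dict]:
    """Resolve a stored model reference into (provider, model_name, config)."""
    model = MODELS.get(voice_model_id)  # 1. load the referenced model record
    if model is None:
        raise ValueError(f"Unknown TTS model: {voice_model_id}")
    provider = model["provider"]        # 2. extract the provider name
    # 3. fetch credential configuration; copy so callers cannot mutate the store
    config = dict(CREDENTIALS.get(model["credential"], {}))
    return provider, model["name"], config  # 4. return the tuple
```

A caller would run it with something like asyncio.run(resolve_tts_config("model:tts/openai/tts-1")) and unpack the three values.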
Creating Multi-Speaker Configurations via the API
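The router exposes a POST /speaker-profiles endpoint that accepts a JSON payload defining a complete speaker roster. Each speaker requires a voice_id that matches a voice offered by the TTS provider behind its voice_model. The voice IDs in the example are illustrative; substitute IDs that your provider actually offers.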
# POST /speaker-profiles
{
  "name": "InterviewShow",
  "description": "Host + Guest format",
  "voice_model": "model:tts/openai/tts-1",
  "speakers": [
    {
      "name": "Host",
      "voice_id": "en-US-Standard-A",
      "backstory": "Professional podcast host with broadcasting experience.",
      "personality": "Friendly, energetic, inquisitive"
    },
    {
      "name": "Guest",
      "voice_id": "en-GB-Standard-B",
      "backstory": "AI researcher and published author.",
      "personality": "Calm, analytical, precise"
    }
  ]
}
The router stores this configuration (see implementation at lines 12-20), making it available for future podcast episodes via the profile name.
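For readers calling the endpoint from a script, here is a minimal sketch that builds the POST request with the Python standard library. The base URL is an assumption (adjust host, port, and any path prefix to your deployment), and build_create_request is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: replace with the address of your Open Notebook API.
BASE_URL = "http://localhost:8000"


def build_create_request(
    profile: dict, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /speaker-profiles request."""
    body = json.dumps(profile).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/speaker-profiles",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; building it separately makes the payload easy to inspect before anything goes over the network.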
Integrating Profiles into the Podcast Workflow
When generating a multi-speaker episode, the generate_podcast_command in commands/podcast_commands.py orchestrates the profile retrieval and TTS resolution:
- Load the profile: Retrieves the SpeakerProfile using await SpeakerProfile.get_by_name() (lines 84-98)
- Validate TTS availability: Confirms the profile provides a valid voice_model (lines 113-119)
- Resolve configurations: Calls await speaker_profile.resolve_tts_config() to obtain provider credentials (lines 124-128)
- Handle per-speaker overrides: Iterates through individual speakers to resolve any specific voice_model overrides (lines 95-108)
- Inject into creator: Passes the resolved configuration to configure("speakers_config", ...) (lines 31-35), which the podcast-creator library uses to assign voices to dialogue segments
This workflow ensures that when the podcast generator processes a script, it knows exactly which TTS provider and voice ID to use for each character, creating seamless multi-speaker audio output.
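The steps above can be sketched as a single coroutine that merges the profile default with any per-speaker overrides. The resolver is injected as a parameter, and the shape of the returned dictionary is an assumption for illustration; the real structure expected by podcast-creator may differ.

```python
import asyncio
from typing import Awaitable, Callable

# A resolver maps a model reference to (provider, model_name, config).
Resolver = Callable[[str], Awaitable[tuple[str, str, dict]]]


async def build_speakers_config(profile: dict, resolve: Resolver) -> dict:
    """Resolve the default TTS model, then apply per-speaker overrides."""
    if not profile.get("voice_model"):
        raise ValueError(f"Profile {profile.get('name')!r} has no voice_model")
    default = await resolve(profile["voice_model"])

    speakers = []
    for speaker in profile["speakers"]:
        override = speaker.get("voice_model")
        if override:
            # This speaker uses its own TTS model instead of the default.
            provider, model_name, config = await resolve(override)
        else:
            provider, model_name, config = default
        speakers.append(
            {
                "name": speaker["name"],
                "voice_id": speaker["voice_id"],
                "provider": provider,
                "model": model_name,
                "config": config,
            }
        )
    return {"speakers": speakers}
```

Injecting the resolver keeps the sketch testable without a database: a fake coroutine can stand in for the real lookup.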
Retrieving and Managing Existing Profiles
You can fetch existing configurations using the GET /speaker-profiles/{name} endpoint (implementation at lines 35-43), which returns the stored speaker roster with resolved references ready for client-side display or editing.
GET /speaker-profiles/InterviewShow
{
  "name": "InterviewShow",
  "description": "Host + Guest format",
  "voice_model": "model:tts/openai/tts-1",
  "speakers": [
    {
      "name": "Host",
      "voice_id": "en-US-Standard-A",
      "backstory": "Professional podcast host with broadcasting experience.",
      "personality": "Friendly, energetic, inquisitive"
    }
  ]
}
Summary
- speaker_profiles.py provides the REST API interface for creating, reading, and managing speaker configurations in Open Notebook.
- SpeakerProfile (in open_notebook/podcasts/models.py) validates required speaker fields (name, voice_id, backstory, personality) and serializes TTS model references for SurrealDB storage.
- _prepare_save_data ensures database-ready RecordID conversion before persistence, maintaining referential integrity with TTS model records.
- resolve_tts_config translates stored model references into runtime provider credentials (provider, model_name, configuration) needed for audio synthesis.
- generate_podcast_command leverages these profiles to configure the podcast-creator library with distinct voice settings for each speaker, enabling true multi-character audio generation.
Frequently Asked Questions
What fields are required when defining a speaker in speaker_profiles.py?
Each speaker object must include four required fields: name (character identifier), voice_id (provider-specific voice identifier), backstory (context for content generation), and personality (behavioral traits affecting dialogue style). The validation logic in open_notebook/podcasts/models.py (lines 58-68) enforces these requirements during profile creation.
How does speaker_profiles.py handle different TTS providers for each speaker?
The SpeakerProfile model supports a default voice_model at the profile level, but individual speakers can override this with their own voice_model field. During podcast generation, the system calls resolve_tts_config for each override, loading the specific provider credentials and model name, allowing one speaker to use OpenAI while another uses ElevenLabs within the same episode.
Where is the speaker profile data actually validated and stored?
While api/routers/speaker_profiles.py receives HTTP requests, the SpeakerProfile class in open_notebook/podcasts/models.py handles validation and defines the schema. The _prepare_save_data method (lines 71-80) ensures proper serialization before the data is persisted to SurrealDB, converting string model references into proper RecordID objects for database relationships.
Can I update a speaker profile after creating episodes with it?
Yes, the router exposes update endpoints that modify the stored SpeakerProfile record. Subsequent podcast generations referencing that profile name will use the updated configuration. However, previously generated episodes retain the voice settings resolved at their time of creation, as the TTS configuration is captured during the command execution phase (lines 124-128 in commands/podcast_commands.py).