How the Transformation LangGraph Workflow Applies Custom Content Processing
The Transformation LangGraph workflow executes user-defined prompt templates by assembling custom instructions with source content, routing the payload through an Esperanto-based LLM provider, cleaning the raw response, and persisting the output as a structured insight.
The lfnovo/open-notebook repository uses a dedicated LangGraph pipeline to turn static source material into structured notebook content. When a user triggers a transformation from the UI or API, the Transformation LangGraph workflow orchestrates the full lifecycle of custom content processing, from prompt assembly to insight persistence. The core logic lives in open_notebook/graphs/transformation.py, where a compiled single-node graph delivers a repeatable pipeline that can target any source text.
State Preparation and Prompt Assembly in the Transformation LangGraph Workflow
The workflow begins by populating a TransformationState object with the raw material and the user's custom instructions. According to [open_notebook/graphs/transformation.py](open_notebook/graphs/transformation.py#L16-L22), the state carries input_text, an optional Source object, and a Transformation record that contains the prompt template.
If the transformation is configured to use system defaults, the workflow prepends the instruction from the DefaultPrompts record before appending the user's custom template. The assembled prompt is then suffixed with a literal # INPUT section and rendered with current state values, including source metadata and the input text, inside the run_transformation function at lines 30–40.
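The assembly order described above can be sketched in plain Python. This is a minimal illustration rather than the repository's code: assemble_prompt and its parameters are hypothetical names, and the sketch joins strings where the real workflow renders a template with state values.

```python
from typing import Optional


def assemble_prompt(
    custom_prompt: str,
    default_instructions: Optional[str],
    apply_default: bool,
) -> str:
    """Join optional default instructions, the custom template, and a literal '# INPUT' header."""
    parts = []
    # Default instructions come first, and only when the transformation opts in.
    if apply_default and default_instructions:
        parts.append(default_instructions.strip())
    parts.append(custom_prompt.strip())
    # The literal header marks where the source text will follow.
    parts.append("# INPUT")
    return "\n\n".join(parts)


prompt = assemble_prompt(
    custom_prompt="Summarize the following text in 150 words.",
    default_instructions="Answer in plain English.",
    apply_default=True,
)
print(prompt)
```

With apply_default set to False, the same call returns only the custom template followed by the # INPUT header.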
LLM Provisioning and Execution
Once the prompt is fully rendered, the workflow wraps it in a LangChain payload consisting of a SystemMessage and a HumanMessage. The graph delegates model selection to the Esperanto-based model provider via provision_langchain_model, as implemented in [open_notebook/graphs/transformation.py](open_notebook/graphs/transformation.py#L44-L50).
The provider selects the specific model based on the model_id supplied in the RunnableConfig and the "transformation" task type. It also requests a large context window of up to 8192 tokens to accommodate long-form source content. The resulting chain is executed asynchronously using chain.ainvoke.
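The configuration lookup and the asynchronous call can be made concrete with a dependency-free sketch. EchoModel is a hypothetical stand-in for whatever provision_langchain_model returns, and resolve_model_id is an illustrative helper; only the configurable and model_id keys and the 8192 figure come from the description above.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple


def resolve_model_id(config: Dict[str, Any]) -> Optional[str]:
    """Read model_id from the 'configurable' section of a RunnableConfig-style dict."""
    return config.get("configurable", {}).get("model_id")


class EchoModel:
    """Hypothetical stand-in for a provisioned chat model; it only reports what it received."""

    def __init__(self, model_id: Optional[str], max_tokens: int) -> None:
        self.model_id = model_id
        self.max_tokens = max_tokens

    async def ainvoke(self, payload: List[Tuple[str, str]]) -> str:
        roles = ", ".join(role for role, _ in payload)
        return f"{self.model_id} received [{roles}]"


async def run(config: Dict[str, Any]) -> str:
    model = EchoModel(resolve_model_id(config), max_tokens=8192)
    # The payload mirrors the system-message-plus-human-message shape.
    payload = [("system", "Summarize the text."), ("human", "Long article ...")]
    return await model.ainvoke(payload)


result = asyncio.run(run({"configurable": {"model_id": "gpt-4o-mini"}}))
print(result)
```

If the config carries no model_id, resolve_model_id returns None, which leaves the choice of a default to the provider in this sketch.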
Response Cleaning and Output Processing
After the LLM returns a response, the run_transformation function at lines 54–57 performs a two-stage cleaning process on response.content.
- extract_text_content: Strips non-text artifacts such as JSON wrappers from the raw LLM output.
- clean_thinking_content: Removes internal reasoning traces, including "Thought:" blocks, that the model may emit.
This cleaning ensures that only the processed textual content advances to the persistence layer.
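A two-stage cleanup of this kind might look like the following sketch. The helper names and rules are assumptions for illustration only. They do not reproduce extract_text_content or clean_thinking_content, whose exact behavior is not shown here. The sketch assumes the response content is either a string or a list of typed parts, and that a reasoning trace is a paragraph starting with "Thought:".

```python
from typing import Any


def flatten_content(content: Any) -> str:
    """Stage 1 (illustrative): reduce string-or-parts content to plain text."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        texts = []
        for part in content:
            if isinstance(part, str):
                texts.append(part)
            elif isinstance(part, dict) and part.get("type") == "text":
                texts.append(part.get("text", ""))
        return "".join(texts)
    return str(content)


def drop_reasoning(text: str) -> str:
    """Stage 2 (illustrative): drop paragraphs that start with 'Thought:'."""
    paragraphs = [p.strip() for p in text.split("\n\n")]
    kept = [p for p in paragraphs if p and not p.startswith("Thought:")]
    return "\n\n".join(kept)


raw = [
    {"type": "text", "text": "Thought: the user wants a summary.\n\n"},
    {"type": "text", "text": "The article argues for smaller models."},
]
cleaned = drop_reasoning(flatten_content(raw))
print(cleaned)
```

Running the stages in this order matters: the reasoning filter works on plain text, so structured content has to be flattened first.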
Persisting Results as Source Insights
If the state includes a Source object, the cleaned output is attached to that source through the source.add_insight method. As shown in [open_notebook/graphs/transformation.py](open_notebook/graphs/transformation.py#L58-L60), this action creates a new note inside the linked notebook, making the transformation result available to downstream features such as search and chat.
The insight remains permanently associated with the original source material, enabling fully traceable custom content processing across the notebook.
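The conditional persistence step can be sketched as follows. FakeSource is a hypothetical in-memory stand-in, and the two-argument add_insight signature (a label plus the content) is an assumption made for this example; the real method writes to the database.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FakeSource:
    """Hypothetical in-memory stand-in for a Source object."""

    insights: List[Tuple[str, str]] = field(default_factory=list)

    async def add_insight(self, insight_type: str, content: str) -> None:
        # Assumed signature: a label for the insight and the cleaned text.
        self.insights.append((insight_type, content))


async def persist(source: Optional[FakeSource], title: str, cleaned: str) -> dict:
    # Persist only when a source is attached; the output is returned either way.
    if source is not None:
        await source.add_insight(title, cleaned)
    return {"output": cleaned}


source = FakeSource()
result = asyncio.run(persist(source, "Summary", "A concise summary."))
print(result, source.insights)
```

Calling persist with source set to None still returns the output, which matches the case where a transformation runs on ad hoc text with no source attached.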
Graph Architecture and Compilation of the Transformation LangGraph Workflow
The entire pipeline is encapsulated as a single-node LangGraph. In [open_notebook/graphs/transformation.py](open_notebook/graphs/transformation.py#L71-L75), the graph is defined with one agent node connected by a straight-line edge from START to agent to END.
This minimalist architecture makes the Transformation LangGraph workflow predictable and easy to debug while still supporting asynchronous, large-context LLM calls through LangChain.
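To show what a START to agent to END topology implies at run time, here is a dependency-free model of it. This does not use LangGraph's API: START, END, NODES, EDGES, and run_graph are names invented for the sketch, and the agent body is a placeholder for the real prompt-and-LLM logic.

```python
import asyncio
from typing import Awaitable, Callable, Dict

START, END = "__start__", "__end__"

State = Dict[str, str]
Node = Callable[[State], Awaitable[State]]


async def agent(state: State) -> State:
    """Placeholder node: the real node assembles the prompt and calls the LLM."""
    return {"output": state["input_text"].upper()}


NODES: Dict[str, Node] = {"agent": agent}
EDGES: Dict[str, str] = {START: "agent", "agent": END}


async def run_graph(state: State) -> State:
    """Follow edges from START to END, merging each node's partial update into the state."""
    current = EDGES[START]
    visited = []
    while current != END:
        visited.append(current)
        update = await NODES[current](state)
        state = {**state, **update}
        current = EDGES[current]
    return {**state, "visited": ",".join(visited)}


final = asyncio.run(run_graph({"input_text": "hello"}))
print(final)
```

With a single node there is exactly one path through the graph, so every run visits the same step in the same order, which is what makes the pipeline easy to reason about.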
Running Transformations from Python and TypeScript
You can invoke the compiled graph directly from Python or call the API from the frontend. Both paths ultimately reach the same run_transformation node described in the sections above.
The following Python example takes a Source that has already been loaded, defines a custom Transformation, and executes the graph with a specific model ID:
```python
import asyncio

# RunnableConfig is provided by langchain_core, not langgraph.graph
from langchain_core.runnables import RunnableConfig

from open_notebook.domain.notebook import Source
from open_notebook.domain.transformation import Transformation
from open_notebook.graphs.transformation import graph

# Example custom transformation definition
my_transform = Transformation(
    name="my_summary",
    title="Summary",
    description="Generate a concise summary",
    prompt="Summarize the following text in 150 words.",
    apply_default=False,
)


async def main(source: Source) -> None:
    # `source` is assumed to be a Source object already loaded from the DB
    state = {
        "input_text": "",  # empty → use source.full_text
        "source": source,
        "transformation": my_transform,
    }
    config = RunnableConfig(configurable={"model_id": "gpt-4o-mini"})

    # Execute the graph; `await` must run inside a coroutine
    result = await graph.ainvoke(state, config=config)
    print(result["output"])


# Run with a loaded Source, for example: asyncio.run(main(source))
```
From the TypeScript frontend, the same backend logic is reached through the transformations API:
```typescript
import { transformationsApi } from '@/lib/api/transformations'

await transformationsApi.execute({
  transformation_id: '12345',
  input_text: 'Long article …',
  model_id: 'gpt-4o-mini',
})
```
Summary
- The Transformation LangGraph workflow is defined in open_notebook/graphs/transformation.py as a single-node graph with an agent node running from START to END.
- State preparation combines input_text, a Source object, and a Transformation record inside a TransformationState object.
- Prompt assembly optionally prepends default instructions, appends the custom template, and adds a literal # INPUT section before rendering.
- The Esperanto-based model provider (provision_langchain_model) handles LLM selection and requests up to 8192 tokens for the "transformation" task type.
- Raw LLM responses are cleaned by extract_text_content and clean_thinking_content to remove artifacts and reasoning traces.
- Final outputs are persisted as insights via source.add_insight, linking structured results back to the original notebook source.
Frequently Asked Questions
What is the Transformation LangGraph workflow in Open Notebook?
The Transformation LangGraph workflow is the compiled LangGraph pipeline in lfnovo/open-notebook that executes user-defined transformations on source content. It is implemented in open_notebook/graphs/transformation.py and consists of a single agent node that prepares state, assembles prompts, calls an LLM, cleans the response, and persists the result as a source insight.
How does the workflow handle custom prompt templates?
The workflow retrieves the custom prompt from the Transformation record in the graph state. If apply_default is enabled, it prepends the system DefaultPrompts instruction before appending the user template and a literal # INPUT section, as seen in the run_transformation function at lines 30–40 of open_notebook/graphs/transformation.py.
What model provider does the transformation graph use?
The graph delegates LLM calls to an Esperanto-based provider through the provision_langchain_model function. This provider selects the model using the model_id from the RunnableConfig and the "transformation" task type, requesting a context window of up to 8192 tokens to handle long inputs.
How are transformation results stored in Open Notebook?
After the response is cleaned, the workflow checks for a Source object in the state. If present, it calls source.add_insight to attach the cleaned output to the original source, creating a new note inside the linked notebook that is accessible to search and chat features.