How to Implement the ask_user Tool with Custom Callbacks for Human-in-the-Loop Workflows
The ask_user tool enables PageAgentCore to pause execution and request human input by invoking an async callback assigned to agent.onAskUser, which can be customized to use any UI implementation from browser prompts to modal dialogs.
The alibaba/page-agent framework provides a flexible human-in-the-loop mechanism through the ask_user tool, allowing AI agents to request clarification during automated tasks. By implementing a custom callback on the PageAgentCore instance, developers can integrate any user interface—from simple browser prompts to complex modal systems—without modifying the core agent logic.
Understanding the ask_user Architecture
The implementation relies on three key components working together to isolate UI concerns from core execution logic.
- PageAgentCore.onAskUser: An optional property in packages/core/src/PageAgentCore.ts that holds the async callback. When undefined, the tool is automatically removed from the agent's toolset to prevent automation failures.
- Tool Definition: Located in packages/core/src/tools/index.ts, this implements the validation logic and invokes the callback when the LLM requests human input.
- UI Integration: The Panel class in packages/ui/src/panel/Panel.ts provides a reference implementation, while packages/website/src/pages/docs/advanced/custom-ui/page.tsx demonstrates custom implementations.
During agent initialization in PageAgentCore.execute(), the system checks for this.onAskUser. If the callback exists, the ask_user tool remains registered; otherwise, the tool is explicitly removed via this.tools.delete('ask_user') to ensure the agent runs fully automated.
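The registration check described above can be sketched in isolation. The `AgentLike` interface and the `pruneAskUser` helper are hypothetical stand-ins rather than the framework's actual types; only the `tools.delete('ask_user')` call mirrors what the article describes.

```typescript
// Hypothetical stand-ins for illustration; the real types live in @page-agent/core.
type AskUserCallback = (question: string) => Promise<string>

interface AgentLike {
  onAskUser?: AskUserCallback
  tools: Map<string, unknown>
}

// Drop ask_user when no callback is set, so the LLM never sees a tool
// that nobody can answer. Returns the names of the remaining tools.
function pruneAskUser(agent: AgentLike): string[] {
  if (!agent.onAskUser) {
    agent.tools.delete('ask_user')
  }
  return Array.from(agent.tools.keys())
}

const automated: AgentLike = {
  tools: new Map<string, unknown>([['click', {}], ['ask_user', {}]]),
}
const interactive: AgentLike = {
  tools: new Map<string, unknown>([['click', {}], ['ask_user', {}]]),
  onAskUser: async (question) => `answer to: ${question}`,
}

const automatedTools = pruneAskUser(automated) // ask_user removed
const interactiveTools = pruneAskUser(interactive) // ask_user kept
```

Removing the tool up front, instead of letting it fail later, means the model is never offered an action it cannot complete.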
Step-by-Step Implementation
Configure the Callback on PageAgentCore
First, instantiate PageAgentCore and assign an async function to onAskUser. This callback receives the question string and must return a Promise that resolves with the user's answer.
import { PageAgentCore } from '@page-agent/core'
import { PageController } from '@page-agent/page-controller'

const pageController = new PageController({ enableMask: true })
const agent = new PageAgentCore({
  pageController,
  baseURL: 'https://api.openai.com/v1',
  apiKey: 'your-api-key',
  model: 'gpt-4o-mini',
})

// Custom human-in-the-loop handler
agent.onAskUser = async (question) => {
  // Return user input as a string
  return window.prompt(question) ?? ''
}

await agent.execute('Fill the registration form with test data')
This example from packages/website/src/pages/docs/advanced/custom-ui/page.tsx uses window.prompt, but you can replace this with any async operation, such as fetching data from a chat widget or modal dialog.
Tool Registration and Execution Flow
The ask_user tool also checks that the callback is present at execution time. If onAskUser is undefined when the tool runs, it throws an error instead of leaving the workflow waiting for an answer that cannot arrive.
tools.set(
  'ask_user',
  tool({
    description:
      'Ask the user a question and wait for their answer. Use this if you need more information or clarification.',
    inputSchema: z.object({
      question: z.string(),
    }),
    execute: async function (this: PageAgentCore, input) {
      if (!this.onAskUser) {
        throw new Error('ask_user tool requires onAskUser callback to be set')
      }
      const answer = await this.onAskUser(input.question)
      return `User answered: ${answer}`
    },
  })
)
When the LLM emits a tool call like {"name": "ask_user", "input": {"question": "What email address should be used?"}}, the tool invokes await this.onAskUser(input.question) and formats the response as User answered: ${answer} for the LLM to continue reasoning.
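To make the round trip concrete, the following is a minimal sketch that replays the tool call above against a canned callback, without the `tool` and `z` helpers. The `ToolCall` shape, `runAskUser`, and `cannedReply` are hypothetical names introduced for illustration; the error message and the `User answered:` prefix are taken from the tool definition shown earlier.

```typescript
// Hypothetical shapes for illustration; not the framework's actual types.
type AskUserCallback = (question: string) => Promise<string>

interface ToolCall {
  name: string
  input: { question: string }
}

// Mirrors the execute() logic: fail loudly without a callback,
// otherwise wrap the human's reply in the "User answered:" prefix.
async function runAskUser(
  onAskUser: AskUserCallback | undefined,
  call: ToolCall
): Promise<string> {
  if (!onAskUser) {
    throw new Error('ask_user tool requires onAskUser callback to be set')
  }
  const answer = await onAskUser(call.input.question)
  return `User answered: ${answer}`
}

// A canned callback standing in for a real UI.
const cannedReply: AskUserCallback = async () => 'test@example.com'

const toolCall: ToolCall = JSON.parse(
  '{"name": "ask_user", "input": {"question": "What email address should be used?"}}'
)

runAskUser(cannedReply, toolCall).then((result) => {
  console.log(result) // User answered: test@example.com
})
```

Passing `undefined` as the callback rejects with the same error message the tool definition throws.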
Built-in UI Panel Integration
For applications using the default UI components, the Panel class in packages/ui/src/panel/Panel.ts wires itself to the agent by assigning agent.onAskUser to its private #askUser method.
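// Simplified excerpt: field declarations are inferred from how they are used below
class Panel {
  #agent: PageAgentCore
  #isWaitingForUserAnswer = false
  #userAnswerResolver: ((answer: string) => void) | null = null

  constructor(agent: PageAgentCore) {
    this.#agent = agent
    // Wire the agent to the panel's private ask method
    this.#agent.onAskUser = (question) => this.#askUser(question)
  }

  // Render a temporary card, show an input box, and resolve when the user replies
  #askUser(question: string): Promise<string> {
    return new Promise((resolve) => {
      this.#isWaitingForUserAnswer = true
      this.#userAnswerResolver = resolve
      // UI renders question card and input box here
    })
  }
}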
This approach renders a temporary question card within the agent's interface, waits for the user submission, and resolves the promise to return control to the agent.
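The "store the resolver, call it on submit" pattern the panel relies on can be shown without any rendering code. The sketch below is not the Panel implementation; `QuestionBox` and its members are hypothetical names, and the UI layer is assumed to call `submit` from its own input handler.

```typescript
// A minimal, UI-free version of the deferred-answer pattern.
// QuestionBox is a hypothetical name, not part of @page-agent/ui.
class QuestionBox {
  private resolver: ((answer: string) => void) | null = null
  currentQuestion: string | null = null

  // Assign this as the agent's onAskUser callback.
  ask = (question: string): Promise<string> => {
    return new Promise((resolve) => {
      this.currentQuestion = question
      this.resolver = resolve
    })
  }

  // Call this from the submit handler of whatever input the UI renders.
  submit(answer: string): void {
    if (!this.resolver) return // nothing is pending; ignore stray submits
    const resolve = this.resolver
    this.resolver = null
    this.currentQuestion = null
    resolve(answer)
  }

  get isWaiting(): boolean {
    return this.resolver !== null
  }
}

const box = new QuestionBox()
const pendingAnswer = box.ask('Which plan should I select?')
// ...later, when the user submits the form:
box.submit('Pro')
pendingAnswer.then((answer) => console.log(answer)) // Pro
```

Clearing the stored resolver before calling it keeps a double-click on the submit button from resolving a later, unrelated question.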
Advanced Customization Patterns
Beyond simple prompts, the callback can interface with external services or complex UI frameworks. Because onAskUser expects any function matching (question: string) => Promise<string>, you can implement:
- React Modal Components: Return a promise that resolves when a modal form submits.
- Slack/Teams Integrations: Post the question to a channel and await the response via webhook.
- Voice Interfaces: Convert text-to-speech, listen for user input, and transcribe the answer.
The core agent remains agnostic to these implementations, maintaining clean separation between automation logic and user interface layers.
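For channel-style integrations, where the reply arrives later through a separate handler, one possible shape is to tag each question with an id and park its promise until the matching reply comes in. This is a sketch under stated assumptions: `PendingQuestions`, `PostQuestion`, and the id format are hypothetical, and the `post` function stands in for whatever chat platform client you use.

```typescript
// Hypothetical sketch: none of these names come from page-agent or a chat SDK.
type PostQuestion = (id: string, question: string) => void | Promise<void>

class PendingQuestions {
  private waiting = new Map<string, (answer: string) => void>()
  private nextId = 1
  private post: PostQuestion

  constructor(post: PostQuestion) {
    this.post = post
  }

  // Assign this as onAskUser. It sends the question out and parks the
  // promise until resolve() is called with the matching id.
  ask = async (question: string): Promise<string> => {
    const id = `q-${this.nextId++}`
    const answer = new Promise<string>((resolve) => {
      this.waiting.set(id, resolve)
    })
    await this.post(id, question)
    return answer
  }

  // Call this from the webhook or message handler that receives the reply.
  // Returns false when the id is unknown (for example, a duplicate delivery).
  resolve(id: string, answer: string): boolean {
    const resolver = this.waiting.get(id)
    if (!resolver) return false
    this.waiting.delete(id)
    resolver(answer)
    return true
  }
}
```

With this shape, `agent.onAskUser = questions.ask` would be the only line that touches the agent, and the reply handler only needs the id and the answer text.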
Summary
- The ask_user tool is automatically enabled when agent.onAskUser is defined, and removed when absent to prevent workflow stalls.
- Implement the callback by assigning an async function that returns a string to the onAskUser property declared in packages/core/src/PageAgentCore.ts.
- The tool definition in packages/core/src/tools/index.ts validates callback presence and formats responses as User answered: ${answer}.
- Use the built-in Panel class from packages/ui/src/panel/Panel.ts for immediate UI capabilities, or provide a custom callback as shown in packages/website/src/pages/docs/advanced/custom-ui/page.tsx.
- This architecture supports any async user interaction pattern while keeping the core agent framework UI-agnostic.
Frequently Asked Questions
What happens if I don't set the onAskUser callback?
If agent.onAskUser remains undefined, the ask_user tool is automatically removed from the tool registry during execute() via this.tools.delete('ask_user'). This prevents the LLM from attempting to invoke a human interaction that cannot be fulfilled, ensuring the agent operates in fully automated mode.
Can I use ask_user with a remote API instead of a local UI?
Yes. The onAskUser callback accepts any async function returning a Promise<string>. You can implement API calls to external services, chat platforms, or ticketing systems within this callback. The agent will pause execution at await this.onAskUser(input.question) until your remote service resolves with the user's answer.
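Because the agent waits for as long as the promise stays pending, a remote integration may want an upper bound on that wait. The wrapper below is one possible approach, not something the framework provides; `withTimeout` and its parameters are hypothetical names.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical helper: wraps any callback so a silent remote service cannot
// stall the agent forever. On timeout the fallback string is returned instead.
function withTimeout(
  ask: AskUserCallback,
  timeoutMs: number,
  fallback: string
): AskUserCallback {
  return (question) =>
    new Promise<string>((resolve, reject) => {
      const timer = setTimeout(() => resolve(fallback), timeoutMs)
      ask(question).then(
        (answer) => {
          clearTimeout(timer) // answered in time; cancel the fallback
          resolve(answer)
        },
        (error) => {
          clearTimeout(timer)
          reject(error)
        }
      )
    })
}

// A stand-in for a remote service that never replies.
const silentService: AskUserCallback = () => new Promise<string>(() => {})
const guarded = withTimeout(silentService, 50, 'No answer was received in time')
```

Since the tool reports whatever string comes back as the user's answer, the fallback text should say plainly that no reply arrived, so the LLM does not mistake it for real input.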
How do I customize the question rendering beyond the default panel?
To bypass the built-in Panel, assign your own function to agent.onAskUser instead of relying on the panel's method. Your implementation can render questions with your preferred framework (React, Vue, vanilla JS) and resolve when the user submits input. The documentation example in packages/website/src/pages/docs/advanced/custom-ui/page.tsx demonstrates this with a simple window.prompt-based callback.
Does the ask_user tool support validation or multiple input fields?
The current implementation in packages/core/src/tools/index.ts accepts a single question string parameter and expects a single string answer. For complex forms or validation, implement this logic inside your custom onAskUser callback—collect the necessary data through your UI, validate it, and return the serialized result as a string for the LLM to process.
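As an illustration of keeping validation inside the callback, the sketch below wraps an existing callback and re-asks until the answer passes a check. `withValidation`, `looksLikeEmail`, the attempt limit, and the final fallback message are all hypothetical choices, not part of the framework.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical helper: re-asks until the validator accepts the answer or the
// attempt limit is reached. The validator returns an error message, or null
// when the answer is acceptable.
function withValidation(
  ask: AskUserCallback,
  validate: (answer: string) => string | null,
  maxAttempts = 3
): AskUserCallback {
  return async (question) => {
    let prompt = question
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      const answer = (await ask(prompt)).trim()
      const problem = validate(answer)
      if (problem === null) return answer
      // Repeat the question with the validation hint appended.
      prompt = `${question} (${problem})`
    }
    return 'No valid answer was provided'
  }
}

// Example validator: a deliberately loose email shape check.
const looksLikeEmail = (answer: string): string | null =>
  /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(answer) ? null : 'please enter a valid email'
```

A wrapped callback such as `withValidation(myCallback, looksLikeEmail)` can be assigned to `agent.onAskUser` like any other, because it keeps the same `(question) => Promise<string>` shape.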