How to Implement the ask_user Tool with Custom Callbacks for Human-in-the-Loop Workflows

The ask_user tool enables PageAgentCore to pause execution and request human input by invoking an async callback assigned to agent.onAskUser, which can be customized to use any UI implementation from browser prompts to modal dialogs.

The alibaba/page-agent framework provides a flexible human-in-the-loop mechanism through the ask_user tool, allowing AI agents to request clarification during automated tasks. By implementing a custom callback on the PageAgentCore instance, developers can integrate any user interface—from simple browser prompts to complex modal systems—without modifying the core agent logic.

Understanding the ask_user Architecture

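The design keeps UI concerns out of the core execution logic: the agent exposes an onAskUser callback, the ask_user tool calls that callback, and a check at execution time decides whether the tool is available at all.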

During agent initialization in PageAgentCore.execute(), the system checks for this.onAskUser. If the callback exists, the ask_user tool remains registered; otherwise, the tool is explicitly removed via this.tools.delete('ask_user') to ensure the agent runs fully automated.
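The registration check can be pictured with a small stand-in class. This is only a sketch: SketchAgent, prepareTools, and the 'click' entry are hypothetical names used for illustration, and the real PageAgentCore performs this check inside execute() alongside much more setup.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical stand-in for the agent; not the real PageAgentCore.
class SketchAgent {
  onAskUser?: AskUserCallback

  // Tool registry keyed by tool name. The values are placeholders here.
  tools = new Map<string, string>([
    ['ask_user', 'placeholder'],
    ['click', 'placeholder'], // hypothetical second tool
  ])

  // Mirrors the check described above: drop ask_user when no callback is set.
  prepareTools(): string[] {
    if (!this.onAskUser) {
      this.tools.delete('ask_user')
    }
    return Array.from(this.tools.keys())
  }
}

// No callback assigned: the agent runs fully automated.
const automated = new SketchAgent()
const automatedTools = automated.prepareTools()

// Callback assigned: ask_user stays available to the LLM.
const interactive = new SketchAgent()
interactive.onAskUser = async (question) => `answer to: ${question}`
const interactiveTools = interactive.prepareTools()
```

Because the tool is removed rather than left in place, the LLM never sees ask_user in its tool list when there is nobody to answer.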

Step-by-Step Implementation

Configure the Callback on PageAgentCore

First, instantiate PageAgentCore and assign an async function to onAskUser. This callback receives the question string and must return a Promise that resolves with the user's answer.

import { PageAgentCore } from '@page-agent/core'
import { PageController } from '@page-agent/page-controller'

const pageController = new PageController({ enableMask: true })
const agent = new PageAgentCore({
  pageController,
  baseURL: 'https://api.openai.com/v1',
  apiKey: 'your-api-key',
  model: 'gpt-4o-mini',
})

// Custom human-in-the-loop handler
agent.onAskUser = async (question) => {
  // Return user input as a string
  return window.prompt(question) ?? ''
}

await agent.execute('Fill the registration form with test data')

This example from packages/website/src/pages/docs/advanced/custom-ui/page.tsx uses window.prompt, but you can replace this with any async operation, such as fetching data from a chat widget or modal dialog.

Tool Registration and Execution Flow

The ask_user tool validates the callback presence before execution. If onAskUser is undefined when the tool runs, it throws an error to prevent unhandled halts in automated workflows.

tools.set(
  'ask_user',
  tool({
    description:
      'Ask the user a question and wait for their answer. Use this if you need more information or clarification.',
    inputSchema: z.object({
      question: z.string(),
    }),
    execute: async function (this: PageAgentCore, input) {
      if (!this.onAskUser) {
        throw new Error('ask_user tool requires onAskUser callback to be set')
      }
      const answer = await this.onAskUser(input.question)
      return `User answered: ${answer}`
    },
  })
)

When the LLM emits a tool call like {"name": "ask_user", "input": {"question": "What email address should be used?"}}, the tool invokes await this.onAskUser(input.question) and formats the response as User answered: ${answer} for the LLM to continue reasoning.
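To see the round trip without a browser or an LLM, the flow can be approximated with a scripted callback. This is a sketch under stated assumptions: runAskUser is a hypothetical simplification of the tool's execute function, and scriptedUser stands in for a real person.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical simplification of the tool's execute function shown above.
async function runAskUser(
  onAskUser: AskUserCallback | undefined,
  input: { question: string }
): Promise<string> {
  if (!onAskUser) {
    throw new Error('ask_user tool requires onAskUser callback to be set')
  }
  const answer = await onAskUser(input.question)
  return `User answered: ${answer}`
}

// A scripted callback that stands in for a real user, e.g. in tests.
const scriptedUser: AskUserCallback = async (question) =>
  question.includes('email') ? 'test@example.com' : ''

// Simulates the tool call from the paragraph above.
const toolResult: Promise<string> = runAskUser(scriptedUser, {
  question: 'What email address should be used?',
})
```

A scripted callback like this can also be handy for exercising agent flows in automated tests, where no human is available to answer.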

Built-in UI Panel Integration

For applications using the default UI components, the Panel class in packages/ui/src/panel/Panel.ts wires itself to the agent by assigning agent.onAskUser to its private #askUser method.

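import type { PageAgentCore } from '@page-agent/core'

class Panel {
  // Private fields must be declared before they are used
  #agent: PageAgentCore
  #isWaitingForUserAnswer = false
  #userAnswerResolver: ((answer: string) => void) | null = null

  constructor(agent: PageAgentCore) {
    this.#agent = agent
    // Wire the agent to the panel's private ask method
    this.#agent.onAskUser = (question) => this.#askUser(question)
  }

  // Render a temporary card, show an input box, and resolve when the user replies
  #askUser(question: string): Promise<string> {
    return new Promise((resolve) => {
      this.#isWaitingForUserAnswer = true
      this.#userAnswerResolver = resolve
      // UI renders question card and input box here
    })
  }
}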

This approach renders a temporary question card within the agent's interface, waits for the user submission, and resolves the promise to return control to the agent.
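The stored-resolver pattern is independent of any particular UI. The following sketch shows it in isolation; QuestionBox, ask, submit, and isWaiting are hypothetical names and are not part of the Panel class.

```typescript
// Hypothetical UI-agnostic version of the stored-resolver pattern.
class QuestionBox {
  private resolver: ((answer: string) => void) | null = null
  currentQuestion: string | null = null

  // Assign this to agent.onAskUser; the promise stays pending until submit() runs.
  ask = (question: string): Promise<string> => {
    return new Promise<string>((resolve) => {
      this.currentQuestion = question
      this.resolver = resolve
    })
  }

  // Call this from the UI's submit handler with the text the user entered.
  submit(answer: string): void {
    if (!this.resolver) return // nothing is waiting, so ignore the submission
    const resolve = this.resolver
    this.resolver = null
    this.currentQuestion = null
    resolve(answer)
  }

  get isWaiting(): boolean {
    return this.resolver !== null
  }
}
```

Clearing the resolver before calling it means a second submission for the same question is ignored rather than resolving a stale promise.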

Advanced Customization Patterns

Beyond simple prompts, the callback can interface with external services or complex UI frameworks. Because onAskUser expects any function matching (question: string) => Promise<string>, you can implement:

  • React Modal Components: Return a promise that resolves when a modal form submits.
  • Slack/Teams Integrations: Post the question to a channel and await the response via webhook.
  • Voice Interfaces: Convert text-to-speech, listen for user input, and transcribe the answer.

The core agent remains agnostic to these implementations, maintaining clean separation between automation logic and user interface layers.
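For the chat or webhook style of integration, one possible shape is a map of pending questions keyed by an id, resolved when a reply arrives. This is a sketch only: createRemoteAskUser, PostQuestion, and handleReply are hypothetical names, and the actual transport (Slack, Teams, or anything else) is left to the caller.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical transport: delivers the question to a channel, ticket, etc.
type PostQuestion = (id: string, question: string) => void

function createRemoteAskUser(post: PostQuestion) {
  // Resolvers for questions that have been posted but not yet answered.
  const pending = new Map<string, (answer: string) => void>()
  let nextId = 0

  const onAskUser: AskUserCallback = (question) =>
    new Promise<string>((resolve) => {
      const id = `q-${nextId++}`
      pending.set(id, resolve)
      post(id, question)
    })

  // Call this from the webhook handler when a reply arrives.
  // Returns false if the id is unknown or was already answered.
  function handleReply(id: string, answer: string): boolean {
    const resolve = pending.get(id)
    if (!resolve) return false
    pending.delete(id)
    resolve(answer)
    return true
  }

  return { onAskUser, handleReply }
}
```

The returned onAskUser would be assigned to agent.onAskUser, while handleReply is wired into whatever receives the reply.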

Frequently Asked Questions

What happens if I don't set the onAskUser callback?

If agent.onAskUser remains undefined, the ask_user tool is automatically removed from the tool registry during execute() via this.tools.delete('ask_user'). This prevents the LLM from attempting to invoke a human interaction that cannot be fulfilled, ensuring the agent operates in fully automated mode.

Can I use ask_user with a remote API instead of a local UI?

Yes. The onAskUser callback accepts any async function returning a Promise<string>. You can implement API calls to external services, chat platforms, or ticketing systems within this callback. The agent will pause execution at await this.onAskUser(input.question) until your remote service resolves with the user's answer.
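Since the agent waits for as long as the promise stays pending, a remote callback may benefit from a timeout guard. The wrapper below is a sketch, not something the framework provides: withTimeout and its fallback string are hypothetical.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Wraps a callback so that it resolves with a fallback answer after `ms` milliseconds.
function withTimeout(
  inner: AskUserCallback,
  ms: number,
  fallback: string
): AskUserCallback {
  return (question) =>
    new Promise<string>((resolve, reject) => {
      const timer = setTimeout(() => resolve(fallback), ms)
      inner(question).then(
        (answer) => {
          clearTimeout(timer)
          resolve(answer)
        },
        (error) => {
          clearTimeout(timer)
          reject(error)
        }
      )
    })
}
```

Usage would look like agent.onAskUser = withTimeout(remoteAskUser, 60000, 'No answer received'), where remoteAskUser is your own callback. The fallback text is returned to the LLM like any other answer, so it is worth phrasing it so the model can tell that nobody replied.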

How do I customize the question rendering beyond the default panel?

To bypass the built-in Panel, assign your own function to agent.onAskUser rather than relying on the panel's handler. Your implementation can render questions with any framework (React, Vue, vanilla JS) and should resolve when the user submits input. The documentation example in packages/website/src/pages/docs/advanced/custom-ui/page.tsx demonstrates this using window.prompt.

Does the ask_user tool support validation or multiple input fields?

The current implementation in packages/core/src/tools/index.ts accepts a single question string parameter and expects a single string answer. For complex forms or validation, implement this logic inside your custom onAskUser callback—collect the necessary data through your UI, validate it, and return the serialized result as a string for the LLM to process.
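One way to layer validation and multiple fields on top of the single-string contract is sketched below. Everything here is illustrative: ContactFields, CollectFields, validate, and createFormAskUser are hypothetical names, and the regular expressions are deliberately simple.

```typescript
type AskUserCallback = (question: string) => Promise<string>

// Hypothetical shape of the data a custom form collects.
interface ContactFields {
  email: string
  phone: string
}

// Hypothetical collector: a real app would open a form and resolve on submit.
// `error` carries the previous validation message so the UI can display it.
type CollectFields = (question: string, error?: string) => Promise<ContactFields>

// Returns an error message, or undefined when the fields look valid.
function validate(fields: ContactFields): string | undefined {
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(fields.email)) return 'Invalid email address'
  if (!/^\+?\d{7,15}$/.test(fields.phone)) return 'Invalid phone number'
  return undefined
}

function createFormAskUser(collect: CollectFields, maxAttempts = 3): AskUserCallback {
  return async (question) => {
    let error: string | undefined
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      const fields = await collect(question, error)
      error = validate(fields)
      // Serialize the structured answer into the single string the tool expects.
      if (!error) return JSON.stringify(fields)
    }
    return `No valid answer after ${maxAttempts} attempts: ${error}`
  }
}
```

The LLM then receives the JSON string inside the "User answered: ..." message and can read the individual fields from it.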
