
How to Configure Custom LLM Prompts for Image Descriptions in MarkItDown

April 11, 2026 microsoft/markitdown

Set the llm_prompt parameter when initializing MarkItDown or calling convert() to override the default "Write a detailed caption for this image" instruction sent to multimodal LLMs.

MarkItDown, Microsoft's open-source document conversion library, generates image captions using multimodal large language models when you provide an llm_client and llm_model. According to the microsoft/markitdown source code, you can fully customize the text prompt used for these descriptions at global, per-call, or per-converter scope without
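As a rough illustration of the global and per-call scopes mentioned above, the following sketch passes llm_prompt both to the MarkItDown constructor and to convert(). The helper names effective_prompt and describe_image are hypothetical and not part of MarkItDown. The precedence shown (per-call over global over default) is an assumption about how the library resolves the prompt, and the default string is copied from the quote earlier in this article.

```python
from typing import Any, Optional

# Default instruction as quoted earlier in this article.
DEFAULT_PROMPT = "Write a detailed caption for this image"


def effective_prompt(global_prompt: Optional[str] = None,
                     call_prompt: Optional[str] = None) -> str:
    """Hypothetical helper, not part of MarkItDown.

    Models the assumed precedence: per-call > global > default.
    """
    if call_prompt:
        return call_prompt
    if global_prompt:
        return global_prompt
    return DEFAULT_PROMPT


def describe_image(path: str, llm_client: Any, llm_model: str,
                   global_prompt: Optional[str] = None,
                   call_prompt: Optional[str] = None) -> str:
    """Hypothetical wrapper: convert an image, optionally overriding the prompt."""
    # Imported lazily so effective_prompt() works without markitdown installed.
    from markitdown import MarkItDown

    init_kwargs = {"llm_client": llm_client, "llm_model": llm_model}
    if global_prompt is not None:
        init_kwargs["llm_prompt"] = global_prompt  # global scope
    md = MarkItDown(**init_kwargs)

    call_kwargs = {}
    if call_prompt is not None:
        call_kwargs["llm_prompt"] = call_prompt  # per-call scope
    result = md.convert(path, **call_kwargs)
    return result.text_content
```

With an OpenAI-compatible client, a call might look like describe_image("chart.png", client, "gpt-4o", call_prompt="Describe the axes and trend in one sentence."). The client object, model name, and file path here are placeholders.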
