How to Configure Custom LLM Prompts for Image Descriptions in MarkItDown
Set the llm_prompt parameter when initializing MarkItDown or calling convert() to override the default "Write a detailed caption for this image" instruction sent to multimodal LLMs.
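As a minimal sketch of the two scopes named above, the snippet below passes llm_prompt once at initialization and optionally again on a single convert() call. The helper names llm_options and describe_image are hypothetical and exist only for illustration. The OpenAI client, the "gpt-4o" model name, and the expectation that a per-call prompt takes precedence over the global one are assumptions, not details confirmed by this article.

```python
from typing import Any, Dict, Optional


def llm_options(client: Any, model: str, prompt: Optional[str] = None) -> Dict[str, Any]:
    """Collect the LLM-related keyword arguments for MarkItDown (hypothetical helper).

    Leaving `prompt` as None omits llm_prompt, so the library default applies.
    """
    options: Dict[str, Any] = {"llm_client": client, "llm_model": model}
    if prompt is not None:
        options["llm_prompt"] = prompt
    return options


def describe_image(path: str, global_prompt: str,
                   per_call_prompt: Optional[str] = None) -> str:
    """Convert one image, optionally overriding the prompt for this call only."""
    # Imported lazily so this module loads even without the packages installed.
    from markitdown import MarkItDown
    from openai import OpenAI

    client = OpenAI()  # assumption: OPENAI_API_KEY is set in the environment

    # Global scope: the prompt is set once when the converter is created.
    md = MarkItDown(**llm_options(client, "gpt-4o", global_prompt))

    # Per-call scope: pass llm_prompt to convert(); assumed to take precedence
    # over the value given at initialization.
    extra = {} if per_call_prompt is None else {"llm_prompt": per_call_prompt}
    result = md.convert(path, **extra)
    return result.text_content
```

Calling describe_image("chart.png", "Describe this image in one sentence.") would use the global prompt, while adding a third argument would apply a different prompt to that call only. Both require the markitdown and openai packages and a valid API key.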
MarkItDown, Microsoft's open-source document conversion library, generates image captions using multimodal large language models when you provide an llm_client and llm_model. According to the microsoft/markitdown source code, you can fully customize the text prompt used for these descriptions at global, per-call, or per-converter scope without