
How MarkItDown Handles Nested Conversions Within ZIP Files: Recursive Extraction Explained

April 11, 2026 microsoft/markitdown

MarkItDown processes nested ZIP files recursively: each archive entry becomes a new stream that passes back through the full converter pipeline, so an archive inside an archive is handled by the same code path, with no special-case logic and no fixed limit on nesting depth.

The microsoft/markitdown library treats ZIP archives as traversable containers rather than opaque blobs. The built-in ZipConverter extracts each entry and feeds it back into the core engine via convert_stream(). An entry that is itself a ZIP archive is therefore routed to ZipConverter again, which is how archives within archives are supported to arbitrary depth.
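The recursion described above can be illustrated with a minimal, standard-library-only sketch. This is not MarkItDown's actual code: the functions `convert_stream`, `convert_zip`, and `make_zip` below are hypothetical stand-ins, the "converter pipeline" is reduced to a ZIP check plus a plain-text fallback, and the heading format in the output is only illustrative.

```python
import io
import zipfile


def convert_stream(stream: io.BytesIO, name: str) -> str:
    """Hypothetical stand-in for the engine's dispatch step (not the real API)."""
    if zipfile.is_zipfile(stream):
        # The entry is itself an archive, so hand it to the ZIP handler again.
        stream.seek(0)
        return convert_zip(stream, name)
    # Fallback "converter" for this sketch: treat everything else as UTF-8 text.
    stream.seek(0)
    return stream.read().decode("utf-8", errors="replace")


def convert_zip(stream: io.BytesIO, name: str) -> str:
    """Convert every entry by sending it back through convert_stream()."""
    sections = [f"Content from the zip file `{name}`:"]
    with zipfile.ZipFile(stream) as archive:
        for entry in archive.namelist():
            if entry.endswith("/"):
                continue  # skip directory entries
            # Each entry becomes a fresh in-memory stream.
            entry_stream = io.BytesIO(archive.read(entry))
            body = convert_stream(entry_stream, entry)
            sections.append(f"## File: {entry}\n\n{body}")
    return "\n\n".join(sections)


def make_zip(files: dict[str, bytes]) -> bytes:
    """Helper for the demo: build an in-memory ZIP from a name -> bytes mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for entry, data in files.items():
            archive.writestr(entry, data)
    return buffer.getvalue()


# Build a ZIP that contains a text file and another ZIP, then convert it.
inner = make_zip({"note.txt": b"hello from the inner archive"})
outer = make_zip({"readme.txt": b"top level", "inner.zip": inner})
markdown = convert_stream(io.BytesIO(outer), "outer.zip")
print(markdown)
```

Because `convert_zip` never inspects what an entry contains and always defers to `convert_stream`, nesting needs no dedicated logic. In this sketch the practical limits are memory (each entry is read fully into a buffer) and Python's recursion limit.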

ZIP Detection and Acceptance

Before extraction begins, `
