# How MarkItDown Uses Magika for File Type Detection: Implementation Deep Dive

> Learn how MarkItDown uses Magika for file type detection, and how ML predictions are merged with filename data to create enriched `StreamInfo` objects that guide converter selection.

- Repository: [Microsoft/markitdown](https://github.com/microsoft/markitdown)
- Tags: internals
- Published: 2026-04-11

---

**MarkItDown creates a `magika.Magika()` instance at initialization and calls `identify_stream()` on every input file to generate ML-based content predictions that are merged with filename-based metadata, producing enriched `StreamInfo` objects that guide converter selection.**

Microsoft’s MarkItDown uses **Magika**, Google’s fast ML-based file-type detector, to make converter selection more accurate than extension checking alone. When you process a file through MarkItDown, the tool uses Magika to inspect the raw bytes and confirm or correct the file type before selecting the appropriate converter.
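The merge of a filename-based guess with a content-based prediction can be illustrated with a minimal sketch. It is not MarkItDown's actual code: `StreamInfoSketch`, `info_from_filename`, and `merge_with_prediction` are hypothetical names, and the rule that the content prediction wins on a conflict is an assumption made for illustration. The prediction is passed in as plain strings so the example runs without Magika installed; in practice it would come from the `identify_stream()` call described above.

```python
import mimetypes
import os
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class StreamInfoSketch:
    """Hypothetical stand-in for MarkItDown's StreamInfo (not the real class)."""
    filename: Optional[str] = None
    extension: Optional[str] = None
    mimetype: Optional[str] = None


def info_from_filename(filename: str) -> StreamInfoSketch:
    """Build a guess from the filename alone (extension plus stdlib MIME lookup)."""
    _, ext = os.path.splitext(filename)
    mimetype, _ = mimetypes.guess_type(filename)
    return StreamInfoSketch(
        filename=filename,
        extension=ext.lower() or None,
        mimetype=mimetype,
    )


def merge_with_prediction(
    base: StreamInfoSketch,
    predicted_mimetype: Optional[str],
    predicted_extension: Optional[str],
) -> StreamInfoSketch:
    """Merge a content-based prediction into a filename-based guess.

    Assumed policy (illustrative only): no prediction keeps the base guess,
    a matching prediction confirms it and fills a missing extension, and a
    conflicting prediction replaces the MIME type and extension.
    """
    if predicted_mimetype is None:
        return base
    if base.mimetype == predicted_mimetype:
        return replace(base, extension=base.extension or predicted_extension)
    return replace(
        base,
        mimetype=predicted_mimetype,
        extension=predicted_extension or base.extension,
    )


# A PDF saved under a misleading name: the content prediction corrects the guess.
corrected = merge_with_prediction(
    info_from_filename("notes.txt"), "application/pdf", ".pdf"
)
print(corrected)
```

The filename is preserved in every case, so a downstream converter selector could still see what the caller supplied alongside the detected type.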