# How Pyrefly Handles Import Resolution and Module Finding

> Discover how Pyrefly handles import resolution and module finding with its hierarchical search algorithm. Learn about directory caching, stub package handling, and legacy namespace support.

- Repository: [Meta/pyrefly](https://github.com/facebook/pyrefly)
- Tags: internals
- Published: 2026-05-21

---

**Pyrefly resolves imports through a hierarchical search algorithm implemented in [`pyrefly/lib/module/finder.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/finder.rs) that mirrors Python's import machinery while adding directory caching, stub package handling, and legacy namespace package support.**

Import resolution forms the foundation of static type checking, mapping module names to filesystem paths before type analysis begins. In the `facebook/pyrefly` repository, this critical functionality is centralized in the `pyrefly::module::finder` module, which reimplements Python's import semantics with optimizations for performance and stub compatibility. Understanding Pyrefly's import resolution reveals how the tool navigates complex package layouts, from namespace packages to third-party type stubs.

## The Import Resolution Pipeline

### Entry Point: `find_import_internal`

The public API for import resolution is `find_import_internal` (lines 1219‑1234 in [`pyrefly/lib/module/finder.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/finder.rs)). This function accepts a `ConfigFile` containing search paths, the target `ModuleName` to resolve, an optional `origin` path indicating which file performs the import, a `style_filter` for preferring `.pyi` over `.py` files, and a `DirEntryCache` for filesystem operations. It returns a `FindingOrError<ModulePath>` (lines 108‑124) that either contains the resolved filesystem path or a detailed error indicating why resolution failed.

### Hierarchical Search Order

Pyrefly queries module sources in a strict sequence defined by the configuration:

- Build-system-provided search paths
- Source-database cache (if configured)
- User-specified `search_path` directories
- Custom `typeshed` path (optional)
- Bundled standard-library `typeshed`
- Heuristic fallback search paths
- Site-package paths including third-party stubs
- Extra file extension handlers
- Implicit namespace package fallbacks

This ordering ensures that user configuration overrides system defaults while maintaining compatibility with Python's module search semantics.

### Component Resolution Chain

For each search location, the helper `find_module` (lines 786‑863) decomposes the target `ModuleName` into its first component and remaining submodules. It checks for a `-stubs` variant of the initial component (e.g., `foo-stubs` for package `foo`), then delegates to `find_module_components` (lines 692‑755). The final filesystem checks occur in `find_one_part_in_root` (lines 334‑425), which distinguishes between regular packages (directories with [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py)), single-file modules (`.py` or `.pyi`), and compiled extensions (`.pyc`, `.pyd`).

## Module Discovery and Filesystem Interaction

### Directory Caching with `DirEntryCache`

All filesystem operations flow through `DirEntryCache` (lines 441‑518), which stores directory listings in a `HashMap<PathBuf, Option<Arc<SmallMap<OsString, bool>>>>`. This eliminates redundant `stat` and `readdir` calls, reducing existence checks to O(1) lookups. The cache proves essential for large repositories or remote filesystems like EdenFS, where repeated directory traversals would create significant I/O overhead.

### Handling Namespace Packages

When multiple search roots contain directories with the same package name, `NamespaceAccumulator` (lines 403‑447) merges these paths to emulate Python's `__path__` extension semantics. Pyrefly distinguishes between two namespace variants:

- **Legacy namespace packages**: Detected by `is_pkgutil_namespace` (lines 96‑122), which reads the first 4KiB of [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py) and runs a regex to identify `pkgutil.extend_path` usage. The first discovered [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py) wins, but all directories contribute to the package path.
- **Implicit namespace packages**: Directories without an [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py) file (PEP 420), where all matching directories across roots are accumulated into a single logical package.

### Stub Package Resolution

The resolver prioritizes type information through `resolve_third_party_stub` (lines 657‑692). It first searches for packages with a `-stubs` suffix (e.g., `requests-stubs`), then falls back to third-party typeshed stubs via [`typeshed_third_party.rs`](https://github.com/facebook/pyrefly/blob/main/typeshed_third_party.rs), and finally checks the bundled standard-library typeshed. The `FindResult::best_result` method (lines 889‑906) enforces precedence rules: regular packages outrank legacy namespaces, which outrank single-file modules, which outrank compiled extensions.

### Extra Extension Handling

For non-standard filename patterns where dots act as module separators (e.g., `a.b.c.cinc`), `find_extra_extension_module` (lines 1006‑1060) provides specialized handling that maps these dotted filenames back to module path components.

## Resolving Imports in Practice

The following Rust example demonstrates how to programmatically resolve a module import using Pyrefly's public API:

```rust
use pyrefly::config::config::ConfigFile;
use pyrefly::module::finder::find_import_internal;
use pyrefly_python::module_name::ModuleName;
use pyrefly::state::state::TransactionTimingCounters;

// Load a configuration (e.g., from pyproject.toml)
let config = ConfigFile::load_default().expect("load config");

// The module we want to resolve, e.g., `numpy.linalg`
let target = ModuleName::from_str("numpy.linalg");

// Optional: the file that contains the import statement
let origin = None;

// We prefer interface (`.pyi`) files but fall back to implementation (`.py`)
let style_filter = None;

// Collect phantom-path diagnostics (optional)
let mut phantom_paths = None;

// Directory cache to speed up repeated lookups
let dir_cache = pyrefly::module::finder::DirEntryCache::new(true);

// Timing counters for profiling (optional)
let timing = None;

// Resolve the import
let result = find_import_internal(
    &config,
    target,
    origin,
    style_filter,
    &mut phantom_paths,
    &dir_cache,
    timing,
);

match result {
    pyrefly::state::loader::FindingOrError::Ok(path) => {
        println!("Found module at {}", path.display());
    }
    pyrefly::state::loader::FindingOrError::Error(err) => {
        eprintln!("Import resolution failed: {:?}", err);
    }
}

```

This snippet demonstrates the typical usage path: load a `ConfigFile`, construct a `ModuleName`, and invoke `find_import_internal`. The function internally traverses the hierarchical search order and returns either a concrete filesystem path or a `FindError` indicating issues like `Ignored` or `MissingSource`.

## Key Source Files and Architecture

| File | Role |
|------|------|
| **[`pyrefly/lib/module/finder.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/finder.rs)** | Core import-resolution engine implementing search logic, caching, and namespace handling |
| **[`pyrefly/lib/module/typeshed.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/typeshed.rs)** | Provides bundled standard-library stub lookup |
| **[`pyrefly/lib/module/third_party.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/third_party.rs)** | Handles third-party stub discovery via `get_bundled_third_party` |
| **[`pyrefly/lib/module/typeshed_third_party.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/typeshed_third_party.rs)** | Looks up third-party stubs shipped with Pyrefly |
| **[`pyrefly/lib/module/parse.rs`](https://github.com/facebook/pyrefly/blob/main/pyrefly/lib/module/parse.rs)** | Parses discovered files into ASTs after resolution |

## Summary

- **Pyrefly's import resolution** centers on `find_import_internal` in [`finder.rs`](https://github.com/facebook/pyrefly/blob/main/finder.rs), implementing Python's import semantics with a configurable hierarchical search path.
- **Performance optimization** comes from `DirEntryCache` (lines 441‑518), which minimizes filesystem operations through in-memory caching of directory listings.
- **Namespace package support** uses `NamespaceAccumulator` (lines 403‑447) to handle both legacy `pkgutil.extend_path` packages and PEP 420 implicit namespace packages.
- **Stub prioritization** follows a strict order: `-stubs` packages, third-party typeshed, bundled typeshed, then implementation files, managed by `resolve_third_party_stub`.
- **Precedence rules** in `FindResult::best_result` (lines 889‑906) ensure regular packages outrank legacy namespaces, which outrank single-file modules and compiled extensions.

## Frequently Asked Questions

### How does Pyrefly handle namespace packages differently from standard Python?

Pyrefly's `NamespaceAccumulator` (lines 403‑447) explicitly tracks multiple directory roots containing the same package name, building either a `LegacyNamespacePackage` (when `is_pkgutil_namespace` detects `pkgutil.extend_path` in the first 4KiB of [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py)) or an `ImplicitNamespacePackage` (no [`__init__.py`](https://github.com/facebook/pyrefly/blob/main/__init__.py) present). This matches Python's `__path__` extension semantics while maintaining compatibility with both legacy and PEP 420 namespace packages.

### What is the performance impact of Pyrefly's directory caching?

The `DirEntryCache` (lines 441‑518) stores directory entries in a `HashMap<PathBuf, Option<Arc<SmallMap<OsString, bool>>>>`, reducing `stat` and `readdir` operations to O(1) lookups. This optimization is critical for large repositories or remote filesystems like EdenFS, where repeated directory traversals would create significant overhead during import resolution.

### How does Pyrefly prioritize between `.py` and `.pyi` stub files?

The `style_filter` parameter in `find_import_internal` allows explicit preference for interface (`.pyi`) or implementation (`.py`) files. When resolving third-party modules, the system first checks for packages with a `-stubs` suffix, then consults bundled typeshed definitions via `resolve_third_party_stub` (lines 657‑692) before falling back to standard module discovery.

### Where does Pyrefly search for modules during resolution?

Pyrefly follows a strict hierarchical order defined in `find_import_internal` (lines 1219‑1234): build-system paths, source-database cache, user `search_path`, custom typeshed, bundled standard-library typeshed, heuristic paths, site-packages (including third-party stubs), extra extensions, and implicit namespace fallbacks. This sequence ensures compatibility with Python's module search semantics while allowing configuration overrides.