How Transformers Handles Model Hub Caching and Offline Loading

When you call from_pretrained, the library first checks the local cache at ~/.cache/huggingface/hub. If the requested file is already cached, it is loaded from disk; otherwise it is downloaded from the Hub. In offline mode no download is attempted: cached files are used, and a missing file raises a clear error.

The Hugging Face Transformers library implements a robust two-tier system for model hub caching and offline loading that ensures reproducible, air-gapped deployments. Every call to AutoModel.from_pretrained, AutoTokenizer, or cached_file ultimately routes through src/transformers/utils/hub.py, where the library decides whether to read from disk or fetch from the network.

Cache Architecture and Directory Layout

The cache root is determined by environment variables and constants defined in the codebase. By default, files live in ~/.cache/huggingface/hub, controlled by constants.HF_HUB_CACHE and overridable via the HF_HOME or HF_HUB_CACHE environment variables.

In src/transformers/utils/hub.py (lines 87‑93), the library resolves the cache directory:


# From hub.py L87-L93

import os

cache_dir = os.environ.get("HF_HOME", os.path.expanduser("~/.cache/huggingface"))
cache_dir = os.path.join(cache_dir, "hub")
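To illustrate how the overrides might interact, here is a minimal sketch that resolves the cache root from a mapping of environment variables. The helper name resolve_hub_cache is hypothetical, and the precedence shown (an explicit HF_HUB_CACHE first, then HF_HOME with hub appended, then the default) is an assumption for illustration rather than a copy of the library code.

```python
import os


def resolve_hub_cache(env=None):
    """Return the hub cache directory implied by the environment (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: an explicit HF_HUB_CACHE takes priority over HF_HOME.
    explicit = env.get("HF_HUB_CACHE")
    if explicit:
        return os.path.expanduser(explicit)
    # Otherwise use HF_HOME (or its default location) and append "hub".
    home = env.get("HF_HOME", os.path.join("~", ".cache", "huggingface"))
    return os.path.join(os.path.expanduser(home), "hub")


print(resolve_hub_cache({"HF_HOME": "/data/hf"}))
```

Passing the environment in as a plain dictionary keeps the sketch easy to test without modifying the real process environment.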

The on-disk layout mirrors the Hub repository structure:


~/.cache/huggingface/hub/
├─ models--<repo_id>/
│   ├─ snapshots/
│   │   └─ <revision_hash>/
│   │       ├─ pytorch_model.bin
│   │       ├─ config.json
│   │       └─ …
└─ <repo_type>s--<repo_id>/

This design allows try_to_load_from_cache (hub.py lines 100‑108) to locate files using only the repo id, filename, and revision hash.
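The mapping from repo id to folder name can be sketched in a few lines. The helper snapshot_file_path below is hypothetical; it only builds the expected path and does not check that the file exists. It assumes that slashes in the repo id are replaced by "--" and that the folder is prefixed with the pluralised repo type.

```python
from pathlib import PurePosixPath


def snapshot_file_path(cache_root, repo_id, revision_hash, filename, repo_type="model"):
    """Build the expected location of a cached file (hypothetical helper)."""
    # "google-bert/bert-base-uncased" becomes "models--google-bert--bert-base-uncased".
    folder = f"{repo_type}s--" + repo_id.replace("/", "--")
    return PurePosixPath(cache_root) / folder / "snapshots" / revision_hash / filename


path = snapshot_file_path(
    "/home/user/.cache/huggingface/hub",
    "google-bert/bert-base-uncased",
    "<revision_hash>",
    "config.json",
)
print(path)
```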

The Caching Pipeline

Every file fetch follows a strict resolution order implemented in cached_files (src/transformers/utils/hub.py, lines 10‑33 and 47‑108):

  1. Local directory check – If path_or_repo_id points to a local path, it is used directly (lines 70‑86).
  2. Cache lookup – try_to_load_from_cache checks whether the exact revision already exists in snapshots/ (lines 94‑110).
  3. Offline guard – is_offline_mode() (lines 361‑364) forces local_files_only=True when HF_HUB_OFFLINE or TRANSFORMERS_OFFLINE is set, skipping any network call.
  4. Network download – If allowed, hf_hub_download (single file) or snapshot_download (full repo) writes directly into the cache folder (lines 18‑28).
  5. Return resolved path – The final list of local paths is returned to the caller.

Error handling (lines 121‑128) produces descriptive OSError messages when a file is missing in offline mode, telling users exactly which file is required and how to pre-download it.
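The resolution order above can be sketched as a small, self-contained function. This is an illustration of the described flow, not the library's implementation: resolve_file, cache_lookup, and download are hypothetical names, with the two callables standing in for the cache check and the network fetch.

```python
import os


def resolve_file(path_or_repo_id, filename, cache_lookup, download, offline=False):
    """Mimic the resolution order described above (illustrative only).

    cache_lookup and download are hypothetical callables taking
    (repo_id, filename); cache_lookup returns a path or None.
    """
    # 1. A local directory is used directly.
    if os.path.isdir(path_or_repo_id):
        candidate = os.path.join(path_or_repo_id, filename)
        if os.path.isfile(candidate):
            return candidate
        raise OSError(f"{path_or_repo_id} does not contain {filename}")

    # 2. Look in the cache before touching the network.
    cached = cache_lookup(path_or_repo_id, filename)
    if cached is not None:
        return cached

    # 3. Offline guard: never download, fail with a descriptive message.
    if offline:
        raise OSError(
            f"{filename} for {path_or_repo_id} is not cached and offline mode "
            "is enabled; download it once while online."
        )

    # 4. and 5. Download into the cache and return the resolved path.
    return download(path_or_repo_id, filename)


# Usage with an in-memory stand-in for the cache.
fake_cache = {("org/model", "config.json"): "/cache/config.json"}


def lookup(repo_id, name):
    return fake_cache.get((repo_id, name))


print(resolve_file("org/model", "config.json", lookup, download=None, offline=True))
```

Injecting the lookup and download steps as callables makes the ordering easy to exercise without any network access.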

Offline Mode Implementation

Offline behavior is controlled by two environment variables checked in is_offline_mode():

  • HF_HUB_OFFLINE – Modern, preferred switch.
  • TRANSFORMERS_OFFLINE – Legacy alias for backward compatibility.

Both accept truthy values ("1", "true", "yes"). When active, the library automatically sets local_files_only=True in cached_files (hub.py lines 361‑364), so files are resolved from the cache only and no network request is made.
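A minimal sketch of this check follows. The function names offline_requested and effective_local_files_only are hypothetical, and the case-insensitive comparison against the three values listed above is an assumption about how the flags are parsed.

```python
import os

# Assumption: values are compared case-insensitively against this set.
TRUTHY = {"1", "true", "yes"}


def offline_requested(env=None):
    """Return True if either offline variable holds a truthy value."""
    env = os.environ if env is None else env
    return any(
        env.get(name, "").strip().lower() in TRUTHY
        for name in ("HF_HUB_OFFLINE", "TRANSFORMERS_OFFLINE")
    )


def effective_local_files_only(local_files_only=False, env=None):
    """Offline mode forces local-only loading regardless of the argument."""
    return local_files_only or offline_requested(env)


print(effective_local_files_only(False, {"HF_HUB_OFFLINE": "1"}))
```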

You can also trigger local-only behavior per-call without setting environment variables:

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "google-bert/bert-base-uncased",
    local_files_only=True,
)

Practical Usage Examples

Loading a model with automatic caching

from transformers import AutoModel

# First call downloads to ~/.cache/huggingface/hub

model = AutoModel.from_pretrained("google-bert/bert-base-uncased")

# Subsequent calls reuse the cached files instead of downloading them again

model = AutoModel.from_pretrained("google-bert/bert-base-uncased")

Enabling global offline mode

export HF_HUB_OFFLINE=1

# or

export TRANSFORMERS_OFFLINE=1

Then, in Python:

from transformers import AutoTokenizer

# Raises OSError if files are not already cached

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

Inspecting cache contents programmatically

from huggingface_hub import snapshot_download

# Returns the local path inside the cache

cache_path = snapshot_download(
    "google-bert/bert-base-uncased",
    local_files_only=True,
)
print(cache_path)

# Output: /home/user/.cache/huggingface/hub/models--google-bert--bert-base-uncased/snapshots/<hash>
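If you only want to see which models and revisions are present, the directory layout can also be walked with the standard library. The sketch below uses a hypothetical helper, list_cached_snapshots, and runs against a temporary directory that imitates the layout; point it at your real cache root to inspect actual contents.

```python
import os
import tempfile


def list_cached_snapshots(cache_root):
    """Map each cached model folder to its snapshot revision names."""
    result = {}
    if not os.path.isdir(cache_root):
        return result
    for entry in sorted(os.listdir(cache_root)):
        snapshots = os.path.join(cache_root, entry, "snapshots")
        if entry.startswith("models--") and os.path.isdir(snapshots):
            # Reversing "--" to "/" is a simplification: it is ambiguous
            # for repo names that themselves contain "--".
            repo_id = entry[len("models--"):].replace("--", "/")
            result[repo_id] = sorted(os.listdir(snapshots))
    return result


# Demonstration on a throwaway directory that imitates the cache layout.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(
        os.path.join(root, "models--google-bert--bert-base-uncased", "snapshots", "abc")
    )
    found = list_cached_snapshots(root)
    print(found)
```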

Using cached_file directly

from transformers.utils.hub import cached_file

config_path = cached_file(
    "google-bert/bert-base-uncased",
    "config.json",
    revision="main",
)
print(config_path)  # Local path to config.json; the file is downloaded first if it is not cached and you are online

Key Source Files

| File | Role |
| --- | --- |
| src/transformers/utils/hub.py | Central implementation of cached_files, cached_file, try_to_load_from_cache, and offline-mode guards. |
| src/transformers/modeling_utils.py | Uses is_offline_mode to protect weight-loading logic in from_pretrained. |
| src/transformers/tokenization_utils_tokenizers.py | Applies offline guards when fetching tokenizer files. |
| src/transformers/processing_utils.py | Implements offline checks for image and feature-extraction assets. |
| src/transformers/video_processing_utils.py | Applies offline guards for video-related processing files. |
| tests/utils/test_offline.py | Test suite verifying offline behavior and environment-variable handling. |

Summary

  • The Transformers library stores all downloaded artifacts in ~/.cache/huggingface/hub (configurable via HF_HOME), using a structured layout that maps repo IDs to snapshot folders.
  • Offline mode is triggered by HF_HUB_OFFLINE or TRANSFORMERS_OFFLINE, forcing local_files_only=True and preventing any HTTP requests.
  • The cached_files function in src/transformers/utils/hub.py orchestrates the resolution order: local path → cache lookup → (optional) download → error handling.
  • Users can force cache-only behavior per-call with local_files_only=True, or inspect cache contents using huggingface_hub.snapshot_download.

Frequently Asked Questions

How do I completely disable network access in Transformers?

Set the environment variable HF_HUB_OFFLINE=1 (or the legacy TRANSFORMERS_OFFLINE=1) before importing the library. This forces every from_pretrained call to read only from the local cache at ~/.cache/huggingface/hub and raises a clear error if files are missing.

Where does Transformers store downloaded models?

By default, files live in ~/.cache/huggingface/hub. You can change this by setting the HF_HOME environment variable; the library appends hub to that path. The internal layout uses models--<repo_id>/snapshots/<revision_hash>/ to keep different versions isolated.

What happens if a file is missing while in offline mode?

The library raises an OSError with a descriptive message indicating exactly which file is required and suggesting that you download it while online first. This error originates in the exception handling block of cached_files in src/transformers/utils/hub.py (lines 121‑128).

Can I use a specific revision or commit hash when loading offline?

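Yes. When you pass a revision to from_pretrained, the cache lookup in try_to_load_from_cache searches for a snapshot folder matching that revision. Snapshot folders are named after the full commit hash, so pass the complete hash rather than an abbreviation such as "abc123"; a branch or tag name such as "main" also works if it was recorded in the cache when you downloaded. As long as you previously downloaded that revision while online, offline loading will succeed.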
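The lookup can be pictured with a short standard-library sketch. It assumes the repo folder contains a refs/ directory whose files map a branch or tag name to a commit hash, alongside the snapshots/ directory; the helper resolve_revision is hypothetical and is exercised here against a temporary directory.

```python
import os
import tempfile


def resolve_revision(repo_folder, revision):
    """Return the snapshot directory for a branch, tag, or commit hash, or None."""
    ref_file = os.path.join(repo_folder, "refs", revision)
    if os.path.isfile(ref_file):
        # A ref file stores the commit hash that the branch or tag pointed to.
        with open(ref_file, encoding="utf-8") as handle:
            revision = handle.read().strip()
    snapshot = os.path.join(repo_folder, "snapshots", revision)
    return snapshot if os.path.isdir(snapshot) else None


with tempfile.TemporaryDirectory() as repo:
    commit = "0123456789abcdef0123456789abcdef01234567"
    os.makedirs(os.path.join(repo, "snapshots", commit))
    os.makedirs(os.path.join(repo, "refs"))
    with open(os.path.join(repo, "refs", "main"), "w", encoding="utf-8") as handle:
        handle.write(commit)

    by_branch = resolve_revision(repo, "main")   # resolved through refs/main
    by_hash = resolve_revision(repo, commit)     # matched directly in snapshots/
    missing = resolve_revision(repo, "abc123")   # abbreviated hash: no match
    print(by_branch == by_hash, missing)
```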
