Model Weight Loading in Transformers: Safetensors vs Legacy PyTorch .bin Formats

Question

Explore safetensors vs legacy PyTorch bin formats for model weight loading in Hugging Face Transformers. Learn about zero-copy memory mapping and fallback mechanisms.

Accepted Answer

Transformers loads model weights from .safetensors files by default using zero-copy memory mapping, and falls back to legacy .bin checkpoints via torch.load, guarded by a PyTorch 2.6+ version check, when safetensors files are absent or when the caller explicitly disables the safetensors preference.

When you instantiate a pretrained model in the Hugging Face Transformers library, the framework runs a weight loading pipeline that must safely deserialize billions of parameters from disk. According to the source code, the library prioritizes the Safetensors format for its security and memory efficiency, while maintaining backward compatibility with legacy PyTorch checkpoints through conditional fallback logic in its loading code, including trainer.py.

How Transformers Detects Checkpoint File Formats

The model weight loading process begins with file detection logic that scans the checkpoint directory for filenames defined as constants in the library.

File Detection Logic in trainer.py

In trainer.py, a loading method and the checkpoint resume logic implement the primary detection branch. The code checks for the Safetensors weights file (a filename constant defined at lines 263‑264 of its module) before considering legacy alternatives. If that file exists, the loader immediately selects the Safetensors path. Otherwise, it falls back to checking for the legacy checkpoint files.

The prefer_safe Flag

The detection logic respects a boolean prefer_safe parameter that defaults to True (prefer Safetensors) throughout the codebase. When prefer_safe=False is passed to the loading functions, the library bypasses Safetensors files even if they exist, forcing model weight loading through the legacy pathway.

Loading Mechanisms: Zero-Copy vs Pickle Deserialization

Once the format is detected, the library invokes fundamentally different deserialization mechanisms that differ in security, memory usage, and speed.

Safetensors Zero-Copy Loading

For .safetensors files, Transformers calls the safetensors loader. This implementation performs a zero-copy memory mapping operation that reads tensor data directly from disk without executing arbitrary code or creating unnecessary memory copies.
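To make the idea of code-free, memory-mapped loading concrete, here is a minimal sketch that uses only the Python standard library. It assumes the publicly documented safetensors layout (an 8-byte little-endian header length, a JSON header carrying dtype, shape, and data_offsets, then the raw bytes) and handles only one-dimensional float32 tensors. The names write_tiny_checkpoint and read_tensor are hypothetical; they are not part of Transformers or the safetensors library, which implements this far more completely.

```python
import json
import mmap
import os
import struct
import tempfile


def write_tiny_checkpoint(path, tensors):
    """Write 1-D float32 tensors in a safetensors-style layout (illustrative only)."""
    header, payload = {}, b""
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the raw byte buffer.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)                          # JSON metadata
        f.write(payload)                               # raw tensor bytes


def read_tensor(path, name):
    """Memory-map the file and decode a single tensor by name."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        (header_len,) = struct.unpack("<Q", mm[:8])
        header = json.loads(mm[8 : 8 + header_len].decode("utf-8"))
        entry = header[name]
        if entry["dtype"] != "F32":
            raise ValueError("this sketch only handles float32 tensors")
        start, end = entry["data_offsets"]
        data_start = 8 + header_len  # the byte buffer begins right after the header
        count = (end - start) // 4   # four bytes per float32 value
        # Only the mapped region that is unpacked here needs to be read from disk.
        return list(struct.unpack_from(f"<{count}f", mm, data_start + start))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint = os.path.join(tmp, "tiny.safetensors")
        write_tiny_checkpoint(checkpoint, {"bias": [1.0, 2.5], "scale": [-3.0]})
        print(read_tensor(checkpoint, "bias"))
```

Because the reader only parses JSON and unpacks bytes at known offsets, nothing stored in the file can run code, and the operating system pages in only the regions that are actually accessed.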
The Safetensors format stores only raw tensor buffers and metadata, eliminating the Python pickle deserialization attack surface entirely. This path requires no version checks or safety guards because the file format itself is limited to tensor data and its metadata.

PyTorch .bin with Safety Guards

For legacy .bin checkpoints, the library uses torch.load. Before executing this call, however, Transformers runs a safety check (lines 63‑71 of its module). This check enforces PyTorch version ≥ 2.6 because of CVE‑2025‑32434, a critical remote code execution vulnerability in torch.load that affects earlier PyTorch releases even when weights_only=True is set. If the installed PyTorch version is older, the check raises an error, preventing potentially unsafe model weight loading.

Security and Performance Implications

The divergence between these two model weight loading pathways has significant operational consequences for ML pipelines.

CVE-2025-32434 and torch.load Restrictions

The requirement for PyTorch 2.6+ when loading .bin files stems from a deserialization vulnerability in torch.load tracked as CVE‑2025‑32434. The version guard ensures that users cannot accidentally execute malicious code embedded in legacy checkpoint files on vulnerable PyTorch versions. Safetensors checkpoints are immune to this vulnerability because they bypass Python's pickle mechanism entirely, using a custom binary format that only stores tensor shapes, dtypes, and raw byte buffers.

Memory Efficiency Benefits

Safetensors provides memory-mapped file loading, allowing the operating system to load tensor pages on demand rather than copying the entire checkpoint into RAM before transferring it to the GPU. This reduces peak memory consumption during model weight loading, particularly for large models such as LLMs, where checkpoints may exceed 100 GB. The legacy format requires full deserialization into Python objects before the state dict can be applied to the model, consuming additional memory and CPU cycles during unpickling.
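The selection and guard logic described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the filenames model.safetensors and pytorch_model.bin are assumed conventions, and pick_weights_file, torch_load_allowed, and guard_legacy_load are hypothetical helper names. Unlike a real loader, the sketch does not handle sharded checkpoints.

```python
import os
import re
import tempfile

# Assumed filename conventions; the answer does not show the actual constants.
SAFE_WEIGHTS_NAME = "model.safetensors"
WEIGHTS_NAME = "pytorch_model.bin"


def pick_weights_file(checkpoint_dir, prefer_safe=True):
    """Return (format, path), preferring safetensors unless prefer_safe is False."""
    safe_path = os.path.join(checkpoint_dir, SAFE_WEIGHTS_NAME)
    bin_path = os.path.join(checkpoint_dir, WEIGHTS_NAME)
    if prefer_safe and os.path.isfile(safe_path):
        return "safetensors", safe_path
    if os.path.isfile(bin_path):
        return "bin", bin_path
    # Simplification: this sketch does not fall back to safetensors here.
    raise FileNotFoundError(f"no usable weights file found in {checkpoint_dir}")


def torch_load_allowed(torch_version):
    """True if the version string denotes PyTorch 2.6 or newer."""
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {torch_version!r}")
    # Compare as integer tuples so that "2.10" sorts after "2.6".
    return (int(match.group(1)), int(match.group(2))) >= (2, 6)


def guard_legacy_load(torch_version):
    """Refuse to proceed with a pickle-based load on versions older than 2.6."""
    if not torch_load_allowed(torch_version):
        raise ValueError(
            f"PyTorch {torch_version} is older than 2.6; refusing to load a .bin "
            "checkpoint because of CVE-2025-32434"
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for filename in (SAFE_WEIGHTS_NAME, WEIGHTS_NAME):
            open(os.path.join(tmp, filename), "wb").close()
        print(pick_weights_file(tmp)[0])
        print(pick_weights_file(tmp, prefer_safe=False)[0])
    guard_legacy_load("2.6.0")  # passes silently on a new enough version
```

In real use the version string would come from torch.__version__; it is passed in as an argument here so the sketch runs without PyTorch installed.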
Practical Code Examples

Loading with Default Safetensors Preference

When a model is loaded with default settings, Transformers checks for the Safetensors weights file in the cache directory and invokes the Safetensors loader if it is found.

Forcing Legacy .bin Format

Disabling the Safetensors preference triggers the fallback branch in the loading code, which requires PyTorch 2.6+ to pass the version validation.

Manual Safetensors Loading

Loading a .safetensors file directly with the safetensors library demonstrates the zero-copy loading mechanism that underlies the automatic pipeline.

Safe Legacy Loading with Version Check

Checking the PyTorch version before calling torch.load on a .bin file mirrors the internal safety logic that protects against CVE‑2025‑32434.

Summary

- Default Behavior: Transformers automatically prefers Safetensors for model weight loading, using the safetensors loader for zero-copy, memory-mapped access.
- Security Model: Safetensors eliminates pickle deserialization vulnerabilities entirely, while legacy .bin files require PyTorch 2.6+ to mitigate CVE‑2025‑32434, enforced by a version check before torch.load.
- File Detection: The library checks for filename constants and implements the selection logic in its loading code, including trainer.py.
- Backward Compatibility: Setting prefer_safe=False forces legacy loading, and the library maintains support for both formats in sharded and non-sharded checkpoints.

Frequently Asked Questions

What is the default format for model weight loading in Transformers?

The default format is Safetensors.

Why does Transformers require PyTorch 2.6 for .bin files?

Can I convert existing .bin checkpoints to safetensors?

How does sharding work with safetensors vs .bin formats?
