# Model Weight Loading in Transformers: Safetensors vs Legacy PyTorch .bin Formats

> Explore safetensors vs legacy PyTorch bin formats for model weight loading in Hugging Face Transformers. Learn about zero-copy memory mapping and fallback mechanisms.

- Repository: [huggingface/transformers](https://github.com/huggingface/transformers)
- Tags: deep-dive
- Published: 2026-02-21

---

**Transformers loads model weights from `.safetensors` files by default using zero-copy memory mapping via `safetensors.torch.load_file()`, but falls back to legacy `.bin` checkpoints via `torch.load()` with PyTorch 2.6+ safety guards when safetensors files are absent or when safetensors loading is explicitly disabled (`use_safetensors=False` in `from_pretrained()`, `prefer_safe=False` in `load_sharded_checkpoint()`).**

When you instantiate a model using `from_pretrained()` in the Hugging Face Transformers library, the framework executes a complex model weight loading pipeline that must securely deserialize billions of parameters from disk. According to the `huggingface/transformers` source code, the library prioritizes the **Safetensors** format for its security and memory efficiency, while maintaining backward compatibility with legacy **PyTorch `.bin`** checkpoints through conditional fallback logic in [`src/transformers/trainer.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py) and [`src/transformers/core_model_loading.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/core_model_loading.py).

## How Transformers Detects Checkpoint File Formats

The model weight loading process begins with file detection logic that scans the checkpoint directory for specific filename constants defined in [`src/transformers/utils/__init__.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/__init__.py).

### File Detection Logic in trainer.py

In [`src/transformers/trainer.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py), the `_load_best_model` method and checkpoint resume logic implement the primary detection branch. The code checks for `SAFE_WEIGHTS_NAME` (defined as `"model.safetensors"` at lines 263-264 of [`src/transformers/utils/__init__.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/__init__.py)) before considering legacy alternatives.

If `os.path.isfile(safe_weights_file)` evaluates to `True`, the loader immediately selects the Safetensors path. Otherwise, it falls back to checking for `pytorch_model.bin` or `adapter_model.bin`.
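The priority order described above can be sketched in a few lines. This is a minimal illustration, not the library's code: `detect_weights_file` is a hypothetical helper, the two filename constants are redefined locally so the snippet runs without Transformers installed, and the `adapter_model.bin` branch is left out for brevity.

```python
import os
import tempfile

# Filename constants as described in the article, redefined here so the
# sketch does not depend on transformers being installed.
SAFE_WEIGHTS_NAME = "model.safetensors"
WEIGHTS_NAME = "pytorch_model.bin"


def detect_weights_file(checkpoint_dir: str) -> tuple[str, str]:
    """Hypothetical helper: return (format, path) for the first weights file found."""
    safe_path = os.path.join(checkpoint_dir, SAFE_WEIGHTS_NAME)
    if os.path.isfile(safe_path):
        # Safetensors wins whenever it is present.
        return "safetensors", safe_path
    bin_path = os.path.join(checkpoint_dir, WEIGHTS_NAME)
    if os.path.isfile(bin_path):
        return "bin", bin_path
    raise FileNotFoundError(f"No weights file found in {checkpoint_dir}")


with tempfile.TemporaryDirectory() as checkpoint_dir:
    # Start with only a legacy checkpoint, then add a safetensors file.
    open(os.path.join(checkpoint_dir, WEIGHTS_NAME), "wb").close()
    print(detect_weights_file(checkpoint_dir)[0])
    open(os.path.join(checkpoint_dir, SAFE_WEIGHTS_NAME), "wb").close()
    print(detect_weights_file(checkpoint_dir)[0])
```

Checking the safe filename first means a directory that contains both formats never reaches the pickle-based path unless the caller asks for it.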

### The prefer_safe Flag

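The detection logic respects a boolean `prefer_safe` parameter on `load_sharded_checkpoint()`, which defaults to `True`. Note that `prefer_safe` is not a `from_pretrained()` argument; the equivalent switch there is `use_safetensors`. When `use_safetensors=False` is passed to `from_pretrained()`, or `prefer_safe=False` to `load_sharded_checkpoint()`, the library bypasses Safetensors files even if they exist, forcing model weight loading through the legacy `.bin` pathway.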

## Loading Mechanisms: Zero-Copy vs Pickle Deserialization

Once the format is detected, the library invokes fundamentally different deserialization mechanisms that impact security, memory usage, and speed.

### Safetensors Zero-Copy Loading

For `.safetensors` files, Transformers calls `safetensors.torch.load_file(<path>, device="cpu")`. This implementation performs a **zero-copy** memory mapping operation that reads tensor data directly from disk without executing arbitrary code or creating unnecessary memory copies.

The Safetensors format stores only raw tensor buffers and metadata, eliminating the Python pickle deserialization attack surface entirely. This path requires no version checks or safety guards because the file format itself is strictly limited to numerical data.
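To make "raw tensor buffers and metadata" concrete, the sketch below builds a tiny file in the safetensors layout (an 8-byte little-endian header length, a JSON header, then the raw bytes) and parses the header back using only the standard library. Both helper functions are hypothetical and handle only 1-D float32 tensors; real files should be written with the `safetensors` package.

```python
import json
import struct


def build_safetensors_bytes(tensors: dict[str, list[float]]) -> bytes:
    """Hypothetical helper: serialize 1-D float32 tensors in the safetensors layout."""
    header, buffers, offset = {}, [], 0
    for name, values in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": [len(values)],
            # Offsets are relative to the start of the data section.
            "data_offsets": [offset, offset + len(raw)],
        }
        buffers.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: u64 header length, JSON header, concatenated tensor bytes.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(buffers)


def read_header(blob: bytes) -> dict:
    """Parse only the JSON header; no tensor data is decoded and no code is executed."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len])


blob = build_safetensors_bytes({"bias": [0.5, -1.0], "scale": [2.0]})
header = read_header(blob)
for name, entry in header.items():
    print(name, entry["dtype"], entry["shape"], entry["data_offsets"])
```

Everything a loader needs (dtype, shape, byte range) is plain JSON, which is why no unpickling step is involved.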

### PyTorch .bin with Safety Guards

For legacy `.bin` checkpoints, the library uses `torch.load(<path>, map_location="cpu", weights_only=True)`. However, before executing this call, Transformers runs `check_torch_load_is_safe()` from [`src/transformers/utils/import_utils.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/import_utils.py) (lines 63‑71).

This safety function enforces **PyTorch version ≥ 2.6** because of CVE-2025-32434, a critical remote code execution vulnerability in `torch.load` that affects PyTorch 2.5.1 and earlier even when `weights_only=True` is set. If the installed PyTorch version is older, the function raises a `ValueError`, preventing potentially unsafe model weight loading.
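The shape of such a version guard can be sketched without importing PyTorch. The function names below are hypothetical stand-ins, not the library's implementation, and the version string is passed in explicitly so the logic can be exercised on any machine.

```python
import re


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.6.0+cu124' or '2.5.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def ensure_torch_load_is_safe(torch_version: str, minimum: tuple[int, int] = (2, 6)) -> None:
    """Hypothetical stand-in for the guard: refuse to continue on old versions."""
    # Tuple comparison avoids the string-comparison trap where "2.10" < "2.6".
    if parse_major_minor(torch_version) < minimum:
        raise ValueError(
            f"torch {torch_version} is older than {minimum[0]}.{minimum[1]}; "
            "refusing to call torch.load on a pickle-based checkpoint."
        )


ensure_torch_load_is_safe("2.6.0+cu124")  # passes silently
try:
    ensure_torch_load_is_safe("2.5.1")
except ValueError as error:
    print(f"blocked: {error}")
```

In real code the argument would typically come from `torch.__version__`; comparing parsed integers rather than raw strings keeps releases such as 2.10 from being misclassified.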

## Security and Performance Implications

The divergence between these two model weight loading pathways has significant operational consequences for ML pipelines.

### CVE-2025-32434 and torch.load Restrictions

The requirement for PyTorch 2.6+ when loading `.bin` files stems from CVE-2025-32434, a deserialization vulnerability in `torch.load` that `weights_only=True` alone does not prevent on PyTorch 2.5.1 and earlier. The `check_torch_load_is_safe()` guard in [`src/transformers/utils/import_utils.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/import_utils.py) ensures that users cannot accidentally execute malicious code embedded in legacy checkpoint files on vulnerable PyTorch versions.

Safetensors checkpoints are immune to this vulnerability because they bypass Python's pickle mechanism entirely, using a custom binary format that only stores tensor shapes, dtypes, and raw byte buffers.

### Memory Efficiency Benefits

Safetensors provides **memory-mapped file loading**, allowing the operating system to load tensor pages on demand rather than copying the entire checkpoint into RAM before transferring to GPU. This reduces peak memory consumption during model weight loading, particularly for large models like LLMs where checkpoints may exceed 100 GB.

The legacy `.bin` format requires full deserialization into Python objects before the state dict can be applied to the model, consuming additional memory and CPU cycles during unpickling.
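The on-demand behavior of memory mapping can be illustrated with Python's standard `mmap` module. This is a generic operating-system-level sketch, not the safetensors implementation: it writes a flat file of float32 values and then reads a single value through a memory map without loading the rest of the file into a Python object.

```python
import mmap
import os
import struct
import tempfile

# Write 200,000 little-endian float32 values (about 800 KB) to a temporary file.
count = 200_000
with tempfile.NamedTemporaryFile(delete=False, suffix=".raw") as handle:
    handle.write(struct.pack(f"<{count}f", *(float(i) for i in range(count))))
    path = handle.name

index = 123_456
with open(path, "rb") as handle:
    with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
        # Slicing the map touches only the pages that back these 4 bytes;
        # the OS decides what to read from disk.
        start = index * 4
        (value,) = struct.unpack("<f", view[start : start + 4])

os.remove(path)
print(value)
```

A pickle-based loader, by contrast, has to read and reconstruct the whole object graph before any single tensor becomes available.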

## Practical Code Examples

### Loading with Default Safetensors Preference

```python
from transformers import AutoModel

# Automatically selects model.safetensors if present
model = AutoModel.from_pretrained("meta-llama/Llama-2-7b-hf")
```

When executing this code, Transformers checks for `SAFE_WEIGHTS_NAME` (`"model.safetensors"`) in the cache directory and invokes `safetensors.torch.load_file()` if found.

### Forcing Legacy .bin Format

```python
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "bert-base-uncased",
    use_safetensors=False,  # Bypasses safetensors, forces torch.load on .bin
)
```

This skips any safetensors files and loads the `.bin` checkpoint through `torch.load`, which requires PyTorch 2.6+ to pass the `check_torch_load_is_safe()` validation.

### Manual Safetensors Loading

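```python
import safetensors.torch
from transformers import AutoConfig, AutoModel

# Direct file loading without from_pretrained.
# "path/to/checkpoint" is a placeholder for a local directory that holds
# config.json and model.safetensors.
checkpoint_dir = "path/to/checkpoint"
state_dict = safetensors.torch.load_file(f"{checkpoint_dir}/model.safetensors", device="cpu")

# Build an uninitialized model from the config, then apply the weights.
config = AutoConfig.from_pretrained(checkpoint_dir)
model = AutoModel.from_config(config)
model.load_state_dict(state_dict, strict=False)
```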

This demonstrates the zero-copy loading mechanism that underlies the automatic pipeline.

### Safe Legacy Loading with Version Check

```python
import torch
from transformers.utils.import_utils import check_torch_load_is_safe

# Explicit safety validation
check_torch_load_is_safe()  # Raises ValueError if torch < 2.6

state_dict = torch.load(
    "pytorch_model.bin",
    map_location="cpu",
    weights_only=True,
)
```

This mirrors the internal safety logic that protects against CVE‑2025‑32434.

## Summary

- **Default Behavior**: Transformers automatically prefers **Safetensors** (`.safetensors`) for model weight loading, using `safetensors.torch.load_file()` for zero-copy, memory-mapped access.
- **Security Model**: Safetensors eliminates pickle deserialization vulnerabilities entirely, while legacy `.bin` files require **PyTorch 2.6+** and `weights_only=True` to mitigate CVE‑2025‑32434 via `check_torch_load_is_safe()`.
- **File Detection**: The library checks for `SAFE_WEIGHTS_NAME` constants in [`src/transformers/utils/__init__.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/__init__.py) and implements the selection logic in [`src/transformers/trainer.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py) and [`src/transformers/core_model_loading.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/core_model_loading.py).
- **Backward Compatibility**: Passing `use_safetensors=False` to `from_pretrained()` (or `prefer_safe=False` to `load_sharded_checkpoint()`) forces legacy `.bin` loading, and the library maintains support for both formats in sharded and non-sharded checkpoints.

## Frequently Asked Questions

### What is the default format for model weight loading in Transformers?

The default format is **Safetensors** (`.safetensors`). When you call `from_pretrained()`, the library looks for `model.safetensors` first based on the `SAFE_WEIGHTS_NAME` constant defined in [`src/transformers/utils/__init__.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/__init__.py). If present, it loads via `safetensors.torch.load_file()`; otherwise, it falls back to `pytorch_model.bin` with safety checks.

### Why does Transformers require PyTorch 2.6 for .bin files?

Transformers enforces PyTorch 2.6 or newer when loading legacy `.bin` checkpoints to protect against **CVE-2025-32434**, a critical vulnerability in `torch.load` that allows code execution on PyTorch 2.5.1 and earlier even with `weights_only=True`. The `check_torch_load_is_safe()` function in [`src/transformers/utils/import_utils.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/import_utils.py) raises a `ValueError` if the installed version is older, so `weights_only=True` is only relied on where it is effective.

### Can I convert existing .bin checkpoints to safetensors?

Yes. The Transformers library includes conversion utilities in [`src/transformers/safetensors_conversion.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/safetensors_conversion.py) that can transform legacy `.bin` checkpoints into the Safetensors format. Additionally, when you upload a model to the Hugging Face Hub, the platform often automatically converts `.bin` files to `.safetensors` variants if the safe files are missing, making the secure format available for future downloads.

### How does sharding work with safetensors vs .bin formats?

Both formats support sharded checkpoints through index files. For Safetensors, the library looks for `model.safetensors.index.json` (defined as `SAFE_WEIGHTS_INDEX_NAME` in [`src/transformers/utils/__init__.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/utils/__init__.py)), while legacy checkpoints use `pytorch_model.bin.index.json`. Each index file maps parameter names to the shard file that stores them. The loading logic in [`src/transformers/trainer.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer.py) and [`src/transformers/core_model_loading.py`](https://github.com/huggingface/transformers/blob/main/src/transformers/core_model_loading.py) handles sharded loading transparently: for the safe format, each shard file is read with `safetensors.torch.load_file`, while sharded `.bin` loading uses the `load_sharded_checkpoint` utility with the same PyTorch 2.6 safety requirements.
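As a rough illustration of how an index file drives sharded loading, the sketch below parses an index document and groups parameter names by shard, so that each shard file would need to be opened only once. The parameter names, shard filenames, and the `group_params_by_shard` helper are made up for the example; only the `metadata` and `weight_map` keys reflect the index layout.

```python
import json
from collections import defaultdict

# Illustrative index content; parameter and shard names are invented.
index_json = """
{
  "metadata": {"total_size": 48},
  "weight_map": {
    "embed.weight": "model-00001-of-00002.safetensors",
    "layer.0.weight": "model-00001-of-00002.safetensors",
    "lm_head.weight": "model-00002-of-00002.safetensors"
  }
}
"""


def group_params_by_shard(index: dict) -> dict[str, list[str]]:
    """Hypothetical helper: invert weight_map so each shard lists its parameters."""
    shards = defaultdict(list)
    for param_name, shard_file in index["weight_map"].items():
        shards[shard_file].append(param_name)
    return dict(shards)


shards = group_params_by_shard(json.loads(index_json))
for shard_file, params in sorted(shards.items()):
    # A loader would open shard_file once here and copy out these parameters.
    print(shard_file, params)
```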