How to Create a Custom Model Architecture that Integrates with AutoModel: A Complete Guide

To integrate a custom model with Hugging Face Transformers' AutoModel ecosystem, you must define a PreTrainedConfig subclass with a unique model_type, implement a PreTrainedModel subclass exposing config_class, register both with AutoConfig.register and AutoModel.register, and use trust_remote_code=True when loading from remote repositories.

The Hugging Face Transformers library provides the AutoClass API (including AutoModel, AutoModelForImageClassification, and other task-specific variants) to dynamically instantiate model architectures from configuration objects. When you create a custom model architecture that integrates with AutoModel, users can load your model through the standard from_pretrained workflow without modifying the library source code, and the model can then be used with the Trainer API and shared on the Hub.

Step 1: Define a Custom Configuration Class

Every model in the Transformers ecosystem requires a configuration object that specifies hyperparameters and architecture metadata. You must subclass PreTrainedConfig and assign a unique string to the model_type class attribute. This identifier is the key that AutoConfig uses to locate your configuration class, which the Auto model classes in turn map to your model class.

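According to the official documentation in docs/source/en/custom_models.md, the configuration's __init__ must accept **kwargs and pass them to super().__init__(**kwargs) so that parent fields like name_or_path and transformers_version are preserved. The model_type value must not collide with an existing architecture (e.g., "bert" or "gpt2"). Note that "resnet", the value used in the example below, is itself a built-in model type in recent Transformers releases, so AutoConfig.register rejects it unless you pick a different identifier (such as "custom-resnet") or pass exist_ok=True.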


# custom_resnet/configuration_resnet.py

from transformers import PreTrainedConfig

class ResnetConfig(PreTrainedConfig):
    """Configuration for a custom ResNet model."""
    model_type = "resnet"  # Key AutoConfig maps to this class; must not clash with a built-in type

    def __init__(
        self,
        block_type="bottleneck",
        layers=[3, 4, 6, 3],
        num_classes=1000,
        input_channels=3,
        cardinality=1,
        base_width=64,
        stem_width=64,
        stem_type="",
        avg_down=False,
        **kwargs,
    ):
        if block_type not in ["basic", "bottleneck"]:
            raise ValueError("`block_type` must be 'basic' or 'bottleneck'")
        
        self.block_type = block_type
        self.layers = layers
        self.num_classes = num_classes
        self.input_channels = input_channels
        self.cardinality = cardinality
        self.base_width = base_width
        self.stem_width = stem_width
        self.stem_type = stem_type
        self.avg_down = avg_down
        super().__init__(**kwargs)  # Preserve parent configuration fields
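
As a quick check that a configuration like the one above behaves as intended, the following sketch defines a trimmed-down stand-in and round-trips it through save_pretrained and from_pretrained. DemoResnetConfig and the model_type "demo-custom-resnet" are hypothetical names chosen for illustration, not part of the package described here. The import fallback is an assumption about version differences: older Transformers releases export the base class only as PretrainedConfig.

```python
import tempfile

try:
    from transformers import PreTrainedConfig
except ImportError:  # assumption: older releases only export the original spelling
    from transformers import PretrainedConfig as PreTrainedConfig


class DemoResnetConfig(PreTrainedConfig):
    """Hypothetical, trimmed-down config used only for this illustration."""

    model_type = "demo-custom-resnet"  # hypothetical name, chosen to avoid built-in types

    def __init__(self, block_type="bottleneck", layers=None, num_classes=1000, **kwargs):
        if block_type not in ("basic", "bottleneck"):
            raise ValueError("`block_type` must be 'basic' or 'bottleneck'")
        self.block_type = block_type
        # Copy the list so instances never share a mutable default.
        self.layers = list(layers) if layers is not None else [3, 4, 6, 3]
        self.num_classes = num_classes
        super().__init__(**kwargs)  # keeps parent fields such as name_or_path


config = DemoResnetConfig(block_type="basic", layers=[2, 2, 2, 2], num_classes=10)

with tempfile.TemporaryDirectory() as tmp_dir:
    config.save_pretrained(tmp_dir)  # writes config.json into tmp_dir
    reloaded = DemoResnetConfig.from_pretrained(tmp_dir)

serialized = config.to_dict()  # includes the class-level model_type

# Invalid values are rejected at construction time.
try:
    DemoResnetConfig(block_type="unknown")
    rejected = False
except ValueError:
    rejected = True
```

Validating arguments in __init__ means a malformed config.json fails when the configuration is loaded rather than later, when the model is built.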

Step 2: Implement the Model Class

Your model must inherit from PreTrainedModel and expose the configuration class via the config_class attribute. This attribute binds the model to its configuration, enabling AutoModel to instantiate the correct class when loading from a config file. Task-specific variants, such as a classification head, are ordinary PreTrainedModel subclasses that share the same config_class.

As implemented in the base classes, PreTrainedModel provides critical methods like save_pretrained, from_pretrained, and gradient checkpointing utilities. The following example shows both a backbone model and a task-specific classification head, both referencing ResnetConfig via config_class.


# custom_resnet/modeling_resnet.py

import torch
from transformers import PreTrainedModel
from timm.models.resnet import BasicBlock, Bottleneck, ResNet
from .configuration_resnet import ResnetConfig

BLOCK_MAPPING = {"basic": BasicBlock, "bottleneck": Bottleneck}

class ResnetModel(PreTrainedModel):
    """Backbone that returns hidden features."""
    config_class = ResnetConfig  # Links model to configuration

    def __init__(self, config: ResnetConfig):
        super().__init__(config)
        block = BLOCK_MAPPING[config.block_type]
        self.model = ResNet(
            block,
            config.layers,
            num_classes=config.num_classes,
            in_chans=config.input_channels,
            cardinality=config.cardinality,
            base_width=config.base_width,
            stem_width=config.stem_width,
            stem_type=config.stem_type,
            avg_down=config.avg_down,
        )

    def forward(self, pixel_values):
        return self.model.forward_features(pixel_values)


class ResnetModelForImageClassification(PreTrainedModel):
    """Classification head on top of the backbone."""
    config_class = ResnetConfig

    def __init__(self, config: ResnetConfig):
        super().__init__(config)
        block = BLOCK_MAPPING[config.block_type]
        self.model = ResNet(
            block,
            config.layers,
            num_classes=config.num_classes,
            in_chans=config.input_channels,
            cardinality=config.cardinality,
            base_width=config.base_width,
            stem_width=config.stem_width,
            stem_type=config.stem_type,
            avg_down=config.avg_down,
        )

    def forward(self, pixel_values, labels=None):
        logits = self.model(pixel_values)
        if labels is not None:
            loss = torch.nn.functional.cross_entropy(logits, labels)
            return {"loss": loss, "logits": logits}
        return {"logits": logits}

Step 3: Register with Auto Classes

Registration updates the global _LazyAutoMapping in src/transformers/models/auto/auto_factory.py, allowing AutoModel.from_pretrained to resolve your custom class from the configuration's model_type. You must register the configuration with AutoConfig, then register the model classes with the appropriate AutoModel variants.

The register method in _BaseAutoModelClass (the base for all Auto classes) accepts the config class and model class as arguments, inserting them into the mapping dictionary that from_pretrained consults at runtime.

from transformers import AutoConfig, AutoModel, AutoModelForImageClassification
from custom_resnet.configuration_resnet import ResnetConfig
from custom_resnet.modeling_resnet import ResnetModel, ResnetModelForImageClassification

# Register configuration type (the key must match ResnetConfig.model_type)
AutoConfig.register("resnet", ResnetConfig)

# Register backbone for generic AutoModel
AutoModel.register(ResnetConfig, ResnetModel)

# Register task-specific model
AutoModelForImageClassification.register(ResnetConfig, ResnetModelForImageClassification)
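
To make the registration mechanics concrete, here is a deliberately simplified, pure-Python sketch of a config-to-model registry. It is not the library's implementation: MiniAutoRegistry and the Toy classes are hypothetical, and the real _LazyAutoMapping is lazier and does more bookkeeping. The sketch only illustrates the idea that each Auto class keeps its own mapping from configuration class to model class, so one configuration can resolve to different heads.

```python
class MiniAutoRegistry:
    """Toy stand-in for one Auto class registry (hypothetical, not library code)."""

    def __init__(self, name):
        self.name = name
        self._config_to_model = {}

    def register(self, config_class, model_class, exist_ok=False):
        # Refuse a model whose config_class points at a different configuration.
        if getattr(model_class, "config_class", None) is not config_class:
            raise ValueError(
                f"{model_class.__name__}.config_class does not match {config_class.__name__}"
            )
        # Refuse silent overwrites unless the caller opts in.
        if config_class in self._config_to_model and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered with {self.name}")
        self._config_to_model[config_class] = model_class

    def from_config(self, config):
        # Resolve the model class from the type of the configuration object.
        model_class = self._config_to_model.get(type(config))
        if model_class is None:
            raise ValueError(f"{type(config).__name__} is not registered with {self.name}")
        return model_class(config)


class ToyConfig:
    model_type = "toy"

    def __init__(self, num_classes=3):
        self.num_classes = num_classes


class ToyBackbone:
    config_class = ToyConfig

    def __init__(self, config):
        self.config = config


class ToyClassifier:
    config_class = ToyConfig

    def __init__(self, config):
        self.config = config


# One registry per Auto class; the same config maps to a different model in each.
auto_model = MiniAutoRegistry("AutoModel")
auto_classifier = MiniAutoRegistry("AutoModelForImageClassification")
auto_model.register(ToyConfig, ToyBackbone)
auto_classifier.register(ToyConfig, ToyClassifier)

config = ToyConfig(num_classes=5)
backbone = auto_model.from_config(config)
classifier = auto_classifier.from_config(config)

# Registering a second model for the same config in the same registry is rejected.
try:
    auto_model.register(ToyConfig, ToyClassifier)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```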

Step 4: Save, Load, and Distribute

Once registered, your custom model architecture supports the full Transformers persistence API. When saving, both the configuration (config.json) and model weights (pytorch_model.bin or model.safetensors) are written to the specified directory.

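For models whose code lives on the Hugging Face Hub or another remote repository rather than in an installed package, you must pass trust_remote_code=True to from_pretrained. This flag, checked in _BaseAutoModelClass.from_pretrained within src/transformers/models/auto/auto_factory.py, allows dynamic execution of the custom Python files required to instantiate your architecture. For the Hub copy to carry those files, call register_for_auto_class on the configuration and model classes before saving or pushing, which records them in the auto_map entry of config.json. When the classes are already imported and registered in the current process, as in Step 3, the flag is not needed for local loading.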


# Save locally

config = ResnetConfig()
config.save_pretrained("custom_resnet")
model = ResnetModelForImageClassification(config)
model.save_pretrained("custom_resnet")

# Load from local directory

local_model = AutoModelForImageClassification.from_pretrained(
    "custom_resnet", trust_remote_code=True
)

# Load from the Hub after pushing

hub_model = AutoModelForImageClassification.from_pretrained(
    "username/custom-resnet", trust_remote_code=True
)
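
The save-and-reload cycle can be tried end to end without timm or a Hub account. The sketch below is a minimal illustration under stated assumptions: TinyDemoConfig, TinyDemoModel, and the model_type "tiny-demo-linear" are hypothetical stand-ins for the ResNet classes, and the import fallback assumes older releases export the base config only as PretrainedConfig. Because the classes are registered in the running process, AutoModel.from_pretrained resolves them from the saved config.json without trust_remote_code.

```python
import tempfile

import torch
from transformers import AutoConfig, AutoModel, PreTrainedModel

try:
    from transformers import PreTrainedConfig
except ImportError:  # assumption: older releases only export the original spelling
    from transformers import PretrainedConfig as PreTrainedConfig


class TinyDemoConfig(PreTrainedConfig):
    model_type = "tiny-demo-linear"  # hypothetical, chosen to avoid built-in types

    def __init__(self, hidden_size=4, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)


class TinyDemoModel(PreTrainedModel):
    config_class = TinyDemoConfig  # links the model to its configuration

    def __init__(self, config):
        super().__init__(config)
        self.linear = torch.nn.Linear(config.hidden_size, config.hidden_size)
        self.post_init()

    def forward(self, inputs):
        return self.linear(inputs)


# Register in the current process so the Auto classes can resolve the model_type.
AutoConfig.register("tiny-demo-linear", TinyDemoConfig)
AutoModel.register(TinyDemoConfig, TinyDemoModel)

original = TinyDemoModel(TinyDemoConfig(hidden_size=4))

with tempfile.TemporaryDirectory() as tmp_dir:
    original.save_pretrained(tmp_dir)  # writes config.json and the weights file
    reloaded = AutoModel.from_pretrained(tmp_dir)  # resolved via the registration above
    weights_match = torch.equal(original.linear.weight, reloaded.linear.weight)

reloaded_class_name = type(reloaded).__name__
```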

Summary

  • Custom Configuration: Subclass PreTrainedConfig, set a unique model_type, and call super().__init__(**kwargs) to maintain compatibility with the Transformers ecosystem.
  • Model Implementation: Inherit from PreTrainedModel, assign your config class to config_class, and implement the forward pass.
  • Registration: Use AutoConfig.register and AutoModel.register (or task-specific variants) to insert your classes into the global _LazyAutoMapping in src/transformers/models/auto/auto_factory.py.
  • Remote Execution: Always specify trust_remote_code=True when loading custom models from the Hub to enable dynamic code execution.

Frequently Asked Questions

What happens if I don't specify a unique model_type?

If the model_type in your configuration matches an existing model (e.g., "bert" or "gpt2"), AutoConfig.register raises a ValueError stating that the name is already used by a Transformers config, unless you pass exist_ok=True. If you skip registration and only ship a config.json with the conflicting model_type, AutoConfig resolves to the built-in class for that type, so AutoModel.from_pretrained instantiates the wrong architecture or fails when the saved parameters and weights don't match it.
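
One way to check a candidate identifier before registering is to look it up in the library's config mapping. This is a small sketch, assuming CONFIG_MAPPING remains importable from transformers.models.auto.configuration_auto; the helper is_model_type_taken and the name "my-org-custom-resnet" are hypothetical.

```python
from transformers.models.auto.configuration_auto import CONFIG_MAPPING


def is_model_type_taken(model_type: str) -> bool:
    """Return True if `model_type` is already a key in the Auto config mapping."""
    return model_type in CONFIG_MAPPING


# Built-in identifiers are taken; a namespaced custom identifier should be free.
taken = {
    name: is_model_type_taken(name)
    for name in ("bert", "gpt2", "resnet", "my-org-custom-resnet")
}
```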

Why is trust_remote_code=True mandatory for custom models?

The trust_remote_code=True flag is required because AutoModel.from_pretrained must execute arbitrary Python code from your repository (specifically your modeling_*.py and configuration_*.py files) to instantiate classes that don't exist in the core library. As noted in the _BaseAutoModelClass.from_pretrained implementation in src/transformers/models/auto/auto_factory.py, this security gate prevents silent execution of untrusted code.
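
For a Hub repository to ship the code that trust_remote_code=True later executes, the classes are typically tagged with register_for_auto_class before saving or pushing. The sketch below shows only the configuration side, so it runs without PyTorch. HubDemoConfig and its model_type are hypothetical, and reading the private _auto_class attribute is done purely to make the effect visible; it is an internal detail that may change between releases.

```python
try:
    from transformers import PreTrainedConfig
except ImportError:  # assumption: older releases only export the original spelling
    from transformers import PretrainedConfig as PreTrainedConfig


class HubDemoConfig(PreTrainedConfig):
    """Hypothetical config used only to illustrate the tagging step."""

    model_type = "hub-demo-resnet"  # hypothetical identifier


# Tag the class so that saving or pushing records it under the "AutoConfig" key.
HubDemoConfig.register_for_auto_class()

# Internal attribute, read here only to show what the call changed.
registered_under = HubDemoConfig._auto_class

# The model side is analogous (requires PyTorch), for example:
#   ResnetModel.register_for_auto_class("AutoModel")
#   ResnetModelForImageClassification.register_for_auto_class("AutoModelForImageClassification")
```

The source files are only copied alongside the weights when the classes are defined in importable modules, such as configuration_resnet.py and modeling_resnet.py, rather than in a notebook cell or the __main__ script.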

Can I register multiple task-specific heads for the same architecture?

Yes. You can register one backbone class with AutoModel and multiple task-specific classes (e.g., AutoModelForImageClassification, AutoModelForSemanticSegmentation) with the same ResnetConfig. Each registration maps the config to a different model class within the respective Auto class's registry, allowing users to load the appropriate head for their task.

Where does the registration logic update the internal mappings?

The registration logic lives in src/transformers/models/auto/auto_factory.py within the _BaseAutoModelClass.register method. This method updates the _LazyAutoMapping dictionary that from_pretrained and from_config consult to resolve model_type strings to Python classes, effectively making your custom model a first-class citizen of the Auto ecosystem.
