How Pyrefly Handles Dataclass Transformations and Field Metadata

Pyrefly treats classes decorated with @dataclass or @dataclass_transform as dataclasses by parsing decorator arguments into structured metadata, synthesizing methods like __init__ and __eq__, and enforcing constraints on field descriptors and frozen inheritance.

The facebook/pyrefly type checker implements dataclass transformations through a pipeline that converts Python decorator syntax into type-level representations. The pipeline covers both transform-level defaults and per-field metadata extraction, so that synthesized class members match Python's specification and Pyright's behavior.

Parsing @dataclass and @dataclass_transform Arguments

Pyrefly begins by extracting configuration from decorators, distinguishing between transform-level defaults and concrete class-level options.

Transform-Level Defaults

When processing @dataclass_transform, Pyrefly stores default values in the DataclassTransformMetadata struct defined in crates/pyrefly_types/src/keywords.rs:

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[derive(Visit, VisitMut, TypeEq)]
pub struct DataclassTransformMetadata {
    pub eq_default: bool,
    pub order_default: bool,
    pub kw_only_default: bool,
    pub frozen_default: bool,
    pub field_specifiers: Vec<CalleeKind>,
}

Source: crates/pyrefly_types/src/keywords.rs#L62-L71

The from_type_map constructor reads these values from the decorator's keyword arguments, applying Python specification defaults (e.g., eq=True when omitted) as implemented in lines 81-94 of the same file.
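
As a rough illustration of how omitted keyword arguments fall back to specification defaults, the following Python sketch models the idea. TransformDefaults and from_keywords are hypothetical names chosen for this example and are not part of Pyrefly's Rust code; the default values are the ones PEP 681 specifies for dataclass_transform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformDefaults:
    """Hypothetical Python mirror of the transform-level defaults."""
    eq_default: bool = True        # PEP 681: eq_default is True when omitted
    order_default: bool = False
    kw_only_default: bool = False
    frozen_default: bool = False
    field_specifiers: tuple = ()


def from_keywords(keywords: dict) -> TransformDefaults:
    """Build defaults from decorator keywords, ignoring unknown keys."""
    known = TransformDefaults.__dataclass_fields__.keys()
    return TransformDefaults(**{k: v for k, v in keywords.items() if k in known})


defaults = from_keywords({"kw_only_default": True})
print(defaults.eq_default, defaults.kw_only_default)  # True True
```

Only kw_only_default was passed, so eq_default keeps its specification default of True.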

Class-Level Options

The concrete options for a specific class are stored in DataclassKeywords:

pub struct DataclassKeywords {
    pub init: bool,
    pub order: bool,
    pub frozen: bool,
    pub match_args: bool,
    pub kw_only: bool,
    pub eq: bool,
    pub unsafe_hash: bool,
    pub slots: bool,
    pub extra: bool,
    pub strict: bool,
}

Source: crates/pyrefly_types/src/keywords.rs#L91-L102

DataclassKeywords::from_type_map merges raw decorator arguments with DataclassTransformMetadata defaults, yielding the final flags that drive synthesis (lines 22-49).
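
The merge can be pictured with a small Python model. This is a sketch under stated assumptions: resolve_class_options is a hypothetical helper, the fallback values come from the standard library's @dataclass signature, and the extra and strict flags from the struct above are left out because the text does not describe their defaults.

```python
def resolve_class_options(explicit: dict, transform_defaults: dict) -> dict:
    """Hypothetical merge: explicit decorator arguments win over transform defaults."""
    # Options a transform can supply a default for, mapped to the
    # corresponding dataclass_transform parameter name.
    transform_backed = {
        "eq": "eq_default",
        "order": "order_default",
        "kw_only": "kw_only_default",
        "frozen": "frozen_default",
    }
    # Fallbacks taken from the standard library's @dataclass signature.
    builtin = {
        "init": True, "eq": True, "order": False, "frozen": False,
        "match_args": True, "kw_only": False, "unsafe_hash": False,
        "slots": False,
    }
    resolved = {}
    for option, fallback in builtin.items():
        default_name = transform_backed.get(option)
        if option in explicit:
            resolved[option] = explicit[option]
        elif default_name in transform_defaults:
            resolved[option] = transform_defaults[default_name]
        else:
            resolved[option] = fallback
    return resolved


options = resolve_class_options(
    {"frozen": False},
    {"frozen_default": True, "kw_only_default": True},
)
print(options["frozen"], options["kw_only"], options["eq"])  # False True True
```

The explicit frozen=False overrides the transform's frozen_default=True, kw_only is taken from the transform, and eq falls back to the built-in default.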

Extracting Per-Field Metadata

Each dataclass field carries its own configuration for initialization, defaults, and converters.

Field Keyword Structure

Pyrefly represents field parameters using DataclassFieldKeywords:

pub struct DataclassFieldKeywords {
    pub init: bool,
    pub default: Option<Type>,
    pub kw_only: Option<bool>,
    pub init_by_name: bool,
    pub init_by_alias: Option<Name>,
    pub lt: Option<Type>,
    pub gt: Option<Type>,
    pub ge: Option<Type>,
    pub le: Option<Type>,
    pub strict: Option<bool>,
    pub converter_param: Option<Type>,
}

Source: crates/pyrefly_types/src/keywords.rs#L33-L55

Resolution from Field Specifiers

The AnswersSolver::dataclass_field_keywords function in pyrefly/lib/alt/class/dataclass.rs builds this struct for each field:

pub fn dataclass_field_keywords(
    &self,
    func: &Type,
    args: &Arguments,
    dataclass_metadata: &DataclassMetadata,
    errors: &ErrorCollector,
) -> DataclassFieldKeywords {
    // Builds TypeMap from call-site arguments
    // Fills missing values from field-specifier signature...
}

Source: alt/class/dataclass.rs#L61-L78

When a field uses a callable specifier (like dataclasses.field or custom classes from field_specifiers), Pyrefly resolves the specifier's __init__ signature to read default values. This logic in fill_in_field_keywords_from_function_signature (lines 442-494) mirrors Pyright's PEP 681 implementation.
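
A runtime analogue of this fallback can be written with inspect.signature. The sketch below is only an approximation of the idea: Pyrefly performs the lookup statically on types, and my_field and fill_from_signature are hypothetical names introduced for illustration.

```python
import inspect


def my_field(*, default=None, init: bool = False, kw_only: bool = True):
    """Hypothetical field specifier whose signature carries its own defaults."""
    return default


def fill_from_signature(specifier, call_kwargs: dict, keys=("init", "kw_only")) -> dict:
    """Use call-site values where given, else the specifier's declared defaults."""
    parameters = inspect.signature(specifier).parameters
    resolved = {}
    for key in keys:
        if key in call_kwargs:
            resolved[key] = call_kwargs[key]
        elif key in parameters and parameters[key].default is not inspect.Parameter.empty:
            resolved[key] = parameters[key].default
    return resolved


print(fill_from_signature(my_field, {"init": True}))  # {'init': True, 'kw_only': True}
```

Here init comes from the call site, while kw_only is read from the specifier's signature because the call omitted it.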

Synthesizing Dataclass Members

With metadata collected, Pyrefly generates synthetic class members while enforcing type safety constraints.

__init__ Generation

The AnswersSolver::get_dataclass_init method constructs the initialization signature by iterating over fields and respecting:

  • init flags – omitting fields marked init=False
  • kw_only settings – determining positional vs keyword-only parameters
  • Default values – making parameters optional when defaults exist, while validating that non-default parameters do not follow defaults (lines 889-914)
  • strict mode – applying converter tables when enabled (lines 923-933)

Source: alt/class/dataclass.rs#L779-L845

Additional Synthetic Members

Pyrefly conditionally generates other members based on configuration flags:

  • __match_args__ – built by get_dataclass_match_args (lines 972-996)
  • __slots__ – constructed in get_dataclass_slots (lines 998-1010)
  • Rich comparisons – __eq__, __lt__, etc., produced by get_dataclass_rich_comparison_methods (lines 1012-1042)
  • __hash__ – generated based on unsafe_hash, eq, and frozen interactions (lines 1044-1052)

These become ClassSynthesizedField entries added to the class's type record via get_dataclass_synthesized_fields.

Frozen Inheritance Validation

Dataclasses must not mix frozen and non-frozen bases. The validate_frozen_dataclass_inheritance function (lines 776-824) walks the inheritance chain and emits InvalidInheritance errors for violations, with one exception: a frozen subclass of a non-frozen base is allowed when the subclass originates from a @dataclass_transform, that is, when the is_from_dataclass_transform flag is set.
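
The standard library enforces the same basic rule at class-creation time, which gives a quick way to see what the static check guards against. This sketch uses only dataclasses; try_define is a hypothetical helper, and the dataclass_transform exception described above has no runtime counterpart here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenBase:
    x: int


@dataclass
class MutableBase:
    x: int


def try_define(base, *, frozen: bool):
    """Try to subclass `base` as a dataclass; return the error message or None."""
    try:
        @dataclass(frozen=frozen)
        class Child(base):
            y: int = 0
    except TypeError as exc:
        return str(exc)
    return None


print(try_define(FrozenBase, frozen=False))   # an error message
print(try_define(MutableBase, frozen=True))   # an error message
print(try_define(FrozenBase, frozen=True))    # None
```

Both mixed combinations raise TypeError at runtime, while matching frozen settings succeed.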

Descriptor Safety Checks

For descriptor fields, Pyrefly enforces compatibility constraints:

  • Non-data descriptors are disallowed unless __get__ returns Self (lines 254-267)
  • Data descriptors require that __get__ return types are assignable to __set__ value types (lines 286-342)

These checks prevent unsound shadowing when the dataclass-generated __init__ interacts with descriptor protocols.

Source: alt/class/dataclass.rs#L254-L342
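
To make the data-descriptor case concrete, here is a sketch that follows the descriptor-typed field pattern from the standard library's dataclasses documentation. The type annotations are added for illustration: __get__ returns int, which is assignable to the float accepted by __set__, so it is the kind of pairing the second rule describes. Whether Pyrefly accepts this exact class is an assumption, not something the source confirms.

```python
from dataclasses import dataclass


class IntConversionDescriptor:
    """Data descriptor that stores every assigned value as an int."""

    def __init__(self, *, default: int):
        self._default = default

    def __set_name__(self, owner, name):
        self._name = "_" + name

    def __get__(self, obj, objtype=None) -> int:
        if obj is None:
            # Class-level access: dataclasses reads the field default this way.
            return self._default
        return getattr(obj, self._name, self._default)

    def __set__(self, obj, value: float) -> None:
        setattr(obj, self._name, int(value))


@dataclass
class InventoryItem:
    quantity_on_hand: IntConversionDescriptor = IntConversionDescriptor(default=100)


item = InventoryItem()
print(item.quantity_on_hand)  # 100
item.quantity_on_hand = 2.5   # routed through __set__
print(item.quantity_on_hand)  # 2
```

The generated __init__ assigns through __set__, which is why the types flowing into and out of the descriptor need to agree.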

Complete Workflow Example

Consider this Python code utilizing custom transforms:

from dataclasses import field
from typing import dataclass_transform

@dataclass_transform(eq_default=True, frozen_default=False, field_specifiers=(field,))
def my_transform(cls):
    return cls

@my_transform
class Point:
    x: int
    y: int = field(default=0, init=False)
    label: str = "origin"

Pyrefly processes this through three phases:

  1. Decorator Analysis – Extracts eq_default=True and frozen_default=False from my_transform

  2. Field Collection

    • x receives init=True, no default
    • y receives init=False from the field specifier
    • label receives init=True, default "origin"
  3. Member Synthesis – Generates:

    def __init__(self, x: int, label: str = "origin") -> None: ...

    Pyrefly also generates __eq__ and __repr__, leaves y out of the __init__ parameters, and skips __slots__ because the slots flag is false.
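
The synthesized signature can be cross-checked at runtime. In the sketch below the standard @dataclass decorator stands in for my_transform (an assumption made so that the class is actually transformed when executed; a decorator that simply returns cls only affects static analysis).

```python
import inspect
from dataclasses import dataclass, field


@dataclass  # stand-in for @my_transform, so the effect is visible at runtime
class Point:
    x: int
    y: int = field(default=0, init=False)
    label: str = "origin"


init_params = list(inspect.signature(Point.__init__).parameters)
print(init_params)  # ['self', 'x', 'label']

point = Point(1)
print((point.x, point.y, point.label))  # (1, 0, 'origin')
```

The parameter list omits y, yet the attribute still receives its default of 0 on the instance.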

Summary

  • Transform metadata (DataclassTransformMetadata) provides default values for decorator arguments like eq and frozen
  • Class options (DataclassKeywords) merge transform defaults with explicit decorator arguments
  • Field metadata (DataclassFieldKeywords) captures per-field configuration, filling defaults from field-specifier signatures when necessary
  • Synthetic members are generated via get_dataclass_init and related functions, respecting initialization flags, default ordering, and keyword-only settings
  • Safety validations ensure descriptor compatibility and prevent invalid frozen/non-frozen inheritance chains

Frequently Asked Questions

What is the difference between DataclassTransformMetadata and DataclassKeywords?

DataclassTransformMetadata stores defaults provided by @dataclass_transform decorators (like eq_default or field_specifiers), while DataclassKeywords holds the final resolved options for a specific class after merging transform defaults with explicit @dataclass arguments. The former lives in pyrefly_types/src/keywords.rs and configures transform behavior; the latter represents the concrete flags used during member synthesis.

How does Pyrefly handle field defaults from custom field specifiers?

When a field uses a custom specifier callable, Pyrefly resolves its __init__ signature via fill_in_field_keywords_from_function_signature in alt/class/dataclass.rs. This extracts default values for parameters like init, kw_only, or converter from the specifier's definition, applying them when the field declaration omits explicit values.

What validation does Pyrefly perform on dataclass descriptors?

Pyrefly enforces two key constraints in alt/class/dataclass.rs: non-data descriptors must have __get__ return Self, and data descriptor defaults must have compatible __get__ return types and __set__ value types. These validations prevent type unsoundness where the synthesized __init__ might incorrectly override descriptor behavior.

Can frozen and non-frozen dataclasses inherit from each other in Pyrefly?

Generally no. Pyrefly raises an InvalidInheritance error when a frozen dataclass inherits from a non-frozen one or vice versa. However, validate_frozen_dataclass_inheritance permits frozen subclasses of non-frozen bases specifically when the subclass originates from a @dataclass_transform that sets frozen_default=True.
