# How the 16MB Model Artifact Size Limit Is Enforced in OpenAI Parameter Golf

> Discover how OpenAI enforces the 16MB model artifact size limit by calculating combined compressed weights and source code against a hard cap.

- Repository: [OpenAI/parameter-golf](https://github.com/openai/parameter-golf)
- Tags: internals
- Published: 2026-04-17

---

**The 16MB model artifact size limit is enforced by calculating the combined byte size of the compressed, quantized model weights and the accompanying source code, then comparing this total against a hard 16,000,000-byte cap defined in [`train_gpt.py`](https://github.com/openai/parameter-golf/blob/main/train_gpt.py).**

The openai/parameter-golf repository implements a strict 16MB model artifact size limit to ensure fair competition in parameter-efficient training challenges. This limit applies to the final submission bundle, which includes both the serialized model weights and the source code used for training. Understanding exactly how this limit is calculated and where it is enforced is critical for participants aiming to optimize their model architecture and quantization strategy.

## Where the 16MB Limit Is Defined

The competition rules explicitly document the 16,000,000-byte constraint in the repository’s README. According to line 178 of [`README.md`](https://github.com/openai/parameter-golf/blob/main/README.md), the cap is defined as **16,000,000 bytes** (decimal), not 16 MiB (which would be 16,777,216 bytes).
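The gap between the two definitions is easy to verify with a few lines of arithmetic. The sketch below is illustrative only; the constant names are not taken from the repository.

```python
# Illustrative arithmetic only; these constant names are not from the repository.
CAP_DECIMAL_BYTES = 16_000_000      # 16 MB in decimal units (the competition cap)
CAP_BINARY_BYTES = 16 * 1024 ** 2   # 16 MiB in binary units

difference = CAP_BINARY_BYTES - CAP_DECIMAL_BYTES

print(CAP_BINARY_BYTES)  # 16777216
print(difference)        # 777216
```

Sizing a model against 16 MiB would therefore overshoot the actual cap by 777,216 bytes.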

The actual enforcement logic resides in the primary training script [`train_gpt.py`](https://github.com/openai/parameter-golf/blob/main/train_gpt.py) at lines 1243–1249. This is where the final artifact size is computed and validated against the limit before submission.

## How Artifact Size Is Calculated

The total artifact size is the sum of two components: the compressed model weights and the UTF-8 encoded source code. The weights pass through a three-step pipeline before they are measured: quantize, serialize, compress.

### Model Quantization and Serialization

Before the size is calculated, the model’s state dictionary is quantized to shrink its serialized size. The repository uses `quantize_state_dict_int6()` (or int8 variants) to convert floating-point weights into low-precision integers.

The quantized state is then serialized into an in-memory `io.BytesIO` buffer using `torch.save()`:

```python
state = base_model.state_dict()
quant_obj, quant_stats = quantize_state_dict_int6(state)
quant_buf = io.BytesIO()
torch.save(quant_obj, quant_buf)
```
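To make the effect of quantization concrete, here is a minimal sketch of symmetric 6-bit quantization on a plain list of floats. It is not the repository's `quantize_state_dict_int6()`; the function names, the single shared scale, and the [-31, 31] integer range are assumptions chosen for illustration.

```python
def quantize_int6_sketch(values):
    """Map floats to integers in [-31, 31] using one shared scale (hypothetical helper)."""
    qmax = 31  # largest magnitude in a symmetric signed 6-bit range
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_sketch(quantized, scale):
    """Recover approximate floats from the quantized integers."""
    return [q * scale for q in quantized]


weights = [0.62, -0.30, 0.045, -0.62, 0.0]
q, scale = quantize_int6_sketch(weights)
restored = dequantize_sketch(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)  # [31, -15, 2, -31, 0]
print(max_error <= scale / 2 + 1e-12)  # True
```

Each integer fits in six bits. Whether the real helper bit-packs the values or stores one per byte before compression is a detail this sketch does not model.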

### Compression Strategy

The raw bytes from the serialization buffer are compressed using `compress_bytes()`, which implements a fallback strategy:

1. **Primary**: **zstd-22** (Zstandard compression level 22) if the zstandard library is available
2. **Fallback**: **zlib-9** (DEFLATE compression level 9) if zstd is unavailable

```python
quant_blob = compress_bytes(quant_buf.getvalue())
```

The resulting `quant_blob` represents the compressed model weights that contribute to the final size calculation.
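The fallback described above can be sketched as follows. This is one plausible shape for such a helper, not the repository's `compress_bytes()`; the name `compress_bytes_sketch` and the returned method label are assumptions added so the chosen codec is visible.

```python
import zlib

try:
    import zstandard  # optional third-party dependency
except ImportError:
    zstandard = None


def compress_bytes_sketch(data: bytes) -> tuple[bytes, str]:
    """Compress with zstd level 22 when available, else zlib level 9 (hypothetical helper)."""
    if zstandard is not None:
        return zstandard.ZstdCompressor(level=22).compress(data), "zstd-22"
    return zlib.compress(data, 9), "zlib-9"


payload = bytes(range(64)) * 1024  # repetitive stand-in for serialized weights
blob, method = compress_bytes_sketch(payload)
print(method, len(payload), "->", len(blob))
```

Because the two codecs produce different output sizes for the same input, it is worth confirming which one ran before comparing headroom figures between machines.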

### Adding Source Code Size

The submission must include the training source code. The system calculates the UTF-8 byte length of the source code string:

```python
code_bytes = len(code.encode("utf-8"))
total_artifact = len(quant_blob) + code_bytes
```

This `total_artifact` value is the definitive metric compared against the 16,000,000-byte limit.
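Measuring encoded bytes rather than characters matters when the source contains non-ASCII text. The short example below uses made-up source lines to show the difference.

```python
# Made-up source lines; only the second contains a non-ASCII character.
ascii_line = "lr = 3e-4  # learning rate"
unicode_line = "lr = 3e-4  # learning rate η"

# Character count versus UTF-8 byte count.
print(len(ascii_line), len(ascii_line.encode("utf-8")))      # 26 26
print(len(unicode_line), len(unicode_line.encode("utf-8")))  # 28 29
```

The Greek letter occupies two bytes in UTF-8, so the byte count that is charged against the cap is larger than the character count.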

## The Enforcement Mechanism in train_gpt.py

The check runs in [`train_gpt.py`](https://github.com/openai/parameter-golf/blob/main/train_gpt.py) (lines 1243–1249) when the master process prepares the final submission. It is a simple numeric comparison:

```python
if total_artifact > 16_000_000:
    log0(f"WARNING: artifact {total_artifact} exceeds 16,000,000 byte cap "
         f"by {total_artifact - 16_000_000} bytes!")
else:
    log0(f"artifact headroom: {16_000_000 - total_artifact} bytes "
         f"({(16_000_000 - total_artifact)/1e6:.3f}MB)")
```

**Critical behavior**: The script **does not abort** if the limit is exceeded. It only logs a warning stating by how many bytes the submission exceeds the cap. Participants must monitor these logs and manually adjust their model architecture, quantization precision, or code size until the warning disappears.
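Since a log warning is easy to miss, some participants may prefer a hard failure in their own tooling. The following is a minimal sketch of such a wrapper; `ArtifactTooLargeError` and `assert_within_cap` are hypothetical names and are not part of the repository.

```python
CAP_BYTES = 16_000_000  # decimal cap described in the README


class ArtifactTooLargeError(RuntimeError):
    """Raised when the artifact exceeds the cap (hypothetical exception type)."""


def assert_within_cap(total_artifact: int, cap: int = CAP_BYTES) -> int:
    """Return remaining headroom in bytes, or raise if the cap is exceeded."""
    if total_artifact > cap:
        raise ArtifactTooLargeError(
            f"artifact {total_artifact} exceeds {cap} byte cap "
            f"by {total_artifact - cap} bytes"
        )
    return cap - total_artifact


print(assert_within_cap(15_900_000))  # 100000
```

The comparison mirrors the script's own check: an artifact of exactly 16,000,000 bytes passes, and anything larger raises.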

## Code Example: Checking Artifact Size Programmatically

You can implement a standalone verification function to check your model against the limit before final submission:

```python
import io

import torch

# Assumption: the helpers are importable from your checkout. In the repository
# they are used inside train_gpt.py, so adjust this import path to match.
from parameter_golf.quantization import quantize_state_dict_int6, compress_bytes

CAP_BYTES = 16_000_000  # decimal cap from the README, not 16 MiB


def compute_artifact_size(state_dict: dict, source_code: str) -> int:
    """Return the total submission artifact size in bytes.

    Args:
        state_dict: Model state dictionary.
        source_code: String containing the training source code.

    Returns:
        Compressed model bytes plus UTF-8 source code bytes.
    """
    # Quantize and serialize the model into an in-memory buffer.
    quant_obj, _ = quantize_state_dict_int6(state_dict)
    buf = io.BytesIO()
    torch.save(quant_obj, buf)

    # Compress the serialized weights and measure the source code.
    compressed_model = compress_bytes(buf.getvalue())
    code_bytes = len(source_code.encode("utf-8"))

    return len(compressed_model) + code_bytes


if __name__ == "__main__":
    # Assumption: checkpoint.pt holds a state dict (tensors keyed by name).
    state_dict = torch.load("checkpoint.pt", map_location="cpu")
    with open("train_gpt.py", "r", encoding="utf-8") as f:
        code = f.read()

    total_bytes = compute_artifact_size(state_dict, code)

    if total_bytes > CAP_BYTES:
        print(f"❌ Exceeds limit by {total_bytes - CAP_BYTES} bytes")
    else:
        print(f"✅ Within limit. Headroom: {CAP_BYTES - total_bytes} bytes")
```

This utility allows you to iterate on model compression strategies without waiting for the full training run to complete.

## Summary

- The **16MB limit** is strictly defined as **16,000,000 bytes** (decimal), not 16 MiB, as documented in [`README.md`](https://github.com/openai/parameter-golf/blob/main/README.md) line 178.
- Enforcement occurs in **[`train_gpt.py`](https://github.com/openai/parameter-golf/blob/main/train_gpt.py) lines 1243–1249**, where the script calculates `total_artifact` as the sum of compressed model weights and source code bytes.
- The model is first **quantized to int6/int8**, serialized to a `BytesIO` buffer, then compressed using **zstd-22** (preferred) or **zlib-9**.
- The system **logs a warning** if the limit is exceeded but does **not** automatically abort training, requiring manual intervention to reduce model size or code length.

## Frequently Asked Questions

### What exactly counts toward the 16MB artifact size limit?

The total artifact size includes two components: the **compressed model weights** and the **UTF-8 encoded source code**. The model weights are quantized to int6 or int8, serialized using `torch.save()` to a `BytesIO` buffer, then compressed with zstd-22 or zlib-9. The source code size is calculated as `len(code.encode("utf-8"))`. Both values are summed to produce `total_artifact`, which must be ≤ 16,000,000 bytes.

### Does the training script automatically stop if I exceed the 16MB limit?

No, the training script does **not** abort automatically. According to the implementation in [`train_gpt.py`](https://github.com/openai/parameter-golf/blob/main/train_gpt.py) lines 1243–1249, the script only emits a **WARNING** log message stating by how many bytes the submission exceeds the cap. Participants must monitor these logs and manually adjust their model architecture, quantization precision, or code size until the artifact fits within the 16,000,000-byte limit before final submission.

### What compression algorithm is used for the model artifact?

The repository uses a fallback compression strategy implemented in `compress_bytes()`. The **primary** algorithm is **zstd-22** (Zstandard compression at level 22), which typically compresses quantized weights more tightly than DEFLATE. If the zstandard library is not available in the environment, the system falls back to **zlib-9** (DEFLATE compression at level 9). The fallback keeps the script runnable without the optional dependency, but the two compressors produce different output sizes, so the reported artifact size depends on which one was used.

### How can I check my artifact size locally before submitting?

You can replicate the official size calculation using the `quantize_state_dict_int6()` and `compress_bytes()` functions provided in the repository. Load your model’s state dictionary, quantize it, serialize it to a `BytesIO` buffer using `torch.save()`, compress the buffer contents, then add the UTF-8 byte length of your source code. Compare this total against `16_000_000` to verify compliance before final submission.