How the 16MB Model Artifact Size Limit Is Enforced in OpenAI Parameter Golf

The 16MB model artifact size limit is enforced by calculating the combined byte size of the compressed, quantized model weights and the accompanying source code, then comparing this total against a hard 16,000,000-byte cap defined in train_gpt.py.

The openai/parameter-golf repository implements a strict 16MB model artifact size limit to ensure fair competition in parameter-efficient training challenges. This limit applies to the final submission bundle, which includes both the serialized model weights and the source code used for training. Understanding exactly how this limit is calculated and where it is enforced is critical for participants aiming to optimize their model architecture and quantization strategy.

Where the 16MB Limit Is Defined

The competition rules explicitly document the 16,000,000-byte constraint in the repository’s README. According to line 178 of README.md, the cap is defined as 16,000,000 bytes (decimal), not 16 MiB (which would be 16,777,216 bytes).
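The gap between the two readings of "16MB" is easy to quantify. A quick check in Python, using illustrative variable names that do not come from the repository:

```python
# Decimal megabytes (the reading the README cap uses) vs. binary mebibytes.
CAP_DECIMAL = 16_000_000        # 16 MB, decimal
CAP_BINARY = 16 * 1024 ** 2     # 16 MiB

# Bytes a participant would wrongly assume they have if they read the cap as 16 MiB.
difference = CAP_BINARY - CAP_DECIMAL
print(CAP_BINARY, difference)
```

A submission sized against the binary figure could overshoot the real cap by up to 777,216 bytes.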

The actual enforcement logic resides in the primary training script train_gpt.py at lines 1243–1249. This is where the final artifact size is computed and validated against the limit before submission.

How Artifact Size Is Calculated

The total artifact size is the sum of two components: the compressed model weights and the UTF-8 encoded source code. The calculation follows a specific serialization pipeline designed to maximize compression efficiency.

Model Quantization and Serialization

Before size calculation, the model’s state dictionary undergoes aggressive quantization to shrink the serialized artifact. The repository uses quantize_state_dict_int6() (or int8 variants) to convert floating-point weights into low-precision integers.

The quantized state is then serialized into an in-memory io.BytesIO buffer using torch.save():

state = base_model.state_dict()
quant_obj, quant_stats = quantize_state_dict_int6(state)
quant_buf = io.BytesIO()
torch.save(quant_obj, quant_buf)
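To get a feel for what the precision choice buys, here is a back-of-envelope bound. It assumes each weight costs exactly the stated number of bits and ignores compression gains, quantization scales, and torch.save() overhead, none of which this article specifies. The helper name max_params is hypothetical.

```python
CAP_BYTES = 16_000_000

def max_params(bits_per_weight: int, code_bytes: int = 0) -> int:
    """Rough upper bound on weights that fit under the cap at a fixed bit cost each."""
    budget_bits = (CAP_BYTES - code_bytes) * 8
    return budget_bits // bits_per_weight

print(max_params(6))                      # 6-bit weights, no code counted
print(max_params(8))                      # 8-bit weights, no code counted
print(max_params(6, code_bytes=50_000))   # 6-bit weights with 50 kB of source
```

The real parameter count depends on how well the quantized tensors compress, so treat these numbers as orientation only.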

Compression Strategy

The raw bytes from the serialization buffer are compressed using compress_bytes(), which implements a fallback strategy:

  1. Primary: zstd-22 (Zstandard compression level 22) if the zstandard library is available
  2. Fallback: zlib-9 (DEFLATE compression level 9) if zstd is unavailable

quant_blob = compress_bytes(quant_buf.getvalue())

The resulting quant_blob represents the compressed model weights that contribute to the final size calculation.
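The fallback described above could look roughly like the following. This is a sketch under stated assumptions, not the repository's compress_bytes(): the function names compress_with_fallback and decompress are hypothetical, and returning the codec name alongside the blob is a choice made here for illustration.

```python
import zlib

try:
    import zstandard  # optional third-party dependency
except ImportError:
    zstandard = None

def compress_with_fallback(raw: bytes) -> tuple[str, bytes]:
    """Return (codec_name, compressed_bytes), preferring zstd level 22."""
    if zstandard is not None:
        return "zstd-22", zstandard.ZstdCompressor(level=22).compress(raw)
    return "zlib-9", zlib.compress(raw, 9)

def decompress(codec: str, blob: bytes) -> bytes:
    """Invert compress_with_fallback for the given codec name."""
    if codec == "zstd-22":
        return zstandard.ZstdDecompressor().decompress(blob)
    return zlib.decompress(blob)

# Highly repetitive stand-in payload; real quantized weights compress far less.
payload = bytes(range(64)) * 1000
codec, blob = compress_with_fallback(payload)
print(codec, len(payload), "->", len(blob))
```

Recording which codec ran matters because the two produce different blob sizes for the same input, so a size measured with one does not transfer to the other.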

Adding Source Code Size

The submission must include the training source code. The system calculates the UTF-8 byte length of the source code string:

code_bytes = len(code.encode("utf-8"))
total_artifact = len(quant_blob) + code_bytes

This total_artifact value is the definitive metric compared against the 16,000,000-byte limit.
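Note that the code component is measured in encoded bytes, not characters, so non-ASCII characters in the source cost more than one byte each. A small illustration, with a made-up snippet and a placeholder blob standing in for real compressed weights:

```python
# "\u00ef" (i with diaeresis) encodes to 2 bytes in UTF-8; "\u2705" (check mark) to 3.
snippet = 'print("na\u00efve \u2705")\n'

chars = len(snippet)                        # character count
code_bytes = len(snippet.encode("utf-8"))   # what counts toward the cap

quant_blob = b"\x00" * 100                  # placeholder for compressed weights
total_artifact = len(quant_blob) + code_bytes
print(chars, code_bytes, total_artifact)
```

For a mostly ASCII training script the difference is small, but it explains why the byte count can exceed the character count reported by an editor.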

The Enforcement Mechanism in train_gpt.py

The enforcement occurs in the main training loop of train_gpt.py (lines 1243–1249) when the master process prepares the final submission. The logic performs a simple numeric comparison:

if total_artifact > 16_000_000:
    log0(f"WARNING: artifact {total_artifact} exceeds 16,000,000 byte cap "
         f"by {total_artifact - 16_000_000} bytes!")
else:
    log0(f"artifact headroom: {16_000_000 - total_artifact} bytes "
         f"({(16_000_000 - total_artifact)/1e6:.3f}MB)")

Critical behavior: The script does not abort if the limit is exceeded. It only logs a warning stating by how many bytes the submission exceeds the cap. Participants must monitor these logs and manually adjust their model architecture, quantization precision, or code size until the warning disappears.
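Because the script only warns, a pipeline that should stop on an oversized artifact needs its own check. A minimal sketch, assuming a hypothetical helper named check_artifact that is not part of the repository:

```python
CAP_BYTES = 16_000_000

def check_artifact(total_artifact: int, cap: int = CAP_BYTES) -> int:
    """Return the remaining headroom in bytes, or raise ValueError if over the cap."""
    if total_artifact > cap:
        raise ValueError(
            f"artifact {total_artifact} exceeds {cap} byte cap "
            f"by {total_artifact - cap} bytes"
        )
    return cap - total_artifact

# Exactly at the cap is allowed, mirroring the `>` comparison in the script.
print(check_artifact(15_999_000))
print(check_artifact(16_000_000))
```

Raising an exception instead of logging makes an oversized artifact fail loudly in an automated run rather than relying on someone reading the log.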

Code Example: Checking Artifact Size Programmatically

You can implement a standalone verification function to check your model against the limit before final submission:

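import io
import torch
# Assumed module path: adjust this import to wherever these helpers live
# in your checkout (for example train_gpt.py).
from parameter_golf.quantization import quantize_state_dict_int6, compress_bytes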

def compute_artifact_size(state_dict: dict, source_code: str) -> int:
    """
    Calculate total submission artifact size in bytes.
    
    Args:
        state_dict: Model state dictionary
        source_code: String containing training source code
    
    Returns:
        Total bytes (compressed model + code)
    """
    # Quantize and serialize model
    quant_obj, _ = quantize_state_dict_int6(state_dict)
    buf = io.BytesIO()
    torch.save(quant_obj, buf)

    # Compress and calculate sizes
    compressed_model = compress_bytes(buf.getvalue())
    code_bytes = len(source_code.encode('utf-8'))

    return len(compressed_model) + code_bytes

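# Example usage
if __name__ == "__main__":
    # Load your trained model and source code.
    # torch.load returns whatever was saved: a full module or a bare state dict.
    checkpoint = torch.load("checkpoint.pt", map_location="cpu")
    state = checkpoint.state_dict() if hasattr(checkpoint, "state_dict") else checkpoint
    with open("train_gpt.py", "r", encoding="utf-8") as f:
        code = f.read()

    total_bytes = compute_artifact_size(state, code)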
    
    if total_bytes > 16_000_000:
        print(f"❌ Exceeds limit by {total_bytes - 16_000_000} bytes")
    else:
        print(f"✅ Within limit. Headroom: {16_000_000 - total_bytes} bytes")

This utility allows you to iterate on model compression strategies without waiting for the full training run to complete.

Summary

  • The 16MB limit is strictly defined as 16,000,000 bytes (decimal), not 16 MiB, as documented in README.md line 178.
  • Enforcement occurs in train_gpt.py lines 1243–1249, where the script calculates total_artifact as the sum of compressed model weights and source code bytes.
  • The model is first quantized to int6/int8, serialized to a BytesIO buffer, then compressed using zstd-22 (preferred) or zlib-9.
  • The system logs a warning if the limit is exceeded but does not automatically abort training, requiring manual intervention to reduce model size or code length.

Frequently Asked Questions

What exactly counts toward the 16MB artifact size limit?

The total artifact size includes two components: the compressed model weights and the UTF-8 encoded source code. The model weights are quantized to int6 or int8, serialized using torch.save() to a BytesIO buffer, then compressed with zstd-22 or zlib-9. The source code size is calculated as len(code.encode("utf-8")). Both values are summed to produce total_artifact, which must be ≤ 16,000,000 bytes.

Does the training script automatically stop if I exceed the 16MB limit?

No. According to the implementation in train_gpt.py lines 1243–1249, the script only emits a WARNING log message stating by how many bytes the submission exceeds the cap. Participants must monitor these logs and manually adjust their model architecture, quantization precision, or code size until the artifact fits within the 16,000,000-byte limit before final submission.

What compression algorithm is used for the model artifact?

The repository uses a fallback compression strategy implemented in compress_bytes(). The primary algorithm is zstd-22 (Zstandard compression at level 22), which typically achieves better compression ratios than DEFLATE. If the zstandard library is not available in the environment, the system falls back to zlib-9 (DEFLATE compression at level 9). The fallback keeps the script usable without the extra dependency, but the two compressors produce different blob sizes for the same weights, so measure your artifact with the same compressor that will be used for the final submission.

How can I check my artifact size locally before submitting?

You can replicate the official size calculation using the quantize_state_dict_int6() and compress_bytes() functions provided in the repository. Load your model’s state dictionary, quantize it, serialize it to a BytesIO buffer using torch.save(), compress the buffer contents, then add the UTF-8 byte length of your source code. Compare this total against 16_000_000 to verify compliance before final submission.
