# Protobuf Packed vs Unpacked Encoding for Repeated Fields: A Complete Guide

> Explore Protobuf packed vs unpacked encoding for repeated fields. Learn how packed encoding reduces message size and improves parsing performance. A complete guide.

- Repository: [Protocol Buffers/protobuf](https://github.com/protocolbuffers/protobuf)
- Tags: deep-dive
- Published: 2026-03-02

---

**Repeated scalar fields in Protocol Buffers can be encoded either as individual tag-value pairs (unpacked) or as a single length-delimited block (packed), with packed encoding reducing message size by eliminating per-element tags and improving parsing performance.**

Protocol Buffers (protobuf) offers two distinct wire formats for encoding repeated scalar fields. Understanding the differences between protobuf packed vs unpacked encoding for repeated fields is essential for optimizing message size and deserialization performance in high-throughput systems. This guide examines the implementation details in the `protocolbuffers/protobuf` repository, covering wire format internals, compatibility guarantees, and practical C++ examples.

## What Packed and Unpacked Encoding Mean

In protobuf wire format, a repeated field serializes each element sequentially. The encoding method determines whether each element carries its own tag or shares a single tag as a block.

### Wire Format Differences

| Encoding | Wire Type | Layout |
|----------|-----------|--------|
| **Unpacked** | `VARINT`, `FIXED32`, or `FIXED64` for numeric scalars (`LEN-DELIMITED` for strings, bytes, and messages) | Each element is written as a full tag + value pair. |
| **Packed** | `LEN-DELIMITED` | All elements are concatenated into a single length-delimited block; the tag is written once, followed by the total byte count and the raw values. |

The packed form reduces message size and improves parsing speed for large repeated numeric fields because the tag (1 byte for field numbers 1 through 15, 2 bytes for field numbers 16 through 2047) is written once instead of once per element.

### Which Types Support Packing

Packed encoding is only allowed for **numeric scalar types**: `int32`, `int64`, `uint32`, `uint64`, `sint32`, `sint64`, `bool`, `enum`, `fixed32`, `fixed64`, `sfixed32`, `sfixed64`, `float`, and `double`.

Repeated `string`, `bytes`, and message fields can **never** be packed: each element is already length-delimited, so every element needs its own tag and length prefix to mark where it ends.

## How the Protobuf Library Chooses the Encoding

The decision between packed and unpacked happens at three layers: descriptor inspection, serialization, and deserialization.

### Descriptor Level

The `FieldDescriptor::is_packed()` accessor in `src/google/protobuf/descriptor.cc` (around line 4248) tells the runtime whether the field was declared with the `packed=true` option or the proto3 default.

```cpp
// Conceptual representation from descriptor.cc
bool FieldDescriptor::is_packed() const {
  // Returns true if packed option is set or proto3 default applies
  return internal::cpp::IsFieldPacked(this);
}
```

### Serialization Logic

When `WireFormat::SerializeWithCachedSizes` (or the low-level `WireFormatLite`) processes a repeated field, it checks `field->is_packed()` to select the code path. The implementation in `src/google/protobuf/wire_format.cc` (lines 1269-1295) handles the packed case:

```cpp
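// Conceptual excerpt from a macro body: TYPE_METHOD is a macro parameter
// naming the scalar type, and ## is preprocessor token pasting.
if (field->is_packed()) {
  // Write a length-delimited block with all values
  target = stream->Write##TYPE_METHOD##Packed(...);
}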
```

For packed fields, the serializer concatenates all primitive values into a contiguous byte array, prefixes it with the field tag and total length, and writes it as a single `LEN-DELIMITED` record.
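
To make the two layouts concrete, here is a minimal hand-rolled sketch that does not use the protobuf library. The helper names (`AppendVarint`, `EncodeUnpacked`, `EncodePacked`) are hypothetical, and the sketch assumes a field whose elements are plain non-negative varints; it is not how the library itself is implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; not part of the protobuf API.

// Appends `value` as a base-128 varint: 7 payload bits per byte, with the
// high bit set on every byte except the last.
inline void AppendVarint(std::uint64_t value, std::vector<std::uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>(value | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Unpacked: one (tag, value) pair per element, using wire type 0 (VARINT).
inline std::vector<std::uint8_t> EncodeUnpacked(
    std::uint32_t field_number, const std::vector<std::uint64_t>& values) {
  assert(field_number > 0);
  std::vector<std::uint8_t> out;
  for (std::uint64_t v : values) {
    AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | 0, out);
    AppendVarint(v, out);
  }
  return out;
}

// Packed: one tag with wire type 2 (length-delimited), then the payload byte
// count, then the concatenated varints.
inline std::vector<std::uint8_t> EncodePacked(
    std::uint32_t field_number, const std::vector<std::uint64_t>& values) {
  assert(field_number > 0);
  std::vector<std::uint8_t> payload;
  for (std::uint64_t v : values) AppendVarint(v, payload);

  std::vector<std::uint8_t> out;
  if (payload.empty()) return out;  // an empty packed field is omitted
  AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | 2, out);
  AppendVarint(payload.size(), out);
  out.insert(out.end(), payload.begin(), payload.end());
  return out;
}
```

Calling `EncodePacked(2, {10, 20})` yields four bytes (tag, length, and two one-byte varints), while `EncodeUnpacked(1, {10, 20})` repeats the tag before each value.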

### Deserialization Logic

During parsing, the generic decoder in `WireFormat` examines the wire-type. If it encounters a length-delimited field where a packed field is expected, it forwards the payload to the packed-reader helpers in [`src/google/protobuf/wire_format_lite.h`](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wire_format_lite.h) (e.g., `ReadPackedPrimitive` around lines 306-311):

```cpp
template <typename CType, enum FieldType DeclaredType>
inline bool WireFormatLite::ReadPackedPrimitive(
    io::CodedInputStream* input, RepeatedField<CType>* values) {
  // Body elided: reads the length prefix, then parses each element in the block.
}
```

The parser also tolerates the unpacked representation for packed fields, enabling forward- and backward-compatible parsing across different protobuf versions.
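
The tolerant behavior described above can be sketched without the library. The following is a simplified stand-in, not the actual `WireFormat` code: `ReadVarint` and `DecodeRepeatedVarint` are hypothetical names, and the sketch assumes a message that contains only varint and length-delimited fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration; not part of the protobuf API.

// Reads one varint starting at `pos`. Returns false on truncated or
// overlong input.
inline bool ReadVarint(const std::vector<std::uint8_t>& in, std::size_t& pos,
                       std::uint64_t& value) {
  value = 0;
  for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
    const std::uint8_t byte = in[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return true;
  }
  return false;
}

// Collects every element of the varint field `field_number`, accepting both
// the unpacked (wire type 0) and packed (wire type 2) representations, even
// when they are mixed in one message. Returns false on malformed input.
inline bool DecodeRepeatedVarint(const std::vector<std::uint8_t>& in,
                                 std::uint32_t field_number,
                                 std::vector<std::uint64_t>& values) {
  assert(field_number > 0);
  values.clear();
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uint64_t tag = 0;
    if (!ReadVarint(in, pos, tag)) return false;
    const std::uint64_t number = tag >> 3;
    const unsigned wire_type = static_cast<unsigned>(tag & 7);

    if (wire_type == 0) {  // VARINT: a single tag/value pair
      std::uint64_t v = 0;
      if (!ReadVarint(in, pos, v)) return false;
      if (number == field_number) values.push_back(v);
    } else if (wire_type == 2) {  // length-delimited: possibly a packed block
      std::uint64_t len = 0;
      if (!ReadVarint(in, pos, len) || len > in.size() - pos) return false;
      const std::size_t end = pos + static_cast<std::size_t>(len);
      if (number == field_number) {
        while (pos < end) {
          std::uint64_t v = 0;
          if (!ReadVarint(in, pos, v) || pos > end) return false;
          values.push_back(v);
        }
      }
      pos = end;  // skip the payload of unrelated fields
    } else {
      return false;  // other wire types are left out of this sketch
    }
  }
  return true;
}
```

The decoder never consults the field's declared packing status: it dispatches purely on the wire type it sees, which is what makes the two encodings interchangeable on the reading side.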

## Wire Compatibility Between Packed and Unpacked

Protobuf guarantees that packed and unpacked encodings are mutually compatible. You can upgrade a field from unpacked to packed (or vice versa) without breaking wire compatibility.

### Packed to Unpacked Parsing

A message serialized with a packed field can be parsed by a decoder expecting unpacked fields. The parser detects the length-delimited payload, enters the packed-reading loop, and extracts each element individually. This behavior is exercised in the unit test `ParseUnpackedFromPacked` (an unpacked destination parsed from packed data) in [`src/google/protobuf/wire_format_unittest.h`](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wire_format_unittest.h) (lines 1429-1442).

### Unpacked to Packed Parsing

Conversely, a decoder expecting a packed field will accept the unpacked representation. The parser reads one tag/value pair at a time and appends each element to the repeated field. This is tested by `ParsePackedFromUnpacked` (a packed destination parsed from unpacked data) in [`src/google/protobuf/wire_format_unittest.h`](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wire_format_unittest.h) (lines 1314-1327).

Thus, both encodings are wire-compatible; the only difference is the on-the-wire size and parsing efficiency.

## When to Use Packed Encoding

### Proto3 Defaults

In **proto3**, repeated fields of numeric scalar types default to packed encoding automatically. You do not need to specify any options to get the space-saving benefits.

### Proto2 Explicit Configuration

In **proto2**, repeated fields default to unpacked encoding. You must explicitly add the `[packed = true]` option to enable the packed format:

```proto
repeated int32 values = 1 [packed = true];
```

Use packed encoding for any repeated scalar field containing numeric types, especially when the field typically contains many elements.

## Performance Impact

### Message Size Reduction

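A packed field stores each value as its raw binary encoding without per-element tags. The tag is written once, followed by a single length prefix, so the saving is roughly **1 byte per element** for field numbers 1 through 15 (2 bytes per element for field numbers 16 through 2047). For a repeated field containing 1,000 integers in field 1, the unpacked form spends 1,000 bytes on tags while the packed form spends 3 bytes (a 1-byte tag plus a 2-byte length), saving 997 bytes, or just under 1 KB per message.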
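
The arithmetic can be checked with a small calculator. This is an illustrative sketch: `VarintSize`, `TagSize`, and `PackedSavings` are hypothetical names rather than library functions, and the result counts only tag and length overhead, since the element payload is identical in both encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers for illustration; not part of the protobuf API.

// Number of bytes a value occupies when encoded as a varint (1 to 10).
inline std::size_t VarintSize(std::uint64_t value) {
  std::size_t bytes = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++bytes;
  }
  return bytes;
}

// Size of a field tag, which is the varint (field_number << 3 | wire_type).
// The wire type occupies the low 3 bits and does not change the size.
inline std::size_t TagSize(std::uint32_t field_number) {
  assert(field_number > 0);
  return VarintSize(static_cast<std::uint64_t>(field_number) << 3);
}

// Bytes saved by packing `count` elements whose encoded values total
// `payload_bytes`. Unpacked spends one tag per element; packed spends one
// tag plus one length prefix. The result is negative when packing costs more
// than it saves (for example, a single element).
inline std::int64_t PackedSavings(std::uint32_t field_number,
                                  std::size_t count,
                                  std::size_t payload_bytes) {
  if (count == 0) return 0;  // an empty repeated field emits nothing either way
  const std::size_t unpacked_overhead = count * TagSize(field_number);
  const std::size_t packed_overhead =
      TagSize(field_number) + VarintSize(payload_bytes);
  return static_cast<std::int64_t>(unpacked_overhead) -
         static_cast<std::int64_t>(packed_overhead);
}
```

With 1,000 one-byte values, `PackedSavings(1, 1000, 1000)` evaluates to 997, and moving the same data to field number 16 raises the saving to 1996 because each unpacked tag grows to 2 bytes.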

### Parsing Speed

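For fixed-width types (`fixed32`, `fixed64`, `sfixed32`, `sfixed64`, `float`, `double`), the parser reads the length once and, on little-endian hosts, can copy the raw bytes straight into the `RepeatedField` buffer. Varint-encoded types must still be decoded one element at a time, but the packed loop skips the per-element tag read and field dispatch, which typically makes it faster than the unpacked loop for large collections.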
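
The bulk-copy idea for fixed-width types can be sketched as follows. This is an assumption-laden illustration, not the library's code: `DecodePackedFixed32` and `HostIsLittleEndian` are hypothetical names, and the input is assumed to be the payload of a packed `fixed32` block, meaning the bytes that follow the tag and length prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers for illustration; not part of the protobuf API.

// Detects the host byte order at run time.
inline bool HostIsLittleEndian() {
  const std::uint32_t probe = 1;
  std::uint8_t first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

// Decodes the payload of a packed fixed32 block. Fixed-width values are
// little-endian on the wire, so a little-endian host can copy the whole
// block in one call; other hosts assemble each value byte by byte.
inline std::vector<std::uint32_t> DecodePackedFixed32(
    const std::uint8_t* payload, std::size_t size) {
  assert(size % sizeof(std::uint32_t) == 0);
  std::vector<std::uint32_t> values(size / sizeof(std::uint32_t));
  if (values.empty()) return values;

  if (HostIsLittleEndian()) {
    std::memcpy(values.data(), payload, size);  // one bulk copy, no per-element work
  } else {
    for (std::size_t i = 0; i < values.size(); ++i) {
      const std::uint8_t* p = payload + 4 * i;
      values[i] = static_cast<std::uint32_t>(p[0]) |
                  (static_cast<std::uint32_t>(p[1]) << 8) |
                  (static_cast<std::uint32_t>(p[2]) << 16) |
                  (static_cast<std::uint32_t>(p[3]) << 24);
    }
  }
  return values;
}
```

An unpacked `fixed32` field cannot take the bulk path because a tag byte sits between every pair of values, so the values are never contiguous in the input.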

## Code Examples

### Proto Definition

```proto
syntax = "proto3";

message Sample {
  // Unpacked (explicitly disabled)
  repeated int32 values_unpacked = 1 [packed = false];

  // Packed (default in proto3, explicit in proto2)
  repeated int32 values_packed = 2 [packed = true];
}
```

### C++ Serialization and Wire Inspection

```cpp
#include "sample.pb.h"
#include <iomanip>
#include <iostream>
#include <string>

int main() {
  Sample msg;
  msg.add_values_unpacked(10);
  msg.add_values_unpacked(20);
  msg.add_values_packed(10);
  msg.add_values_packed(20);

  std::string data;
  msg.SerializeToString(&data);

  // Hex-dump the raw bytes
  for (unsigned char c : data) {
    std::cout << std::hex << std::setw(2) << std::setfill('0')
              << static_cast<int>(c) << ' ';
  }
  std::cout << std::dec << '\n';
}
```

**Output (hex)**

```
08 0a 08 14 12 02 0a 14
```

**Explanation**

- `08 0a 08 14` is field 1 (unpacked): `0x08` = (field 1 << 3) | VARINT, so the tag is repeated before each value (10, then 20).
- `12 02 0a 14` is field 2 (packed): `0x12` = (field 2 << 3) | LEN-DELIMITED, followed by the length `0x02` and a block of two varints (`0a 14`).

The packed block size (`0x02`) is the total byte count of the two encoded varints, one byte each.
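
The tag arithmetic is easy to verify in code. A minimal sketch, assuming single-byte tags stored in a 32-bit integer; `MakeTag`, `SplitTag`, and `TagParts` are hypothetical names, not library identifiers:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration; not part of the protobuf API.

struct TagParts {
  std::uint32_t field_number;
  std::uint32_t wire_type;  // 0 = VARINT, 2 = length-delimited
};

// Builds a tag: the field number in the upper bits, the wire type in the low 3.
inline std::uint32_t MakeTag(std::uint32_t field_number,
                             std::uint32_t wire_type) {
  assert(field_number > 0 && wire_type <= 5);
  return (field_number << 3) | wire_type;
}

// Splits a tag back into its field number and wire type.
inline TagParts SplitTag(std::uint32_t tag) {
  return TagParts{tag >> 3, tag & 0x7};
}
```

`MakeTag(1, 0)` gives `0x08` and `MakeTag(2, 2)` gives `0x12`, matching the dump. Sending the same two values to the opposite field would use `MakeTag(1, 2)` = `0x0a` for a packed block in field 1 and `MakeTag(2, 0)` = `0x10` for unpacked values in field 2.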

### C++ Parsing Both Forms

```cpp
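#include "sample.pb.h"
#include <iostream>
#include <string>

int main() {
  Sample decoded;

  // ---- Parse packed data into the *unpacked* field (field 1) ----
  {
    // Tag 0x0a = (field 1 << 3) | LEN-DELIMITED, length 2, varints 10 and 20
    std::string packed_data = "\x0a\x02\x0a\x14";
    bool ok = decoded.ParseFromString(packed_data);
    // On success, decoded.values_unpacked() contains {10, 20}
    std::cout << ok << ' ' << decoded.values_unpacked_size() << '\n';
  }

  // ---- Parse unpacked data into the *packed* field (field 2) ----
  {
    // Tag 0x10 = (field 2 << 3) | VARINT, written before each value
    std::string unpacked_data = "\x10\x0a\x10\x14";
    bool ok = decoded.ParseFromString(unpacked_data);  // clears, then parses
    // On success, decoded.values_packed() contains {10, 20}
    std::cout << ok << ' ' << decoded.values_packed_size() << '\n';
  }
}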
```

Both calls succeed because the parser accepts the cross-representation shown in the unit tests.

## Summary

- **Protobuf packed vs unpacked encoding for repeated fields** determines whether each scalar element carries its own tag (unpacked) or shares a single length-delimited block (packed).
- **Packed encoding** is only available for numeric scalar types (integers, floats, booleans, enums) and reduces message size by roughly one byte per element for field numbers 1 through 15, and more for larger field numbers.
- **Proto3** defaults to packed for repeated numeric scalars, while **proto2** defaults to unpacked unless explicitly configured.
- The `FieldDescriptor::is_packed()` method in `src/google/protobuf/descriptor.cc` drives the encoding decision, while `src/google/protobuf/wire_format.cc` and [`src/google/protobuf/wire_format_lite.h`](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wire_format_lite.h) handle the actual serialization and parsing logic.
- **Wire compatibility** is guaranteed: parsers accept both packed and unpacked data regardless of the field's declared packing status, as verified by `ParsePackedFromUnpacked` and `ParseUnpackedFromPacked` unit tests.

## Frequently Asked Questions

### What is the difference between packed and unpacked repeated fields in Protobuf?

Unpacked encoding writes each element as a separate tag-value pair on the wire, while packed encoding concatenates all elements into a single length-delimited block with one shared tag. Packed encoding eliminates the per-element tag overhead (1 byte per element for field numbers 1 through 15, more for larger field numbers) and lets the parser process the whole array as one contiguous block, improving both message size and parsing speed for numeric scalar types.

### Does proto3 use packed encoding by default for repeated fields?

Yes. In proto3, repeated fields of numeric scalar types automatically use packed encoding as the default behavior. You do not need to specify any options to obtain the space-saving benefits. In contrast, proto2 defaults to unpacked encoding for repeated fields, requiring you to explicitly set `[packed = true]` on the field definition to enable the packed format.

### Are packed and unpacked repeated fields wire-compatible?

Yes, packed and unpacked encodings are fully wire-compatible. A parser expecting packed data can successfully parse unpacked data (reading individual tag-value pairs), and a parser expecting unpacked data can parse packed data (reading the length-delimited block and iterating through its contents). This cross-compatibility is enforced by unit tests `ParsePackedFromUnpacked` and `ParseUnpackedFromPacked` in [`src/google/protobuf/wire_format_unittest.h`](https://github.com/protocolbuffers/protobuf/blob/main/src/google/protobuf/wire_format_unittest.h), allowing you to change the packing option without breaking existing deployments.

### When should I avoid using packed encoding for repeated fields?

Avoid packed encoding when you are using proto2 and need compatibility with protobuf implementations prior to version 2.1.0 (released in 2009), which do not recognize the packed wire format. Additionally, packed encoding cannot be used for repeated `string`, `bytes`, or message (sub-message) fields, as packing is only supported for numeric scalar types (integers, floats, booleans, and enums). For all other repeated scalar fields, especially those with many elements, packed encoding is recommended.