Understanding Protobuf Dynamic Messages for Runtime Message Construction

Protobuf dynamic messages enable runtime construction of protocol buffer messages from descriptors without compile-time generated code, using DynamicMessageFactory to cache prototypes and DynamicMessage to store field data with identical memory layout to generated classes.

The protocolbuffers/protobuf repository provides a dynamic message API that allows developers to construct and manipulate protocol buffer messages at runtime using descriptors loaded from .proto files or created programmatically. This capability is essential for applications that need to handle message types unknown at compile time, such as generic proxies, scripting engines, or configuration systems.

Core Architecture

The dynamic message implementation centers on two primary classes in the google::protobuf namespace:

  • DynamicMessageFactory – Maintains a cache of prototype messages for each Descriptor. It creates the prototype on the first request and returns the same object for subsequent requests of the same descriptor.
  • DynamicMessage – A concrete Message implementation that stores field data in a layout identical to generated messages. All reflective operations including parsing, serialization, and field access delegate to the generic GeneratedMessageReflection code.

How Dynamic Messages Are Built

Construction proceeds in four stages: prototype lookup in the factory cache, layout computation, field construction, and cross-linking of sub-message prototypes. Each stage is described below.

Prototype Caching

When DynamicMessageFactory::GetPrototype() receives a Descriptor, it first checks its internal cache. If a prototype exists, it returns immediately; otherwise, it initiates the construction sequence. This caching mechanism ensures that all instances of the same message type share identical metadata and memory layout specifications.
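
The check-then-build behavior described above can be modeled in a few lines. This is a minimal sketch, not the library's code: FakeDescriptor, FakePrototype, and PrototypeCache are hypothetical stand-ins, and the sketch assumes a single mutex guards the map.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-ins for Descriptor and the prototype message.
struct FakeDescriptor { const char* full_name; };
struct FakePrototype { const FakeDescriptor* type; };

class PrototypeCache {
 public:
  // Returns the cached prototype for `type`, building it on first use.
  const FakePrototype* GetPrototype(const FakeDescriptor* type) {
    assert(type != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    std::unique_ptr<FakePrototype>& slot = prototypes_[type];
    if (slot == nullptr) {
      slot = std::make_unique<FakePrototype>(FakePrototype{type});
      ++builds_;  // counts how many prototypes were actually constructed
    }
    return slot.get();
  }

  int builds() const { return builds_; }

 private:
  std::mutex mutex_;
  std::unordered_map<const FakeDescriptor*, std::unique_ptr<FakePrototype>>
      prototypes_;
  int builds_ = 0;
};
```

Keying the map on the descriptor pointer means two lookups for the same descriptor return the same object, while distinct descriptors get distinct prototypes.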

TypeInfo and Memory Layout

The factory creates a DynamicMessageFactory::TypeInfo structure that records critical offset data for the message class. According to the definition in src/google/protobuf/dynamic_message.h (lines 70–90), this structure tracks:

  • Field offsets within the message object
  • The location of the has-bits bitmap for tracking field presence
  • Oneof case word offsets
  • Extension set offsets

In src/google/protobuf/dynamic_message.cc (lines 777–805), the GetPrototypeNoLock() method computes the exact byte size required for the message. It iterates through each field, determining space requirements via FieldSpaceUsed() (e.g., sizeof(int32_t) for primitives, sizeof(RepeatedField<T>) for repeated fields). The algorithm aligns offsets to kSafeAlignment (8 bytes) and inserts padding as needed to match generated message layouts.
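
The offset arithmetic can be illustrated with a simplified model. The sketch below assumes each field is aligned to its own size, capped at kSafeAlignment, and that the total is rounded up to kSafeAlignment. AlignTo, Layout, and ComputeLayout are illustrative names, and the field sizes are passed in directly rather than derived from a descriptor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kSafeAlignment = 8;  // value stated in the text above

// Rounds `offset` up to the next multiple of `alignment`.
inline std::size_t AlignTo(std::size_t offset, std::size_t alignment) {
  assert(alignment > 0);
  return (offset + alignment - 1) / alignment * alignment;
}

struct Layout {
  std::vector<std::size_t> offsets;  // one entry per field
  std::size_t total_size = 0;        // bytes to allocate per instance
};

// `header_size` stands in for the fixed part of the object; `field_sizes`
// stands in for the per-field results of FieldSpaceUsed().
inline Layout ComputeLayout(std::size_t header_size,
                            const std::vector<std::size_t>& field_sizes) {
  Layout layout;
  std::size_t size = AlignTo(header_size, kSafeAlignment);
  for (std::size_t field_size : field_sizes) {
    assert(field_size > 0);
    // Align each field to its own size, capped at kSafeAlignment.
    size = AlignTo(size, std::min(kSafeAlignment, field_size));
    layout.offsets.push_back(size);
    size += field_size;
  }
  layout.total_size = AlignTo(size, kSafeAlignment);
  return layout;
}
```

With a 16-byte header and field sizes {4, 8, 1, 4}, this model yields offsets {16, 24, 32, 36} and a total size of 40: four bytes of padding precede the 8-byte field, and three precede the last 4-byte field.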

Field Construction

Raw memory allocation occurs via internal::Allocate(size), followed by placement-new construction of the DynamicMessage object. The SharedCtor implementation (starting at line 1010 in src/google/protobuf/dynamic_message.cc) uses placement-new to initialize each field in its pre-computed slot, applying default values where appropriate. For example, scalar fields initialize with new (field_ptr) int32_t{field->default_value_int32()}.
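
The allocate-then-construct pattern looks roughly like the sketch below. It is a standalone illustration with a hypothetical two-field layout (RawMessage, kIdOffset, and kNameOffset are invented names), it uses plain ::operator new rather than the library's allocator, and it skips exception safety for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>

// Hypothetical precomputed layout: an int32 "id" followed by a string "name".
constexpr std::size_t kIdOffset = 0;
constexpr std::size_t kNameOffset =
    (sizeof(std::int32_t) + alignof(std::string) - 1) /
    alignof(std::string) * alignof(std::string);
constexpr std::size_t kTotalSize = kNameOffset + sizeof(std::string);

class RawMessage {
 public:
  RawMessage(std::int32_t default_id, const std::string& default_name)
      : base_(static_cast<unsigned char*>(::operator new(kTotalSize))) {
    assert(reinterpret_cast<std::uintptr_t>(base_) % alignof(std::string) == 0);
    // Construct each field in its precomputed slot, seeded with its default.
    id_ = new (base_ + kIdOffset) std::int32_t{default_id};
    name_ = new (base_ + kNameOffset) std::string(default_name);
  }

  ~RawMessage() {
    // Only non-trivial fields need an explicit destructor call.
    name_->~basic_string();
    ::operator delete(base_);
  }

  RawMessage(const RawMessage&) = delete;
  RawMessage& operator=(const RawMessage&) = delete;

  std::int32_t& id() { return *id_; }
  std::string& name() { return *name_; }

 private:
  unsigned char* base_;         // single raw block holding every field
  std::int32_t* id_ = nullptr;  // points into base_
  std::string* name_ = nullptr; // points into base_
};
```

Because the object owns one raw block, destruction has to mirror construction: non-trivial fields are destroyed explicitly before the block is released.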

Cross-Linking Prototypes

After field construction, DynamicMessage::CrossLinkPrototypes() (lines 1022–1024 in src/google/protobuf/dynamic_message.cc) processes singular message-type fields. It sets pointers to the prototype instances of sub-messages, enabling dynamic messages to share default instances for nested messages—mirroring the behavior of statically generated code.
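
The observable effect of cross-linking can be modeled as follows. This sketch only mirrors the behavior described above (unset sub-message reads resolve to one shared default instance); Parent and Child are hypothetical types, and the real implementation stores these pointers inside the prototype's field slots rather than in a separate member.

```cpp
#include <cassert>
#include <memory>

// Hypothetical sub-message type.
struct Child { int value = 0; };

class Parent {
 public:
  explicit Parent(const Child* child_prototype)
      : child_prototype_(child_prototype) {
    assert(child_prototype != nullptr);
  }

  bool has_child() const { return child_ != nullptr; }

  // Reads of an unset field fall back to the shared default instance.
  const Child& child() const { return child_ ? *child_ : *child_prototype_; }

  // The first mutable access allocates a private copy of the default.
  Child* mutable_child() {
    if (!child_) child_ = std::make_unique<Child>(*child_prototype_);
    return child_.get();
  }

 private:
  const Child* child_prototype_;  // cross-linked shared default, not owned
  std::unique_ptr<Child> child_;  // set only after mutation
};
```

Two parents built from the same child prototype return the same object from child() until one of them calls mutable_child(), at which point only that parent gets its own copy.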

Runtime Instantiation

The prototype's New() method (declared on the Message interface, not on MessageFactory) invokes DynamicMessage::NewImpl, which allocates a fresh memory block using the previously computed size and executes the same placement-new construction logic. This creates mutable message instances that share the same reflection schema as the prototype.

Reflection Integration

All field accessors (GetField, SetField, HasField, etc.) operate through the GeneratedMessageReflection object attached to the prototype via type_info_->class_data.set_reflection(new Reflection(...)). Because the memory layout matches generated messages exactly, the generic reflection code functions without modification for dynamic instances.
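
To show why a shared offset table is enough for generic accessors, here is a toy reflection layer over a flat byte buffer. It is a sketch under simplifying assumptions: MiniSchema, MiniReflection, and Buffer are hypothetical names, only int32 fields are handled, and a single 32-bit has-bits word tracks presence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical schema: where the has-bits word and each field live.
struct MiniSchema {
  std::size_t has_bits_offset;
  std::vector<std::size_t> field_offsets;
  std::size_t total_size;
};

using Buffer = std::vector<unsigned char>;

class MiniReflection {
 public:
  explicit MiniReflection(MiniSchema schema) : schema_(std::move(schema)) {}

  // A zero-filled buffer plays the role of a freshly constructed message.
  Buffer NewMessage() const { return Buffer(schema_.total_size, 0); }

  bool HasField(const Buffer& msg, std::size_t field) const {
    return ((LoadHasBits(msg) >> CheckedIndex(field)) & 1u) != 0;
  }

  std::int32_t GetInt32(const Buffer& msg, std::size_t field) const {
    assert(msg.size() == schema_.total_size);
    std::int32_t value;
    std::memcpy(&value, msg.data() + schema_.field_offsets[CheckedIndex(field)],
                sizeof value);
    return value;
  }

  void SetInt32(Buffer& msg, std::size_t field, std::int32_t value) const {
    std::memcpy(msg.data() + schema_.field_offsets[CheckedIndex(field)], &value,
                sizeof value);
    // Record presence in the has-bits word.
    const std::uint32_t bits = LoadHasBits(msg) | (std::uint32_t{1} << field);
    std::memcpy(msg.data() + schema_.has_bits_offset, &bits, sizeof bits);
  }

 private:
  std::size_t CheckedIndex(std::size_t field) const {
    assert(field < schema_.field_offsets.size() && field < 32);
    return field;
  }

  std::uint32_t LoadHasBits(const Buffer& msg) const {
    assert(msg.size() == schema_.total_size);
    std::uint32_t bits;
    std::memcpy(&bits, msg.data() + schema_.has_bits_offset, sizeof bits);
    return bits;
  }

  MiniSchema schema_;
};
```

The accessor code never needs to know which message type it is handling; everything type-specific lives in the schema, which is the property that lets one reflection implementation serve every message that follows the same layout rules.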

Practical Implementation

The following example demonstrates loading a .proto definition at runtime and constructing messages dynamically:

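#include <memory>
#include <string>

#include <google/protobuf/compiler/parser.h>
#include <google/protobuf/descriptor.h>
#include <google/protobuf/descriptor.pb.h>
#include <google/protobuf/dynamic_message.h>
#include <google/protobuf/io/tokenizer.h>
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
#include <google/protobuf/message.h>

int main() {
  // Load a .proto at runtime
  const std::string proto_text = R"proto(
    syntax = "proto3";
    message Person {
      string name = 1;
      int32  id   = 2;
      repeated string email = 3;
    }
  )proto";

  // compiler::Parser reads from an io::Tokenizer, not directly from a string.
  // No error collector is installed here; pass an io::ErrorCollector when
  // parsing untrusted input so syntax errors are reported.
  google::protobuf::io::ArrayInputStream raw_input(
      proto_text.data(), static_cast<int>(proto_text.size()));
  google::protobuf::io::Tokenizer tokenizer(&raw_input, nullptr);
  google::protobuf::FileDescriptorProto file_proto;
  google::protobuf::compiler::Parser parser;
  if (!parser.Parse(&tokenizer, &file_proto)) return 1;
  file_proto.set_name("person.proto");  // the parser does not set the file name

  google::protobuf::DescriptorPool pool;
  const google::protobuf::FileDescriptor* file_desc = pool.BuildFile(file_proto);
  if (file_desc == nullptr) return 1;
  const google::protobuf::Descriptor* person_desc = file_desc->FindMessageTypeByName("Person");

  // Create factory and obtain prototype (the factory must outlive its messages)
  google::protobuf::DynamicMessageFactory factory;
  const google::protobuf::Message* prototype = factory.GetPrototype(person_desc);

  // Instantiate mutable message
  std::unique_ptr<google::protobuf::Message> person(prototype->New());

  // Populate fields via Reflection
  const google::protobuf::Reflection* refl = person->GetReflection();
  const google::protobuf::FieldDescriptor* name_fd   = person_desc->FindFieldByName("name");
  const google::protobuf::FieldDescriptor* id_fd     = person_desc->FindFieldByName("id");
  const google::protobuf::FieldDescriptor* email_fd  = person_desc->FindFieldByName("email");

  refl->SetString(person.get(), name_fd, "Alice");
  refl->SetInt32(person.get(), id_fd, 123);
  refl->AddString(person.get(), email_fd, "alice@example.com");       // placeholder address
  refl->AddString(person.get(), email_fd, "alice.work@example.com");  // placeholder address

  // Serialize and parse
  std::string serialized = person->SerializeAsString();
  std::unique_ptr<google::protobuf::Message> copy(prototype->New());
  return copy->ParseFromString(serialized) ? 0 : 1;
}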

This pattern supports nested messages, oneof fields, extensions, and map fields transparently through the generic reflection layer.

Performance Characteristics

The dynamic message implementation keeps per-instance overhead low through several design choices:

  • Cache-Friendly Layout – Offsets are calculated once per descriptor and stored in TypeInfo; every instance uses the identical binary layout, so field access is a fixed-offset read and no layout work is repeated per instance or per parse.
  • Prototype Sharing – The factory maintains a single prototype per descriptor, avoiding duplicate metadata and enabling cheap cloning via New().
  • Zero-Copy String Views – When internal::EnableExperimentalMicroString() is enabled, default string fields point directly to shared buffers without copying, as verified in the unit test MicroStringFieldsWithDefaultValuesDontCopyTheDefaultOnCreate.

Summary

  • DynamicMessageFactory caches prototypes per descriptor to ensure consistent memory layout and efficient instance creation.
  • DynamicMessage uses placement-new construction in pre-calculated memory layouts that match generated message formats exactly.
  • Memory alignment follows kSafeAlignment (8 bytes) rules computed in GetPrototypeNoLock() to maintain compatibility with GeneratedMessageReflection.
  • Cross-linking ensures nested message fields share default prototypes, so reading an unset sub-message returns the shared default instance, as in generated code.
  • The reflection API works transparently across both generated and dynamic messages because both use the same underlying layout and GeneratedMessageReflection engine.

Frequently Asked Questions

What is the difference between generated and dynamic protobuf messages?

Generated messages are C++ classes produced by the protoc compiler at build time, offering compile-time type safety and optimized field accessors. Dynamic messages are constructed at runtime from Descriptor objects and accessed exclusively through the reflection API, enabling applications to handle message types not available when the binary was compiled.

How does DynamicMessageFactory improve performance?

DynamicMessageFactory computes memory layouts and reflection metadata once per descriptor, caching the result as a prototype. Subsequent instance creation via prototype->New() reuses this metadata and performs only a memory allocation and placement-new initialization, avoiding repeated layout calculations or descriptor parsing.

Can dynamic messages handle all protobuf features including extensions and oneofs?

Yes. The DynamicMessage implementation supports extensions, oneof fields, maps, and nested messages through the same GeneratedMessageReflection engine used by generated code. The TypeInfo structure explicitly tracks oneof case words and extension set offsets, enabling full feature parity with compiled message types.

Is the memory layout of dynamic messages compatible with generated code?

Yes. The layout algorithm in src/google/protobuf/dynamic_message.cc (lines 777–805) computes field offsets, alignment, and padding to match the exact binary layout of generated messages. This compatibility allows GeneratedMessageReflection to operate on both types without modification, ensuring serialization and parsing behavior remains identical.
