# How ONNX Runtime Handles Dynamic Shapes During Inference: Architecture and API Guide

> Learn how ONNX Runtime handles dynamic shapes during inference with its system for symbolic shape representation, session option overrides, and provider-specific shape propagation for variable batch sizes and sequence lengths.

- Repository: [Microsoft/onnxruntime](https://github.com/microsoft/onnxruntime)
- Tags: architecture
- Published: 2026-04-24

---

**ONNX Runtime resolves dynamic (symbolic) dimensions at runtime through a three-layer system involving symbolic shape representation, user-configurable session option overrides, and provider-specific shape propagation, enabling inference on models with variable batch sizes and sequence lengths.**

ONNX Runtime (ORT) carries dynamic dimensions from model loading through inference. This lets the `microsoft/onnxruntime` engine execute models exported with unknown (`-1`) or named symbolic dimensions (e.g., `"batch"` or `"seq_len"`) without recompiling the model for each new shape. The implementation spans from the shape metadata built around the C++ `TensorShape` class in `core/framework/tensor_type_and_shape.cc` to execution-provider-specific handling in the TensorRT and WebNN backends.

## Symbolic Shape Representation in TensorShape

When parsing an ONNX model, ORT encounters dimensions whose size is unknown; some are unnamed and some carry an explicit symbolic name. The runtime reports these as **symbolic dimensions**: the `TensorShape` holds a `-1` placeholder for the size, and the optional human-readable identifier is kept alongside it.

In `core/framework/tensor_type_and_shape.cc` (lines 106-119), the implementation captures:

- **Placeholder value**: `-1` for any dimension whose size is unknown
- **Symbolic names**: User-friendly identifiers like `"batch"` or `"seq_len"` stored in `dim_params`

This dual representation allows the shape inference engine to propagate symbolic information through the graph even when concrete sizes remain unknown. Operators such as `Reshape` and `MatMul` read these symbolic dimensions to compute output shapes without requiring static allocation during model load.
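To make the dual representation concrete, the sketch below models a dimension in plain Python. It is not ORT code: the `Dim` class and the `report` helper are hypothetical names, chosen only to mirror the idea of a `-1` size placeholder paired with a parallel list of symbolic names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Dim:
    """One tensor dimension: a fixed size, or an unknown size with an optional name."""
    value: Optional[int] = None   # concrete size when known
    name: str = ""                # symbolic name (dim_param) when the size is unknown


def report(shape: List[Dim]) -> Tuple[List[int], List[str]]:
    """Return (sizes, names): -1 marks an unknown size, "" marks an unnamed dimension."""
    sizes = [d.value if d.value is not None else -1 for d in shape]
    names = [d.name if d.value is None else "" for d in shape]
    return sizes, names


# Hypothetical input declared as [batch, seq_len, 256]
shape = [Dim(name="batch"), Dim(name="seq_len"), Dim(value=256)]
sizes, names = report(shape)
print(sizes)  # [-1, -1, 256]
print(names)  # ['batch', 'seq_len', '']
```

Keeping the names in a separate list means code that only needs sizes can ignore them, while shape inference can still tell `"batch"` apart from `"seq_len"`.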

## Runtime Override via SessionOptions

Before session initialization, users bind concrete values to symbolic dimensions using the `SessionOptions` API. This mechanism bridges the gap between symbolic model definitions and the static requirements of certain hardware backends.

The key methods are:

- **`AddFreeDimensionOverride`**: Binds a concrete size to a dimension denotation (a semantic tag such as `DATA_BATCH` that the model attaches to a dimension)
- **`AddFreeDimensionOverrideByName`**: Binds a concrete size to a named symbolic dimension (e.g., `"batch"`)

These overrides are stored internally in `OrtSessionOptions::free_dimension_overrides`, implemented in `core/session/abi_session_options.cc` (lines 30-42). The C API exposes them as the `AddFreeDimensionOverride` and `AddFreeDimensionOverrideByName` entries of the `OrtApi` function table, registered in `core/session/onnxruntime_c_api.cc` at line 4337.
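In Python, both kinds of override are available as `SessionOptions` methods. The helper below is a small sketch: `apply_overrides` and its parameters are hypothetical names, and it only assumes that the object passed in provides `add_free_dimension_override_by_name` and `add_free_dimension_override_by_denotation`.

```python
from typing import Mapping, Optional


def apply_overrides(options,
                    by_name: Optional[Mapping[str, int]] = None,
                    by_denotation: Optional[Mapping[str, int]] = None):
    """Register every override on an onnxruntime.SessionOptions-like object."""
    # Overrides keyed by symbolic name, e.g. {"batch": 2}
    for name, size in (by_name or {}).items():
        options.add_free_dimension_override_by_name(name, size)
    # Overrides keyed by denotation, e.g. {"DATA_BATCH": 2}
    for denotation, size in (by_denotation or {}).items():
        options.add_free_dimension_override_by_denotation(denotation, size)
    return options
```

With ONNX Runtime installed, a call might look like `apply_overrides(ort.SessionOptions(), by_name={"batch": 2, "seq_len": 5})`, made before the `InferenceSession` is constructed.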

## Shape Propagation and Provider Handling

During graph initialization, ORT executes **symbolic shape inference** using the same engine employed for static shapes. The process resolves dimensions through the following flow:

1. **Graph partition**: The inference engine identifies operators with dynamic inputs
2. **Override application**: When a dimension has a registered override, it is treated as a fixed size rather than a dynamic one, and the concrete value propagates through downstream nodes
3. **Memory allocation**: Execution providers allocate buffers **once actual dimensions become known**, either from overrides or runtime input tensors
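The flow above can be sketched in plain Python. This is an illustration, not ORT's shape inference code: `resolve` and `matmul_shape` are hypothetical helpers, shapes are lists mixing integers with symbolic names, and only the case `a = [..., M, K]`, `b = [K, N]` without broadcasting is handled.

```python
from typing import Dict, List, Union

Dim = Union[int, str]  # int = concrete size, str = symbolic name


def resolve(shape: List[Dim], overrides: Dict[str, int]) -> List[Dim]:
    """Replace symbolic names that have an override; leave the rest symbolic."""
    return [overrides.get(d, d) if isinstance(d, str) else d for d in shape]


def matmul_shape(a: List[Dim], b: List[Dim]) -> List[Dim]:
    """Output shape of a @ b for a = [..., M, K] and b = [K, N]."""
    if len(a) < 2 or len(b) != 2:
        raise ValueError("sketch supports a=[..., M, K] and b=[K, N] only")
    k_a, k_b = a[-1], b[0]
    # Two concrete sizes must agree; a symbolic size cannot be checked until run time.
    if isinstance(k_a, int) and isinstance(k_b, int) and k_a != k_b:
        raise ValueError(f"inner dimensions differ: {k_a} vs {k_b}")
    return a[:-1] + [b[1]]


x = ["batch", "seq_len", 256]
w = [256, 64]

print(matmul_shape(x, w))                         # ['batch', 'seq_len', 64]
print(matmul_shape(resolve(x, {"batch": 2}), w))  # [2, 'seq_len', 64]
```

Symbolic names survive propagation untouched, so a dimension that has no override stays unknown until a real input tensor supplies its size.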

Execution providers handle dynamic shapes differently based on backend capabilities:

- **CPU**: Full dynamic shape support using the generic shape inference engine
- **CUDA/TensorRT**: Creates dynamic input profiles on the fly when overrides are missing, relying on TensorRT optimization profiles to cover dimensions whose range is not known in advance (see `core/providers/tensorrt/tensorrt_execution_provider.cc`, lines 3202-3220)
- **WebNN**: Does not support dynamic shapes; requires explicit overrides via `sessionOptions.freeDimensionOverrides` to avoid runtime errors (see `core/providers/webnn/builders/helper.cc`, lines 87-90)
- **OpenVINO**: Materializes symbolic shapes into static dimensions using overrides before compilation (`core/providers/openvino/backend_manager.cc`)
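Because providers differ, it can help to check before deployment whether every symbolic dimension has an override. The sketch below assumes shapes written as integers for fixed sizes, strings for named dimensions, and `None` for unnamed ones. `STATIC_ONLY`, `unresolved_dims`, `check_provider`, and the lowercase provider labels are all hypothetical, and the static-only set simply restates the WebNN entry from the list above.

```python
from typing import Dict, List, Sequence, Union

Dim = Union[int, str, None]

# Assumption for this sketch: providers that need every dimension fixed up front.
STATIC_ONLY = frozenset({"webnn"})


def unresolved_dims(shape: Sequence[Dim], overrides: Dict[str, int]) -> List[str]:
    """Describe each dimension that is still unknown after applying overrides."""
    missing = []
    for axis, dim in enumerate(shape):
        if isinstance(dim, int):
            continue  # already fixed in the model
        if isinstance(dim, str) and dim in overrides:
            continue  # fixed by a by-name override
        missing.append(dim if isinstance(dim, str) else f"<unnamed axis {axis}>")
    return missing


def check_provider(provider: str, shape: Sequence[Dim], overrides: Dict[str, int]) -> None:
    """Raise early if a static-only provider would see an unknown dimension."""
    missing = unresolved_dims(shape, overrides)
    if provider in STATIC_ONLY and missing:
        raise ValueError(f"{provider} needs overrides for: {', '.join(missing)}")


shape = ["batch", "seq_len", 256]
check_provider("cpu", shape, {})                               # passes
check_provider("webnn", shape, {"batch": 1, "seq_len": 128})   # passes
```

A check like this fails fast with the names of the missing dimensions, which is usually easier to act on than a provider error raised later.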

## Setting Dynamic Shape Overrides: Code Examples

### Python API

Use `add_free_dimension_override_by_name` to bind symbolic names before creating the `InferenceSession`:

```python
import onnxruntime as ort
import numpy as np

# Configure session options with concrete dimension values
options = ort.SessionOptions()
options.add_free_dimension_override_by_name("batch", 2)
options.add_free_dimension_override_by_name("seq_len", 5)

# Load a model whose input is declared as ["batch", "seq_len", 256]
sess = ort.InferenceSession("model_with_dynamic.onnx", sess_options=options)

# Create input matching the overridden shapes
# ("input" is a placeholder; use the input name your model declares)
input_data = np.random.randn(2, 5, 256).astype(np.float32)
outputs = sess.run(None, {"input": input_data})
print("Output shape:", outputs[0].shape)
```

The Python binding records the override in the same `free_dimension_overrides` list that `AddFreeDimensionOverrideByName` in `abi_session_options.cc` populates.

### C++ API

For low-level integration, include the C++ wrapper header and call the override through the `OrtApi` function table, passing the `Ort::SessionOptions` object:

```cpp
#include "onnxruntime_cxx_api.h"
#include <iostream>
#include <vector>

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "test"};
  Ort::SessionOptions opts;

  // Bind symbolic "seq_len" to concrete size 8 through the C API table
  Ort::ThrowOnError(
      Ort::GetApi().AddFreeDimensionOverrideByName(opts, "seq_len", 8));

  // Load model with shape [-1, seq_len, 128]
  // (narrow-string path; Windows builds expect a wide string)
  Ort::Session session{env, "model_dynamic.onnx", opts};

  // Prepare concrete tensor (batch=1, seq_len=8, features=128)
  std::vector<int64_t> dims = {1, 8, 128};
  std::vector<float> data(1 * 8 * 128, 1.0f);

  Ort::MemoryInfo mem_info = Ort::MemoryInfo::CreateCpu(
      OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
      mem_info, data.data(), data.size(), dims.data(), dims.size());

  // "input" and "output" are placeholders; use the names from your model
  const char* input_names[] = {"input"};
  const char* output_names[] = {"output"};
  auto outputs = session.Run(Ort::RunOptions{nullptr},
      input_names, &input_tensor, 1, output_names, 1);

  // Print the concrete output shape
  for (int64_t d : outputs[0].GetTensorTypeAndShapeInfo().GetShape()) {
    std::cout << d << ' ';
  }
  std::cout << '\n';
  return 0;
}
```

### Inspecting Symbolic Dimensions

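Retrieve symbolic dimension names at runtime to verify model capabilities. In Python, each input's `shape` lists fixed sizes as integers and named dimensions as strings:

```python
import onnxruntime as ort

# A session created without overrides still reports the symbolic names
sess = ort.InferenceSession("model_with_dynamic.onnx")
inp = sess.get_inputs()[0]
print("Input shape:", inp.shape)  # e.g. ['batch', 'seq_len', 256]
```

The C API counterpart, `GetSymbolicDimensions`, is implemented in `tensor_type_and_shape.cc` (lines 106-119), which populates symbolic names from the `dim_params` stored in the ONNX type information.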

## Summary

- **Symbolic representation**: ONNX Runtime stores dynamic dimensions using `-1` placeholders and optional names in `TensorShape`, parsed from model metadata in `tensor_type_and_shape.cc`.
- **User overrides**: The `AddFreeDimensionOverrideByName` API in `abi_session_options.cc` allows binding concrete values to symbolic names before session creation, stored in `free_dimension_overrides`.
- **Provider flexibility**: CPU and TensorRT providers handle truly dynamic shapes through lazy allocation and optimization profiles, while WebNN requires static overrides to prevent runtime failures.
- **API consistency**: The Python and C++ interfaces record overrides in the same session-option structure as the C API functions registered in `onnxruntime_c_api.cc`, ensuring uniform behavior across language bindings.

## Frequently Asked Questions

### What is the difference between `AddFreeDimensionOverride` and `AddFreeDimensionOverrideByName`?

**`AddFreeDimensionOverride`** targets dimension **denotations**: semantic tags such as `DATA_BATCH` that a model attaches to a dimension to describe its role. **`AddFreeDimensionOverrideByName`** targets **symbolic names** (human-readable strings like `"batch"` or `"seq_len"` embedded by the model exporter). Use the latter when your ONNX model contains named dimensions; use the former when dimensions are tagged with a denotation instead of, or in addition to, a name. Both calls need a key to match on, so a dynamic dimension that has neither a name nor a denotation is not reachable through either one.

### Can I run inference without providing dimension overrides?

**Yes**, provided you use an execution provider that supports dynamic shapes, such as the CPU or TensorRT providers. These allocate memory lazily once the concrete input dimensions arrive at inference time. However, providers like WebNN **require** overrides because they compile static graphs and cannot handle runtime dimension variability. Without overrides on incompatible providers, ONNX Runtime raises an `ORT_INVALID_ARGUMENT` error during session creation.

### Which execution providers support fully dynamic shapes?

**The CPU execution provider offers full dynamic shape support**, handling arbitrary dimension changes between inference calls. **TensorRT and CUDA** support dynamic shapes, though TensorRT relies on optimization profiles to perform well when dimensions vary. **OpenVINO** materializes symbolic shapes into static dimensions before compilation, so it uses overrides when the model declares dynamic dimensions. **WebNN** does not support dynamic shapes and needs overrides (for example through `sessionOptions.freeDimensionOverrides`) to create a static execution plan.

### How does ONNX Runtime handle shape mismatches at runtime?

**ONNX Runtime validates input tensor shapes against the computed graph dimensions at each inference call.** If an input shape contradicts a previously established override (e.g., providing a batch size of 8 when `"batch"` was overridden to 4), the runtime raises `ORT_INVALID_ARGUMENT`. For providers supporting dynamic shapes, providing different concrete dimensions across calls triggers re-inference of shapes on-the-fly, though this may incur overhead as providers reallocate buffers or rebuild optimization profiles.
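The validation described here can be pictured as a per-axis comparison. The sketch below is a plain-Python illustration, not ORT's implementation: `validate_feed` is a hypothetical helper, and it raises `ValueError` where ORT would report `ORT_INVALID_ARGUMENT`.

```python
from typing import Sequence, Union

Dim = Union[int, str, None]  # int = fixed (possibly by an override), otherwise free


def validate_feed(expected: Sequence[Dim], actual: Sequence[int]) -> None:
    """Check a concrete input shape against the shape the session expects."""
    if len(expected) != len(actual):
        raise ValueError(f"rank mismatch: expected {len(expected)}, got {len(actual)}")
    # Only fixed axes are compared; free axes accept any size.
    bad = [(axis, want, got)
           for axis, (want, got) in enumerate(zip(expected, actual))
           if isinstance(want, int) and want != got]
    if bad:
        details = ", ".join(f"axis {a}: expected {w}, got {g}" for a, w, g in bad)
        raise ValueError(f"invalid dimensions ({details})")


# "batch" was overridden to 4, "seq_len" was left free.
expected = [4, "seq_len", 256]
validate_feed(expected, (4, 17, 256))  # accepted: the free axis may take any size
validate_feed(expected, (4, 99, 256))  # accepted again with a different seq_len
```

Passing `(8, 17, 256)` to this helper raises an error naming axis 0, which mirrors the batch-size example in the answer above.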