
How GraphTransformerMgr Applies Optimizations in Stages in ONNX Runtime

April 24, 2026 microsoft/onnxruntime ↗

The GraphTransformerManager class (implemented in graph_transformer_mgr.cc) drives graph‑level optimizations in ONNX Runtime through a three‑stage pipeline: registration by transformer level, multi‑pass execution that stops at a fixed point, and state inspection to detect graph modifications.

This deep dive explores the GraphTransformerManager class in the Microsoft ONNX Runtime repository, the central orchestrator that applies graph transformations during inference session initialization. Understanding how this component stages optimizations (registration, iterative application, and modification tracking) helps developers debug optimization passes and implement custom transformers.

Stage 1: Transformer Registration by Optimization Level

Before any optimizations run, transformers must be registered with the manager. The GraphTransformerManager::Register method in onnxruntime/core/optimizer/graph_transformer_mgr.cc associates each transformer with a specific TransformerLevel (Level 1 through Level 4).

The registration logic stores transformers in two structures:

level_to_transformer_map_: A map organizing transformers by their assigned optimization level.
transformers_info_: A name‑based lookup map for quick transformer retrieval.

// onnxruntime/core/optimizer/graph_transformer_mgr.cc (lines 63-74)
common::Status GraphTransformerManager::Register(std::unique_ptr<GraphTransformer> transformer,
                                                 TransformerLevel level) {
  // ... validation logic ...
  level_to_transformer_map_[level].push_back(std::move(transformer));
  // ... name registration in transformers_info_ ...
  return Status::OK();
}
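To make the two-structure bookkeeping concrete, here is a minimal, self-contained sketch that does not use the ONNX Runtime headers. The names MiniManager, Transformer, Level, CountAt, and Find are hypothetical stand-ins, and the duplicate-name check is an assumption about what the elided validation logic might do, not a statement of what the source does.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins; these are NOT the ONNX Runtime types.
enum class Level { Level1 = 1, Level2, Level3, Level4 };

struct Transformer {
  explicit Transformer(std::string n) : name(std::move(n)) {}
  std::string name;
};

class MiniManager {
 public:
  // Returns false (instead of an error Status) when the name is already
  // registered. The duplicate check is an assumed validation rule.
  bool Register(std::unique_ptr<Transformer> t, Level level) {
    const std::string& name = t->name;
    if (by_name_.count(name) != 0) return false;
    by_name_[name] = t.get();                  // name-based lookup (non-owning)
    by_level_[level].push_back(std::move(t));  // ownership, grouped by level
    return true;
  }

  // Number of transformers registered at a level (0 if the level is empty).
  std::size_t CountAt(Level level) const {
    auto it = by_level_.find(level);
    return it == by_level_.end() ? 0 : it->second.size();
  }

  // Looks a transformer up by name; nullptr if it was never registered.
  const Transformer* Find(const std::string& name) const {
    auto it = by_name_.find(name);
    return it == by_name_.end() ? nullptr : it->second;
  }

 private:
  std::map<Level, std::vector<std::unique_ptr<Transformer>>> by_level_;
  std::unordered_map<std::string, Transformer*> by_name_;
};
```

Keeping ownership in the per-level vectors and only raw pointers in the name map means each transformer has exactly one owner, while lookups by name stay cheap.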

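The inference session registers its transformers, grouped by level, before optimization begins. In ONNX Runtime's documented scheme, Level 1 holds basic, semantics‑preserving rewrites such as constant folding, Level 2 holds extended fusions that apply to nodes assigned to specific execution providers (EPs) such as CPU or CUDA, and Level 3 holds layout optimizations.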

Stage 2: Multi-Pass Execution and Fixed-Point Convergence

When InferenceSession::Initialize prepares the model, it invokes GraphTransformerManager::ApplyTransformers sequentially for each level. This method implements the core execution loop that applies optimizations in stages.

The Execution Loop

The method signature in graph_transformer_mgr.cc (lines 25‑53) shows the interface:

common::Status GraphTransformerManager::ApplyTransformers(Graph& graph,
                                                         TransformerLevel level,
                                                         const logging::Logger& logger) const

The execution follows this controlled flow:

Reset State: Clears the internal _is_graph_modified flag before processing begins.
Level Lookup: Retrieves the vector of transformers for the requested level. If no transformers exist for that level, returns immediately.
Multi‑Pass Iteration: Performs up to steps_ passes over the transformer list (configurable via constructor or SetSteps, typically set to 5).

Within each pass, the manager:

Checks Cancellation: Polls IsLoadCancellationFlagSet to allow user‑initiated aborts without corrupting the graph.
Invokes Transformers: Calls each transformer's Apply method.
Tracks Modifications: If a transformer sets modified = true, the manager marks graph_changed for the current pass and sets its _is_graph_modified member.
Respects Single‑Run Constraints: Skips transformers that declare ShouldOnlyApplyOnce() on subsequent passes (lines 39‑41 in the source).

Fixed‑Point Termination: After each pass, if no transformer modified the graph (if (!graph_changed) break;), the loop exits early. This fixed‑point behavior prevents unnecessary iterations once the graph stabilizes.

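// Conceptual representation of the execution loop (simplified, not verbatim source)
for (int step = 0; step < steps_; ++step) {
  // Polled once per pass so a user-initiated abort skips the remaining passes
  if (IsLoadCancellationFlagSet()) break;

  bool graph_changed = false;

  for (const auto& transformer : level_transformers) {
    // Run-once transformers only participate in the first pass
    if (step > 0 && transformer->ShouldOnlyApplyOnce()) continue;

    bool modified = false;
    ORT_RETURN_IF_ERROR(transformer->Apply(graph, modified, logger));

    if (modified) {
      graph_changed = true;
      _is_graph_modified = true;
    }
  }

  if (!graph_changed) break;  // Fixed-point reached
}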
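The early exit is easiest to see with a toy model that can be compiled on its own. The sketch below is not ONNX Runtime code: ToyGraph, ToyTransformer, RunStats, ApplyToFixedPoint, MakePruner, and MakeOneShot are hypothetical names, and the "graph" is just a counter of redundant nodes. It only mirrors the loop shape described above, including the run-once skip.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-ins; none of these are ONNX Runtime types.
struct ToyGraph {
  int redundant_nodes = 0;
};

struct ToyTransformer {
  std::string name;
  bool only_once;                         // mirrors ShouldOnlyApplyOnce()
  std::function<bool(ToyGraph&)> apply;   // returns true if the graph changed
};

struct RunStats {
  int passes = 0;
  int invocations = 0;
};

// Runs up to max_steps passes and stops as soon as a pass changes nothing.
RunStats ApplyToFixedPoint(ToyGraph& g, const std::vector<ToyTransformer>& ts,
                           int max_steps) {
  RunStats stats;
  for (int step = 0; step < max_steps; ++step) {
    bool graph_changed = false;
    ++stats.passes;
    for (const auto& t : ts) {
      if (step > 0 && t.only_once) continue;  // skipped after the first pass
      ++stats.invocations;
      if (t.apply(g)) graph_changed = true;
    }
    if (!graph_changed) break;  // fixed point reached
  }
  return stats;
}

// Hypothetical transformer: removes one redundant node per invocation.
inline ToyTransformer MakePruner() {
  return {"Pruner", false, [](ToyGraph& g) {
            if (g.redundant_nodes == 0) return false;
            --g.redundant_nodes;
            return true;
          }};
}

// Hypothetical run-once transformer that never changes the graph.
inline ToyTransformer MakeOneShot() {
  return {"OneShot", true, [](ToyGraph&) { return false; }};
}
```

With two redundant nodes and a limit of five passes, the pruner changes the graph in the first two passes, the third pass changes nothing, and the loop stops there rather than using the remaining budget. With a limit of two passes and five redundant nodes, the loop ends at the step limit with work left over, which is the bounded-cost side of the trade-off.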

Stage 3: Detecting Graph Modifications

After ApplyTransformers completes, higher‑level code must know whether the graph structure changed to determine if downstream steps (like kernel selection) need recomputation.

The manager exposes two methods in onnxruntime/core/optimizer/graph_transformer_mgr.cc (lines 55‑61):

IsGraphModified(): Returns a const bool& reference to _is_graph_modified, indicating whether any transformer altered the graph during the most recent optimization phase.
ClearGraphModified(): Resets the flag to false, typically called before applying a new level of transformers.

This state inspection allows the InferenceSession to optimize its initialization workflow, skipping expensive kernel re‑allocation when the graph remains unchanged.
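The query-and-clear pattern can be sketched in isolation. The class below is a hypothetical stand-in (FlagTrackingManager, Pass, and Add are not ONNX Runtime names, and the "graph" is a plain int); it only illustrates how a flag that is reset at the start of each run and exposed through IsGraphModified() and ClearGraphModified() behaves from the caller's side.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a manager that tracks whether any pass changed
// the graph. The "graph" here is a single int for simplicity.
class FlagTrackingManager {
 public:
  using Pass = std::function<bool(int&)>;  // returns true when it changed the graph

  void Add(Pass p) { passes_.push_back(std::move(p)); }

  // Resets the flag, then runs every pass once and records any modification.
  void Apply(int& graph) {
    is_graph_modified_ = false;
    for (const auto& p : passes_) {
      if (p(graph)) is_graph_modified_ = true;
    }
  }

  const bool& IsGraphModified() const { return is_graph_modified_; }
  void ClearGraphModified() { is_graph_modified_ = false; }

 private:
  std::vector<Pass> passes_;
  bool is_graph_modified_ = false;
};
```

A caller would run Apply, branch on IsGraphModified() to decide whether follow-up work is needed, and then call ClearGraphModified() before moving on to the next level.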

Practical Implementation Examples

Registering and Applying a Custom Transformer

This example demonstrates creating a manager, registering a custom fusion transformer for Level 2, and applying it:

#include <memory>

#include "core/common/common.h"
#include "core/optimizer/graph_transformer.h"
#include "core/optimizer/graph_transformer_mgr.h"

class MyFusion : public onnxruntime::GraphTransformer {
 public:
  MyFusion() : GraphTransformer("MyFusion") {}

 private:
  // GraphTransformer::Apply is the public entry point; subclasses implement ApplyImpl.
  onnxruntime::common::Status ApplyImpl(onnxruntime::Graph& graph,
                                        bool& modified,
                                        int graph_level,
                                        const onnxruntime::logging::Logger& logger) const override {
    // ... fusion logic ...
    // Set modified = true only when the graph was actually altered; setting it
    // unconditionally would keep the manager looping until steps_ is exhausted.
    return onnxruntime::common::Status::OK();
  }
};

// Hypothetical helper: the caller supplies the graph and logger
onnxruntime::common::Status OptimizeWithMyFusion(onnxruntime::Graph& graph,
                                                 const onnxruntime::logging::Logger& logger) {
  // Initialize manager with an upper bound of 5 passes
  onnxruntime::GraphTransformerManager mgr(/*steps=*/5);

  // Register for Level 2 optimizations
  ORT_RETURN_IF_ERROR(mgr.Register(std::make_unique<MyFusion>(),
                                   onnxruntime::TransformerLevel::Level2));

  // Apply to graph
  return mgr.ApplyTransformers(graph, onnxruntime::TransformerLevel::Level2, logger);
}

Session Integration Pattern

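The following simplified sketch (not verbatim InferenceSession code) shows how a session can drive the manager through the optimization levels in order:

GraphTransformerManager graph_mgr(/*steps=*/5);

// Register transformers for each level. RegisterCpuGraphTransformers is an
// illustrative placeholder, not a confirmed ONNX Runtime function.
ort::cpu::RegisterCpuGraphTransformers(graph_mgr);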

// Apply each level in order
for (auto level : {TransformerLevel::Level1,
                   TransformerLevel::Level2,
                   TransformerLevel::Level3,
                   TransformerLevel::Level4}) {
  ORT_RETURN_IF_ERROR(graph_mgr.ApplyTransformers(graph_, level, logger_));
  
  // Check if graph changed before proceeding
  if (graph_mgr.IsGraphModified()) {
    // ... trigger kernel re-selection ...
  }
  graph_mgr.ClearGraphModified();
}

Summary

Registration Stage: Transformers are indexed by TransformerLevel in level_to_transformer_map_ via GraphTransformerManager::Register, enabling level‑specific optimization strategies.
Execution Stage: ApplyTransformers runs up to steps_ passes, checking IsLoadCancellationFlagSet for aborts, respecting ShouldOnlyApplyOnce constraints, and breaking early when no modifications occur (fixed‑point convergence).
State Inspection Stage: The _is_graph_modified flag, accessed via IsGraphModified(), reports whether the graph changed, allowing InferenceSession to conditionally recompute kernel assignments.
Bounded Cost: The combination of configurable steps_ and fixed‑point termination ensures optimization costs remain predictable even with aggressive transformer chains.

Frequently Asked Questions

How does GraphTransformerManager handle user cancellation during optimization?

The manager checks IsLoadCancellationFlagSet() at the start of each pass through the transformer list. If the flag is set, the optimization loop breaks immediately, returning control to the caller without completing remaining passes. This prevents wasted computation when a user aborts model loading.
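As a rough illustration of polling once per pass, the standalone sketch below takes the cancellation check and the pass body as callbacks. LoopResult, RunPasses, and both callback parameters are hypothetical names chosen for this example; how ONNX Runtime reports cancellation to its caller is not modelled here.

```cpp
#include <functional>

// Hypothetical outcome type for the sketch.
enum class LoopResult { Converged, StepLimit, Canceled };

// Polls is_canceled at the start of every pass. run_pass returns true when
// the pass changed something, false when a fixed point has been reached.
LoopResult RunPasses(int max_steps,
                     const std::function<bool()>& is_canceled,
                     const std::function<bool(int)>& run_pass,
                     int& passes_run) {
  passes_run = 0;
  for (int step = 0; step < max_steps; ++step) {
    if (is_canceled()) return LoopResult::Canceled;  // stop before starting a pass
    ++passes_run;
    if (!run_pass(step)) return LoopResult::Converged;
  }
  return LoopResult::StepLimit;
}
```

Because the flag is only read between passes, a pass that has already started always runs to completion, so the graph is never left half-transformed by an abort.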

What is the purpose of the steps_ parameter in GraphTransformerManager?

The steps_ parameter (set via constructor or SetSteps) defines the maximum number of passes the manager executes over the transformer list for a given level. While the default is typically 5, the fixed‑point detection (if (!graph_changed) break;) usually terminates earlier once no further optimizations apply, bounding the total work while allowing iterative transformations to stabilize.

Why do some transformers only apply once per level?

Transformers that return true from ShouldOnlyApplyOnce() are skipped on subsequent passes within the same ApplyTransformers call. This optimization prevents redundant processing for transformations that deterministically modify the graph in a single pass, improving performance without affecting correctness.

How does the manager communicate that optimizations changed the graph?

After each transformer invocation, the manager checks the modified output parameter. If any transformer modifies the graph, the internal _is_graph_modified flag is set to true. Callers query this state via IsGraphModified() after ApplyTransformers returns to determine if downstream initialization steps (such as kernel allocation) must be re-executed.
