# How GraphTransformerMgr Applies Optimizations in Stages in ONNX Runtime

> Discover how GraphTransformerMgr optimizes ONNX Runtime graphs in three stages: registration, multi-pass execution, and state inspection. Learn about its advanced pipeline.

- Repository: [Microsoft/onnxruntime](https://github.com/microsoft/onnxruntime)
- Tags: internals
- Published: 2026-04-24

---

**The GraphTransformerMgr drives graph‑level optimizations in ONNX Runtime through a three‑stage pipeline: registration by transformer level, multi‑pass execution with fixed‑point convergence, and state inspection to detect graph modifications.**

This deep dive explores the `GraphTransformerManager` class (implemented in `graph_transformer_mgr.cc`, hence the shorthand GraphTransformerMgr) in the Microsoft ONNX Runtime repository. It is the central orchestrator that applies graph transformations during inference session initialization. Understanding its three stages (registration, iterative application, and modification tracking) helps developers debug optimization passes and implement custom transformers.

## Stage 1: Transformer Registration by Optimization Level

Before any optimizations run, transformers must be registered with the manager. The `GraphTransformerManager::Register` method in `onnxruntime/core/optimizer/graph_transformer_mgr.cc` associates each transformer with a specific `TransformerLevel` (Level 1 through Level 4).

The registration logic stores transformers in two structures:

- **`level_to_transformer_map_`**: A map organizing transformers by their assigned optimization level.
- **`transformers_info_`**: A name‑based lookup map for quick transformer retrieval.

```cpp
// onnxruntime/core/optimizer/graph_transformer_mgr.cc (lines 63-74)
common::Status GraphTransformerManager::Register(std::unique_ptr<GraphTransformer> transformer,
                                                 TransformerLevel level) {
  // ... validation logic ...
  level_to_transformer_map_[level].push_back(std::move(transformer));
  // ... name registration in transformers_info_ ...
  return Status::OK();
}

```
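To make the two-container bookkeeping concrete, here is a minimal standalone sketch. It does not use ONNX Runtime headers: `ToyTransformer`, `ToyManager`, `Level`, `CountAt`, and `Find` are hypothetical stand-ins, and rejecting duplicate names is only an assumption about what the elided validation logic might check.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not the ONNX Runtime types.
enum class Level { Level1 = 1, Level2, Level3, Level4 };

struct ToyTransformer {
  explicit ToyTransformer(std::string n) : name(std::move(n)) {}
  std::string name;
};

class ToyManager {
 public:
  // Returns false when the name is already taken (assumed validation rule).
  bool Register(std::unique_ptr<ToyTransformer> t, Level level) {
    const std::string name = t->name;
    if (by_name_.count(name) != 0) return false;
    by_name_[name] = t.get();                  // non-owning name lookup
    by_level_[level].push_back(std::move(t));  // owning, in registration order
    return true;
  }

  // Number of transformers registered at a level (0 if the level is empty).
  std::size_t CountAt(Level level) const {
    auto it = by_level_.find(level);
    return it == by_level_.end() ? 0 : it->second.size();
  }

  // Name-based lookup; nullptr when no such transformer exists.
  const ToyTransformer* Find(const std::string& name) const {
    auto it = by_name_.find(name);
    return it == by_name_.end() ? nullptr : it->second;
  }

 private:
  std::map<Level, std::vector<std::unique_ptr<ToyTransformer>>> by_level_;
  std::unordered_map<std::string, ToyTransformer*> by_name_;
};
```

The name map holds non-owning pointers. They stay valid when a per-level vector reallocates, because each `unique_ptr` owns a heap object whose address does not change when the pointer itself is moved.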

The inference session registers the transformers for every level before optimization begins, including those that apply only to particular execution providers (EPs). For example, provider‑independent rewrites such as constant folding target Level 1, CPU‑specific fusions register at Level 2, and layout optimizations sit at Level 3.

## Stage 2: Multi-Pass Execution and Fixed-Point Convergence

When `InferenceSession::Initialize` prepares the model, it invokes `GraphTransformerManager::ApplyTransformers` sequentially for each level. This method implements the core execution loop that applies optimizations in stages.

### The Execution Loop

The method signature in `graph_transformer_mgr.cc` (lines 25‑53) shows the interface:

```cpp
common::Status GraphTransformerManager::ApplyTransformers(Graph& graph,
                                                         TransformerLevel level,
                                                         const logging::Logger& logger) const;
```

The execution follows this controlled flow:

1. **Reset State**: Clears the internal `_is_graph_modified` flag before processing begins.

2. **Level Lookup**: Retrieves the vector of transformers for the requested `level`. If no transformers exist for that level, returns immediately.

3. **Multi‑Pass Iteration**: Performs up to `steps_` passes over the transformer list (configurable via constructor or `SetSteps`, typically set to 5). Within each pass, the manager:
   - **Checks Cancellation**: Polls `IsLoadCancellationFlagSet` at the start of the pass, so a user‑initiated abort takes effect between passes and never in the middle of a transformer.
   - **Respects Single‑Run Constraints**: Skips transformers that declare `ShouldOnlyApplyOnce()` on every pass after the first (lines 39‑41 in the source).
   - **Invokes Transformers**: Calls the `Apply` method of each remaining transformer.
   - **Tracks Modifications**: If a transformer sets `modified = true`, the manager marks the pass‑local `graph_changed` and sets the member flag `_is_graph_modified`.

4. **Fixed‑Point Termination**: After each pass, if **no transformer modified the graph** (`if (!graph_changed) break;`), the loop exits early. This fixed‑point behavior prevents unnecessary iterations once the graph stabilizes.

```cpp
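// Conceptual representation of the execution loop
for (int step = 0; step < steps_; ++step) {
  // Checked once per pass: an abort here stops every remaining pass.
  // (A break inside the inner loop would only skip the rest of one pass.)
  if (IsLoadCancellationFlagSet()) break;

  bool graph_changed = false;

  for (const auto& transformer : level_transformers) {
    // Run-once transformers take part in the first pass only.
    if (step > 0 && transformer->ShouldOnlyApplyOnce()) continue;

    bool modified = false;
    ORT_RETURN_IF_ERROR(transformer->Apply(graph, modified, logger));

    if (modified) {
      graph_changed = true;       // this pass changed something
      _is_graph_modified = true;  // sticky for the whole call
    }
  }

  if (!graph_changed) break;  // Fixed-point reached
}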

```
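The loop can be traced by hand with a toy model. The sketch below is standalone and hypothetical: the "graph" is a single integer, and `ToyPass`, `RunStats`, `RunToFixedPoint`, and `Demo` are illustrative names rather than ONNX Runtime APIs. It only aims to show how the step bound, the run-once skip, and the fixed-point exit interact.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a transformer; the "graph" is just an int.
struct ToyPass {
  std::string name;
  bool only_once;                   // mirrors the ShouldOnlyApplyOnce() idea
  std::function<bool(int&)> apply;  // returns true if it changed the graph
};

struct RunStats {
  int passes = 0;         // passes actually started
  int apply_calls = 0;    // total transformer invocations
  bool modified = false;  // true if any invocation changed the graph
};

RunStats RunToFixedPoint(int& graph, const std::vector<ToyPass>& passes,
                         unsigned max_steps) {
  RunStats stats;
  for (unsigned step = 0; step < max_steps; ++step) {
    ++stats.passes;
    bool graph_changed = false;
    for (const ToyPass& p : passes) {
      if (step > 0 && p.only_once) continue;  // run-once: first pass only
      ++stats.apply_calls;
      if (p.apply(graph)) {
        graph_changed = true;
        stats.modified = true;
      }
    }
    if (!graph_changed) break;  // fixed point: nothing left to rewrite
  }
  return stats;
}

// "Halve" rewrites an even, positive value; "Tag" runs once and changes nothing.
RunStats Demo(int start, unsigned max_steps) {
  int graph = start;
  std::vector<ToyPass> passes = {
      {"Halve", false,
       [](int& g) {
         if (g > 0 && g % 2 == 0) {
           g /= 2;
           return true;
         }
         return false;
       }},
      {"Tag", true, [](int&) { return false; }},
  };
  return RunToFixedPoint(graph, passes, max_steps);
}
```

With `Demo(8, 5)` the value goes 8, 4, 2, 1 and the fourth pass changes nothing, so the loop stops after 4 passes and 5 `apply` calls instead of using all 5 steps. With `Demo(8, 2)` the step bound wins instead, and the loop stops after 2 passes even though the value could still be halved.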

## Stage 3: Detecting Graph Modifications

After `ApplyTransformers` completes, higher‑level code must know whether the graph structure changed to determine if downstream steps (like kernel selection) need recomputation.

The manager exposes two methods in `onnxruntime/core/optimizer/graph_transformer_mgr.cc` (lines 55‑61):

- **`IsGraphModified()`**: Returns a `const bool&` reference to `_is_graph_modified`, indicating whether any transformer altered the graph during the most recent `ApplyTransformers` call.
- **`ClearGraphModified()`**: Resets the flag to `false`, typically called before applying a new level of transformers.

This state inspection allows the `InferenceSession` to optimize its initialization workflow, skipping expensive kernel re‑allocation when the graph remains unchanged.
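The flag's lifecycle can be sketched in isolation. The class below is a hypothetical model, not ONNX Runtime code: `ToyModifiedTracker` and its `Apply` input (one boolean per transformer result) are invented for illustration. Marking the member `mutable` is an assumption that follows from `ApplyTransformers` being declared `const` while still updating the flag.

```cpp
#include <vector>

// Hypothetical model of the modified-flag lifecycle.
class ToyModifiedTracker {
 public:
  // Const, like ApplyTransformers, so the flag has to be mutable here.
  void Apply(const std::vector<bool>& per_transformer_modified) const {
    is_graph_modified_ = false;  // reset at the start of every call
    for (bool m : per_transformer_modified) {
      is_graph_modified_ = is_graph_modified_ || m;  // sticky once set
    }
  }

  // Returns a reference, so a caller that stores the reference (rather than
  // copying into a bool) will observe later resets and updates.
  const bool& IsGraphModified() const { return is_graph_modified_; }

  void ClearGraphModified() { is_graph_modified_ = false; }

 private:
  mutable bool is_graph_modified_ = false;
};
```

Because the accessor returns a reference, callers that need a snapshot should copy it into a plain `bool` before triggering another optimization phase.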

## Practical Implementation Examples

### Registering and Applying a Custom Transformer

This example demonstrates creating a manager, registering a custom fusion transformer for Level 2, and applying it:

```cpp
#include <memory>

#include "onnxruntime/core/optimizer/graph_transformer_mgr.h"
#include "onnxruntime/core/optimizer/graph_transformer.h"

class MyFusion : public onnxruntime::GraphTransformer {
 public:
  MyFusion() : GraphTransformer("MyFusion") {}

 private:
  // GraphTransformer::Apply is non-virtual and forwards to ApplyImpl,
  // so custom transformers override ApplyImpl rather than Apply.
  onnxruntime::common::Status ApplyImpl(onnxruntime::Graph& graph,
                                        bool& modified,
                                        int graph_level,
                                        const onnxruntime::logging::Logger& logger) const override {
    // ... fusion logic ...
    // Set modified = true only when a rewrite actually happened. Setting it
    // unconditionally would prevent the manager from reaching a fixed point.
    return onnxruntime::common::Status::OK();
  }
};

// Allow up to 5 passes per level
onnxruntime::GraphTransformerManager mgr(/*steps=*/5);

// Register for Level 2 optimizations; Register returns a Status worth checking
auto status = mgr.Register(std::make_unique<MyFusion>(),
                           onnxruntime::TransformerLevel::Level2);

// Apply to an existing graph; Graph is bound by reference because it is not copyable
onnxruntime::Graph& graph = /* ... */;
const onnxruntime::logging::Logger& logger = /* ... */;
status = mgr.ApplyTransformers(graph, onnxruntime::TransformerLevel::Level2, logger);

```

### Session Integration Pattern

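The following simplified pattern, modeled on how `InferenceSession` drives the manager, shows the optimization levels being applied sequentially. Treat `ort::cpu::RegisterCpuGraphTransformers` as an illustrative placeholder for the registration step rather than a verbatim ONNX Runtime function: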

```cpp
GraphTransformerManager graph_mgr(/*steps=*/5);

// Register EP-specific transformers (e.g., CPU EP)
ort::cpu::RegisterCpuGraphTransformers(graph_mgr);

// Apply each level in order
for (auto level : {TransformerLevel::Level1,
                   TransformerLevel::Level2,
                   TransformerLevel::Level3,
                   TransformerLevel::Level4}) {
  ORT_RETURN_IF_ERROR(graph_mgr.ApplyTransformers(graph_, level, logger_));
  
  // Check if graph changed before proceeding
  if (graph_mgr.IsGraphModified()) {
    // ... trigger kernel re-selection ...
  }
  graph_mgr.ClearGraphModified();
}

```

## Summary

- **Registration Stage**: Transformers are indexed by `TransformerLevel` in `level_to_transformer_map_` via `GraphTransformerManager::Register`, enabling level‑specific optimization strategies.
- **Execution Stage**: `ApplyTransformers` runs up to `steps_` passes, checking `IsLoadCancellationFlagSet` for aborts, respecting `ShouldOnlyApplyOnce` constraints, and breaking early when no modifications occur (fixed‑point convergence).
- **State Inspection Stage**: The `_is_graph_modified` flag, accessed via `IsGraphModified()`, reports whether the graph changed, allowing `InferenceSession` to conditionally recompute kernel assignments.
- **Bounded Cost**: The combination of configurable `steps_` and fixed‑point termination ensures optimization costs remain predictable even with aggressive transformer chains.

## Frequently Asked Questions

### How does GraphTransformerMgr handle user cancellation during optimization?

The manager checks `IsLoadCancellationFlagSet()` at the start of each pass through the transformer list. If the flag is set, the optimization loop breaks immediately, returning control to the caller without completing remaining passes. This prevents wasted computation when a user aborts model loading.
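A cooperative check of this kind can be sketched with a shared atomic flag. The code below is a hypothetical, standalone illustration: `LoopResult`, `RunWithCancellation`, and the use of `std::atomic<bool>` are assumptions for the sketch and do not describe how ONNX Runtime stores or reports its cancellation state.

```cpp
#include <atomic>
#include <functional>
#include <vector>

enum class LoopResult { kConverged, kStepLimit, kCanceled };

// Polls the flag once per pass. A pass that has started always finishes,
// so every transformer call runs to completion before an abort is honored.
LoopResult RunWithCancellation(const std::atomic<bool>& cancel,
                               const std::vector<std::function<bool()>>& passes,
                               unsigned max_steps, int& apply_calls) {
  for (unsigned step = 0; step < max_steps; ++step) {
    if (cancel.load()) return LoopResult::kCanceled;
    bool changed = false;
    for (const auto& p : passes) {
      ++apply_calls;
      changed = p() || changed;  // p() is always evaluated
    }
    if (!changed) return LoopResult::kConverged;
  }
  return LoopResult::kStepLimit;
}
```

Returning a distinct result for cancellation lets the caller tell an aborted load apart from a graph that converged or one that hit the step limit.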

### What is the purpose of the `steps_` parameter in GraphTransformerManager?

The `steps_` parameter (set via constructor or `SetSteps`) defines the maximum number of passes the manager executes over the transformer list for a given level. While the default is typically 5, the fixed‑point detection (`if (!graph_changed) break;`) usually terminates earlier once no further optimizations apply, bounding the total work while allowing iterative transformations to stabilize.
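The bound can be written down directly. As a rough sketch (the helper name `MaxApplyCalls` is hypothetical), the worst case for one level is every transformer running on the first pass, followed by only the repeatable ones on each remaining pass:

```cpp
// Upper bound on transformer invocations for one level.
// Precondition: once_only <= total.
constexpr unsigned MaxApplyCalls(unsigned steps, unsigned total,
                                 unsigned once_only) {
  return steps == 0 ? 0u : total + (steps - 1) * (total - once_only);
}

// 5 steps, 4 transformers, 1 of them run-once: 4 + 4 * 3 = 16 calls at most.
static_assert(MaxApplyCalls(5, 4, 1) == 16, "worst case for the example");
```

Fixed-point termination can only lower this number, which is why the step limit acts as a ceiling on cost rather than a fixed price.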

### Why do some transformers only apply once per level?

Transformers that return `true` from `ShouldOnlyApplyOnce()` are skipped on subsequent passes within the same `ApplyTransformers` call. This optimization prevents redundant processing for transformations that deterministically modify the graph in a single pass, improving performance without affecting correctness.

### How does the manager communicate that optimizations changed the graph?

After each transformer invocation, the manager checks the `modified` output parameter. If any transformer modifies the graph, the internal `_is_graph_modified` flag is set to `true`. Callers query this state via `IsGraphModified()` after `ApplyTransformers` returns to determine if downstream initialization steps (such as kernel allocation) must be re-executed.