ONNX Runtime Kernel Registry Architecture: Provider-Aware Operator Resolution

ONNX Runtime implements a three-tier kernel registry system centered on provider-scoped KernelRegistry instances, a central KernelRegistryManager for orchestration, and KernelDefBuilder metadata objects that map graph nodes to concrete operator implementations based on version compatibility and type constraints.

The microsoft/onnxruntime repository isolates operator kernel creation behind an abstraction layer that binds implementations to specific Execution Providers (EPs). This kernel registry architecture determines which concrete OpKernel executes each node by matching the node's operator metadata against the kernels registered for CPU, CUDA, and custom hardware backends.

Core Components of the Kernel Registry System

The architecture consists of three primary classes that handle registration, storage, and resolution of operator kernels.

KernelRegistry

The KernelRegistry class stores a multimap of kernel definitions keyed by the concatenated string op_name + ' ' + domain + ' ' + provider via the internal GetMapKey method. Located in include/onnxruntime/core/framework/kernel_registry.h (lines 23‑70), this class provides the Register() method for providers to insert kernels and the TryFindKernel() family of methods to match nodes against version constraints and type requirements.
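The composite-key storage described above can be modeled with standard containers. The following is a minimal sketch, not ONNX Runtime code: ToyKernelRegistry, KernelEntry, MakeMapKey, and Candidates are hypothetical names invented for illustration, and only the key format (op_name + ' ' + domain + ' ' + provider) and the use of equal_range are taken from the description.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a registry entry. The real KernelCreateInfo also
// carries the kernel definition and a factory function.
struct KernelEntry {
  std::string label;  // human-readable tag used only in this sketch
  int since_version;  // first opset version this entry claims to support
};

// Builds the composite key in the format described in the text:
// op_name + ' ' + domain + ' ' + provider.
std::string MakeMapKey(const std::string& op_name,
                       const std::string& domain,
                       const std::string& provider) {
  return op_name + ' ' + domain + ' ' + provider;
}

class ToyKernelRegistry {
 public:
  // Several entries may share one key, which is why a multimap is used.
  void Register(const std::string& op_name, const std::string& domain,
                const std::string& provider, KernelEntry entry) {
    kernels_.emplace(MakeMapKey(op_name, domain, provider), std::move(entry));
  }

  // Returns every entry stored under the key, in registration order.
  std::vector<KernelEntry> Candidates(const std::string& op_name,
                                      const std::string& domain,
                                      const std::string& provider) const {
    std::vector<KernelEntry> out;
    auto range = kernels_.equal_range(MakeMapKey(op_name, domain, provider));
    for (auto it = range.first; it != range.second; ++it) {
      out.push_back(it->second);
    }
    return out;
  }

 private:
  std::multimap<std::string, KernelEntry> kernels_;
};
```

Because the provider is part of the key, two providers can register the same operator without colliding, while two versions of one operator for the same provider land under the same key and are told apart later by version and type checks.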

KernelRegistryManager

The KernelRegistryManager aggregates individual registries and controls lookup priority. Defined in core/framework/kernel_registry_manager.h (lines 25‑68), it maintains provider_type_to_registry_ for built-in EPs and custom_kernel_registries_ for runtime extensions. Custom registries receive priority over built-in implementations, enabling users to override default kernels.
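The priority rule can be illustrated with a small model. This is a sketch under simplifying assumptions rather than the actual manager: ToyRegistry, ToyRegistryManager, RegisterBuiltin, RegisterCustom, and Resolve are hypothetical names, and the toy registries match on op type alone, whereas the real lookup also checks provider, version, and types. Only the two member names and the custom-before-built-in ordering come from the text.

```cpp
#include <list>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Toy registry: records only which op types it can serve.
struct ToyRegistry {
  std::string name;                // label used only in this sketch
  std::set<std::string> op_types;  // ops this registry has a kernel for
};

class ToyRegistryManager {
 public:
  // One built-in registry per provider type.
  void RegisterBuiltin(const std::string& provider,
                       std::shared_ptr<ToyRegistry> registry) {
    provider_type_to_registry_[provider] = std::move(registry);
  }

  // The newest custom registry goes to the front so it is consulted first.
  void RegisterCustom(std::shared_ptr<ToyRegistry> registry) {
    custom_kernel_registries_.push_front(std::move(registry));
  }

  // Returns the name of the first registry that can serve op_type,
  // or an empty string when nothing matches.
  std::string Resolve(const std::string& provider,
                      const std::string& op_type) const {
    // Custom registries are searched before the built-in map.
    for (const auto& registry : custom_kernel_registries_) {
      if (registry->op_types.count(op_type) > 0) return registry->name;
    }
    auto it = provider_type_to_registry_.find(provider);
    if (it != provider_type_to_registry_.end() &&
        it->second->op_types.count(op_type) > 0) {
      return it->second->name;
    }
    return "";
  }

 private:
  std::map<std::string, std::shared_ptr<ToyRegistry>> provider_type_to_registry_;
  std::list<std::shared_ptr<ToyRegistry>> custom_kernel_registries_;
};
```

In this model, a custom registry that offers "Relu" shadows the built-in CPU "Relu", while an operator that no custom registry provides still falls through to the built-in registry for the node's provider.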

KernelDefBuilder and KernelCreateInfo

KernelDefBuilder and KernelCreateInfo provide the metadata DSL and factory storage mechanism. Execution providers use KernelDefBuilder to declare operator support (name, domain, version range, type constraints) and supply factory functions that instantiate concrete OpKernel objects. These definitions populate the registry via KernelRegistry::Register.
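To make the builder-plus-factory pairing concrete, here is a self-contained sketch of a fluent builder in the same spirit. All Toy* types and MakeReluInfo are hypothetical and written for this article; the method names mirror the ones used in the registration example later on, but the signatures and defaults here are assumptions, not the library's.

```cpp
#include <functional>
#include <limits>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Plain-data description of what a kernel supports.
struct ToyKernelDef {
  std::string op_name, domain, provider;
  int since_start = 1;
  int since_end = std::numeric_limits<int>::max();  // open-ended by default
  std::map<std::string, std::vector<std::string>> type_constraints;
};

// Fluent builder: every setter returns *this so calls can be chained.
class ToyKernelDefBuilder {
 public:
  ToyKernelDefBuilder& SetName(std::string v) { def_.op_name = std::move(v); return *this; }
  ToyKernelDefBuilder& SetDomain(std::string v) { def_.domain = std::move(v); return *this; }
  ToyKernelDefBuilder& Provider(std::string v) { def_.provider = std::move(v); return *this; }
  ToyKernelDefBuilder& SinceVersion(int start, int end) {
    def_.since_start = start;
    def_.since_end = end;
    return *this;
  }
  ToyKernelDefBuilder& TypeConstraint(std::string arg, std::vector<std::string> types) {
    def_.type_constraints[std::move(arg)] = std::move(types);
    return *this;
  }
  ToyKernelDef Build() const { return def_; }

 private:
  ToyKernelDef def_;
};

struct ToyKernel {
  virtual ~ToyKernel() = default;
  virtual std::string Name() const = 0;
};

struct ToyRelu : ToyKernel {
  std::string Name() const override { return "ToyRelu"; }
};

// Pairs the metadata with a factory. The kernel object itself is not
// created until the factory is invoked.
struct ToyKernelCreateInfo {
  ToyKernelDef def;
  std::function<std::unique_ptr<ToyKernel>()> factory;
};

ToyKernelCreateInfo MakeReluInfo() {
  ToyKernelDef def = ToyKernelDefBuilder()
                         .SetName("Relu")
                         .SetDomain("")
                         .SinceVersion(6, 12)
                         .Provider("MyToyProvider")
                         .TypeConstraint("T", {"float", "double"})
                         .Build();
  return {def, []() -> std::unique_ptr<ToyKernel> { return std::make_unique<ToyRelu>(); }};
}
```

Keeping the definition separate from the factory lets a registry compare metadata for many candidates cheaply and construct only the kernel that wins.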

Kernel Registration Flow

The system follows a strict three-phase registration process when initializing an inference session:

  1. Kernel Definition Construction: Each EP uses KernelDefBuilder to specify supported operators. The builder captures version ranges, type constraints, and the target provider identifier.

  2. Registry Population: The EP calls KernelRegistry::Register (implemented in core/framework/kernel_registry.cc, lines 8‑30), which stores a KernelCreateInfo struct containing the factory function in the registry's multimap using the composite key format.

  3. Manager Aggregation: KernelRegistryManager::RegisterKernels iterates over all session execution providers, invoking each provider's registration method to populate the built-in registry map. Custom registries can be injected later via RegisterKernelRegistry, which prepends them to custom_kernel_registries_ (lines 59‑63 of kernel_registry_manager.h) for higher priority.

Runtime Kernel Lookup Mechanism

When materializing an execution plan, the runtime resolves each node to a concrete kernel through provider-aware matching:

  1. Registry Search Initiation: The session calls KernelRegistryManager::SearchKernelRegistry (lines 68‑71 of kernel_registry_manager.h), which retrieves candidate registries via GetKernelRegistriesByProviderType using the node's assigned provider.

  2. Candidate Filtering: For each KernelRegistry, the system invokes TryFindKernel, which executes the internal TryFindKernelImpl method. This implementation queries the multimap using the node's op‑type, domain, and expected provider, then iterates through matches obtained via kernel_creator_fn_map_.equal_range.

  3. Compatibility Verification: For each candidate, KernelRegistry::VerifyKernelDef validates version compatibility via VerifyVersion and type constraint matching via MatchKernelDefTypes. The first successful match returns a KernelCreateInfo pointer, which the manager uses to instantiate the kernel via the stored factory function.
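The filtering and verification steps above can be sketched as a loop over candidates that returns the first one passing both checks. This is an illustrative model only: ToyKernelDef, ToyNode, VersionMatches, TypesMatch, and FindFirstMatch are hypothetical, the version rule is assumed to be a plain inclusive range test (the real VerifyVersion may apply stricter rules), and a constrained argument that the node does not supply is assumed to pass.

```cpp
#include <limits>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyKernelDef {
  std::string label;  // tag used only in this sketch
  int since_start;
  int since_end;
  // Formal parameter name -> set of element types the kernel accepts.
  std::map<std::string, std::set<std::string>> type_constraints;
};

struct ToyNode {
  int since_version;                             // opset version of the node's op
  std::map<std::string, std::string> arg_types;  // parameter name -> actual type
};

// Assumed rule: the node's version must fall inside the inclusive range.
bool VersionMatches(const ToyKernelDef& def, const ToyNode& node) {
  return def.since_start <= node.since_version &&
         node.since_version <= def.since_end;
}

// Every constrained parameter the node supplies must use an allowed type.
bool TypesMatch(const ToyKernelDef& def, const ToyNode& node) {
  for (const auto& constraint : def.type_constraints) {
    auto it = node.arg_types.find(constraint.first);
    if (it == node.arg_types.end()) continue;  // parameter absent: skip
    if (constraint.second.count(it->second) == 0) return false;
  }
  return true;
}

// Returns the first candidate passing both checks, or nullptr.
const ToyKernelDef* FindFirstMatch(const std::vector<ToyKernelDef>& candidates,
                                   const ToyNode& node) {
  for (const auto& def : candidates) {
    if (VersionMatches(def, node) && TypesMatch(def, node)) return &def;
  }
  return nullptr;
}
```

Since the first passing candidate wins, the order in which kernels sharing a key were registered can decide the outcome whenever more than one definition is compatible with the node.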

Implementing Custom Kernel Registration

Developers can extend ONNX Runtime with custom operators using the public registry APIs.

Registering a Kernel in a Custom Execution Provider

Custom EPs implement the RegisterKernels method to declare supported operators:

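// MyCustomExecutionProvider.cpp
#include <memory>

#include "core/framework/kernel_def_builder.h"
#include "core/framework/kernel_registry.h"
#include "core/framework/op_kernel.h"

using namespace onnxruntime;

class MyCustomKernel : public OpKernel {
 public:
  explicit MyCustomKernel(const OpKernelInfo& info) : OpKernel(info) {}
  Status Compute(OpKernelContext* ctx) const override {
    // Custom implementation
    return Status::OK();
  }
};

// kMyCustomExecutionProvider is the provider-type string this EP defines.
void MyCustomExecutionProvider::RegisterKernels(KernelRegistry& registry) {
  KernelDefBuilder builder;
  builder.SetName("Relu")
         .SetDomain(kOnnxDomain)
         .SinceVersion(6)
         .Provider(kMyCustomExecutionProvider);

  // Register returns a Status, so surface failures instead of dropping them.
  // The factory must match KernelCreateFn in op_kernel.h; that signature has
  // changed between releases, so check it against the version you build with.
  ORT_THROW_IF_ERROR(registry.Register(builder,
      [](FuncManager&, const OpKernelInfo& info,
         std::unique_ptr<OpKernel>& out) -> Status {
        out = std::make_unique<MyCustomKernel>(info);
        return Status::OK();
      }));
}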

The call hands the builder and factory to KernelRegistry::Register in core/framework/kernel_registry.cc (lines 8‑30), which stores the factory in the registry.

Injecting a Custom Registry at Runtime

Users can override built-in kernels by registering custom registries with higher priority:

auto custom_registry = std::make_shared<onnxruntime::KernelRegistry>();
// Register custom kernels...

auto& kr_manager = session_state.GetKernelRegistryManager();
kr_manager.RegisterKernelRegistry(custom_registry);  // Highest priority

The manager stores this in custom_kernel_registries_ (lines 59‑63 of kernel_registry_manager.h), ensuring these kernels are checked before built-in implementations as noted in the comments around lines 36‑48.

Resolving Kernels During Session Execution

The runtime search process follows this pattern:

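const onnxruntime::Node& node = ...;
const onnxruntime::logging::Logger& logger = ...;
// Execution provider the node was assigned to.
const onnxruntime::IExecutionProvider* execution_provider = ...;

const auto& kr_manager = session_state.GetKernelRegistryManager();

// Step 1: find the KernelCreateInfo that matches this node.
const onnxruntime::KernelCreateInfo* info = nullptr;
auto status = kr_manager.SearchKernelRegistry(node, logger, &info);

// Step 2: instantiate the kernel through the stored factory.
if (status.IsOK() && info != nullptr) {
  std::unique_ptr<onnxruntime::OpKernel> kernel;
  status = kr_manager.CreateKernel(node, *execution_provider,
                                   session_state, *info, kernel);
}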

SearchKernelRegistry loops through registries (lines 68‑71 of kernel_registry_manager.h) and delegates to KernelRegistry::TryFindKernel (lines 26‑30 of kernel_registry.cc).

Summary

  • Provider-scoped isolation: Each KernelRegistry binds kernels to specific execution providers using composite keys of op name, domain, and provider type.
  • Priority-based resolution: KernelRegistryManager searches custom registries before built-in ones, enabling kernel overrides without modifying core source.
  • Strict validation: Kernel selection requires matching version ranges (VerifyVersion) and type constraints (MatchKernelDefTypes) defined during registration.
  • Factory-based instantiation: Registries store KernelCreateInfo objects containing factory lambdas that instantiate concrete OpKernel implementations only after successful resolution.

Frequently Asked Questions

How does ONNX Runtime prioritize custom kernels over built-in implementations?

The KernelRegistryManager stores custom registries in custom_kernel_registries_ and searches them before checking provider_type_to_registry_ (the built-in map). As noted in core/framework/kernel_registry_manager.h (lines 36‑48), this ordering ensures that user-provided kernels take precedence over default CPU or CUDA implementations when multiple kernels match the same node signature.

What criteria must match for a kernel to be selected for a graph node?

The lookup requires five matches: provider, op name, domain, opset version, and types. The provider string must equal the node's assigned execution provider, and the op name and domain must match the kernel definition; these three are matched through the composite map key. The opset version must then fall within the kernel's declared version range, and the input/output types must satisfy the kernel's type constraints. The version and type checks occur in KernelRegistry::VerifyKernelDef within core/framework/kernel_registry.cc.

Where are the core kernel registry classes defined in the source tree?

The primary header include/onnxruntime/core/framework/kernel_registry.h declares KernelRegistry and KernelCreateInfo. The orchestration layer resides in core/framework/kernel_registry_manager.h, while implementation details including Register, TryFindKernel, and validation logic live in core/framework/kernel_registry.cc. Unit tests demonstrating override behavior are available in test/framework/kernel_registry_test.cc.

Can multiple Execution Providers register kernels for the same operator?

Yes, each provider maintains independent registrations within provider_type_to_registry_. The multimap key incorporates the provider string (op_name + ' ' + domain + ' ' + provider), allowing CPU, CUDA, and custom EPs to register distinct implementations for identical ONNX operators. During lookup, only registries matching the node's assigned provider are considered, ensuring the correct hardware-specific kernel is selected.
