# onnxruntime | Microsoft | Knowledge Base | Instagit

ONNX Runtime: cross-platform, high-performance ML inferencing and training accelerator

GitHub Stars: 20.3k

Repository: https://github.com/microsoft/onnxruntime

---

## Articles

### [How TrainingSession Manages Checkpoint State and Training Progress in ONNX Runtime](/microsoft/onnxruntime/how-does-trainingsession-manage-checkpoint-state-and-training-progress)

Discover how ONNX Runtime TrainingSession manages checkpoint state and training progress with SetStateTensors and SaveCheckpoint APIs for seamless pause and resume.

- Tags: internals
- Published: 2026-04-24

### [ONNX Runtime Model Partitioning Strategy Across Execution Providers: A Deep Dive](/microsoft/onnxruntime/what-is-the-model-partitioning-strategy-across-execution-providers)

Explore ONNX Runtime's model partitioning strategy. Learn how it intelligently assigns model subgraphs to optimal execution providers for efficient hardware utilization.

- Tags: deep-dive
- Published: 2026-04-24

### [How to Implement Custom Operators in ONNX Runtime Using the Custom Operator API](/microsoft/onnxruntime/how-to-implement-custom-operators-using-the-custom-operator-api)

Learn to implement custom operators in ONNX Runtime using the ORT custom operator API. Define metadata, implement compute logic, and register your domain for seamless integration.

- Tags: how-to-guide
- Published: 2026-04-24

### [Optimizing ONNX Runtime Inference Latency: 8 Best Practices for Sub-Millisecond Serving](/microsoft/onnxruntime/what-are-the-best-practices-for-optimizing-onnx-runtime-inference-latency)

Optimize ONNX Runtime inference latency with 8 best practices for sub-millisecond serving. Learn graph optimizations, IO binding, hardware acceleration, and threading for faster AI.

- Tags: best-practices
- Published: 2026-04-24

### [How to Resolve Kernel Registration Conflicts with Multiple Execution Providers in ONNX Runtime](/microsoft/onnxruntime/how-to-resolve-kernel-registration-conflicts-with-multiple-execution-providers)

Resolve ONNX Runtime kernel registration conflicts with multiple execution providers. Learn how to ensure unique types, avoid version overlaps, and register providers correctly.

- Tags: how-to-guide
- Published: 2026-04-24

### [C API vs C++ API for ONNX Runtime: Key Differences and When to Use Each](/microsoft/onnxruntime/what-are-the-differences-between-the-c-api-and-c-api-for-onnx-runtime)

Explore the differences between the ONNX Runtime C API and C++ API. Understand manual memory management in the C API and the RAII benefits of the C++ API for efficient model deployment.

- Tags: api-reference
- Published: 2026-04-24

### [How Mixed Precision Training Works in ONNX Runtime: Architecture and Implementation](/microsoft/onnxruntime/how-does-mixed-precision-training-work-in-onnx-runtime)

Discover how ONNX Runtime implements mixed precision training. Learn about graph transformations, FP16/BF16 casts, stability graphs, and loss scaling for efficient training.

- Tags: architecture
- Published: 2026-04-24

### [ONNX Runtime Kernel Registry Architecture: Provider-Aware Operator Resolution](/microsoft/onnxruntime/what-is-the-architecture-of-onnx-runtimes-kernel-registry-system)

Explore the ONNX Runtime kernel registry architecture. Understand provider-aware operator resolution with KernelRegistry, KernelRegistryManager, and KernelDefBuilder for efficient graph execution.

- Tags: architecture
- Published: 2026-04-24

### [How to Handle Operator Compatibility When Migrating Between ONNX Runtime Versions](/microsoft/onnxruntime/how-to-handle-operator-compatibility-when-migrating-between-onnx-runtime-versions)

Learn to manage operator compatibility when migrating ONNX Runtime versions. Discover how ONNX Runtime ensures backward compatibility and controls opset validation for smooth transitions.

- Tags: migration-guide
- Published: 2026-04-24

### [ONNX Runtime Threading Model: Thread-Pool-Based Parallel Execution Explained](/microsoft/onnxruntime/what-threading-model-does-onnx-runtime-use-for-parallel-execution)

Discover ONNX Runtime's thread-pool-based threading model for parallel execution. Learn how it leverages Eigen or OpenMP for efficient operator parallelization.

- Tags: internals
- Published: 2026-04-24

### [How GraphTransformerMgr Applies Optimizations in Stages in ONNX Runtime](/microsoft/onnxruntime/how-does-the-graphtransformer-mgr-apply-optimizations-in-stages)

Discover how GraphTransformerMgr optimizes ONNX Runtime graphs in three stages: registration, multi-pass execution, and state inspection. Learn about its advanced pipeline.

- Tags: internals
- Published: 2026-04-24

### [ONNX Runtime Profiling Tools: A Complete Guide to Performance Analysis](/microsoft/onnxruntime/what-profiling-tools-are-available-in-onnx-runtime-for-performance-analysis)

Master ONNX Runtime performance analysis with our complete guide. Learn to use profiling tools for detailed session, operator, and execution provider timing data to optimize your models.

- Tags: how-to-guide
- Published: 2026-04-24

### [How to Use Arena-Based Allocation for Memory Management in ONNX Runtime](/microsoft/onnxruntime/how-to-use-arena-based-allocation-for-memory-management-in-onnx-runtime)

Master arena-based allocation in ONNX Runtime. Learn how to optimize memory management and boost performance by pooling device memory with the BFC arena allocator.

- Tags: internals
- Published: 2026-04-24

### [What Are ONNX Runtime Contrib Operators and How to Register Custom Ones?](/microsoft/onnxruntime/what-are-contrib-operators-and-how-to-register-custom-ones)

Explore ONNX Runtime contrib operators and learn to register custom ones. Extend ONNX functionality with user-defined kernels using the Ort::CustomOpDomain API.

- Tags: how-to-guide
- Published: 2026-04-24

### [How ONNX Runtime Handles Dynamic Shapes During Inference: Architecture and API Guide](/microsoft/onnxruntime/how-does-onnx-runtime-handle-dynamic-shapes-during-inference)

Learn how ONNX Runtime handles dynamic shapes during inference with its system for symbolic shape representation, session option overrides, and provider-specific shape propagation for variable batch sizes and sequence lengths.

- Tags: architecture
- Published: 2026-04-24

### [Understanding the Relationship Between OrtValue, OrtTensor, and OrtMemoryInfo in ONNX Runtime](/microsoft/onnxruntime/what-is-the-relationship-between-ortvalue-orttensor-and-ortmemoryinfo)

Explore the relationship between OrtValue, OrtTensor, and OrtMemoryInfo in ONNX Runtime. Learn how OrtValue containers hold tensors and OrtMemoryInfo defines their memory location.

- Tags: internals
- Published: 2026-04-24

### [How to Configure Graph Optimization Levels in ONNX Runtime SessionOptions](/microsoft/onnxruntime/how-to-configure-graph-optimization-levels-in-sessionoptions)

Learn how to configure graph optimization levels in ONNX Runtime SessionOptions. Control transformation aggressiveness from 0 (all optimizations disabled) to 99 (all optimizations enabled) in Python or C++ for faster inference.

- Tags: how-to-guide
- Published: 2026-04-24

### [How Memory Planning Works in ONNX Runtime’s OptimizerExecutionFrame](/microsoft/onnxruntime/how-does-memory-planning-work-in-onnx-runtimes-optimizerexecutionframe)

Discover how ONNX Runtime's OptimizerExecutionFrame enhances performance by automatically planning and caching memory patterns, eliminating allocation overhead for faster subsequent executions.

- Tags: internals
- Published: 2026-04-24

### [ONNX Runtime Quantization Formats for Inference Optimization: QDQ vs QOperator](/microsoft/onnxruntime/what-quantization-formats-does-onnx-runtime-support-for-inference-optimization)

ONNX Runtime offers QDQ and QOperator quantization for faster inference. Learn how these formats optimize your models and boost performance.

- Tags: deep-dive
- Published: 2026-04-24

### [How to Debug Model Loading Failures in ONNX Runtime: Complete Error Handling Guide](/microsoft/onnxruntime/how-to-debug-model-loading-failures-using-onnx-runtimes-error-handling)

Debug ONNX Runtime model loading failures with its comprehensive error handling. Understand status objects for precise diagnosis of file, protobuf, or session config issues.

- Tags: how-to-guide
- Published: 2026-04-24

### [CUDA vs TensorRT Execution Providers in ONNX Runtime: Key Differences and When to Use Each](/microsoft/onnxruntime/what-are-the-differences-between-cuda-and-tensorrt-execution-providers-in-onnx-runtime)

Compare CUDA vs TensorRT execution providers in ONNX Runtime. Learn how CUDA uses GPU kernels and TensorRT optimizes sub-graphs for faster inference. Choose the right provider for your needs.

- Tags: deep-dive
- Published: 2026-04-24

### [How to Implement a Custom Kernel in ONNX Runtime's CPU Execution Provider](/microsoft/onnxruntime/how-to-implement-a-custom-kernel-in-onnx-runtimes-cpu-provider)

Learn to implement a custom kernel in ONNX Runtime's CPU Execution Provider. Create an OpKernel subclass, define it with KernelDefBuilder, and register it to extend ONNX Runtime's capabilities.

- Tags: how-to-guide
- Published: 2026-04-24

### [Graph Optimizations in ONNX Runtime: The Complete Transformer Pipeline Explained](/microsoft/onnxruntime/what-graph-optimizations-does-onnx-runtime-apply-during-model-loading)

Discover ONNX Runtime graph optimizations. Learn how Level 1 and Level 2 rewrite rules boost transformer pipeline performance during model loading via SessionOptions.

- Tags: deep-dive
- Published: 2026-04-24

### [How ONNX Runtime's Execution Provider Interface Works: A Deep Dive into Hardware Abstraction](/microsoft/onnxruntime/how-does-onnx-runtimes-execution-provider-interface-work)

Understand ONNX Runtime's Execution Provider interface. Discover how it abstracts hardware for CPUs, GPUs, and custom accelerators. Learn about plug-in kernels, memory allocators, and data transfer utilities.

- Tags: deep-dive
- Published: 2026-04-24

