C API vs C++ API for ONNX Runtime: Key Differences and When to Use Each

The ONNX Runtime C API is a low-level procedural interface requiring manual memory management and error checking via OrtStatus* handles, while the C++ API provides a header-only, object-oriented wrapper with RAII semantics, exception handling (Ort::Exception), and type-safe convenience classes built entirely atop the C layer.

ONNX Runtime ships with dual native interfaces in the microsoft/onnxruntime repository: the canonical onnxruntime_c_api.h provides ABI-stable C symbols for maximum portability, while onnxruntime_cxx_api.h offers a modern C++ experience. Both access identical inference capabilities, but they diverge sharply in programming model, resource safety, and developer ergonomics.

Programming Model and Architecture

C API Design

The C interface exposes functionality through opaque handles (OrtEnv*, OrtSession*, OrtValue*) and a function table retrieved via OrtGetApiBase()->GetApi(ORT_API_VERSION). Code operates procedurally: you create objects by passing pointer-to-pointer addresses to creation functions, then manually invoke the matching Release* entries of the function table (for example, ReleaseSession) for cleanup. This design prioritizes ABI stability and serves as the foundation for language bindings (Python, Java, C#).
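The opaque-handle and function-table pattern described above can be sketched with a toy library. Every name here (ToyEnv, ToyApi, ToyGetApiBase, ToyLiveEnvCount) is a hypothetical stand-in invented for illustration, not an ONNX Runtime symbol, and the version check is a simplification of whatever the real runtime does.

```cpp
#include <cstdint>
#include <string>

// ---- Public "header" part: callers see only declarations ----
struct ToyEnv;     // opaque handle: layout is hidden from callers
struct ToyStatus;  // in this sketch, nullptr means success

// Versioned table of function pointers (hypothetical, modeled on the pattern).
struct ToyApi {
  ToyStatus* (*CreateEnv)(const char* log_id, ToyEnv** out);
  void (*ReleaseEnv)(ToyEnv* env);
};

struct ToyApiBase {
  const ToyApi* (*GetApi)(std::uint32_t version);
};

const ToyApiBase* ToyGetApiBase();
int ToyLiveEnvCount();  // diagnostic helper that exists only in this sketch

// ---- "Library" part: would normally live in a separate source file ----
struct ToyEnv {
  std::string log_id;
};

namespace {
int g_live_envs = 0;

ToyStatus* CreateEnvImpl(const char* log_id, ToyEnv** out) {
  *out = new ToyEnv{log_id};  // result is returned through the out-parameter
  ++g_live_envs;
  return nullptr;
}

void ReleaseEnvImpl(ToyEnv* env) {
  if (env == nullptr) return;
  --g_live_envs;
  delete env;
}

const ToyApi kApiV1{&CreateEnvImpl, &ReleaseEnvImpl};

// Unknown versions yield nullptr so callers can detect a mismatch.
const ToyApi* GetApiImpl(std::uint32_t version) {
  return version == 1 ? &kApiV1 : nullptr;
}

const ToyApiBase kApiBase{&GetApiImpl};
}  // namespace

const ToyApiBase* ToyGetApiBase() { return &kApiBase; }
int ToyLiveEnvCount() { return g_live_envs; }
```

A caller would fetch the table once with ToyGetApiBase()->GetApi(1), create an environment through api->CreateEnv("example", &env), and pair it with api->ReleaseEnv(env). Because callers touch only pointers and function-pointer slots, the library can change the internals of ToyEnv without breaking compiled clients, which is the property the C layer is designed around.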

C++ API Design

The C++ wrapper reimagines the C handles as RAII-enabled classes (Ort::Env, Ort::Session, Ort::Value). Defined entirely in the onnxruntime_cxx_api.h header, these classes encapsulate the raw C pointers and automatically manage their lifetimes. The API uses move-only semantics (via the Base<T> template) to enforce unique ownership, preventing accidental aliasing or double-free errors.

Error Handling Comparison

C API error handling requires explicit checking after every operation. Functions return an OrtStatus* pointer that is NULL on success; on failure, you extract the error code and message with the GetErrorCode and GetErrorMessage entries of the OrtApi table, then release the status object yourself with ReleaseStatus.

C++ API error handling converts errors into exceptions automatically. The wrapper uses internal macros (visible around lines 50-66 of onnxruntime_cxx_api.h) to check C API return values and throw Ort::Exception objects on failure. Ort::Exception derives from std::exception, so this eliminates boilerplate status checking and allows error handling via standard C++ try/catch blocks.
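The status-to-exception translation can be illustrated with a small self-contained sketch. MockStatus, MockDivide, MockReleaseStatus, MockException, and ThrowOnError are all hypothetical names made up for this example; they mimic the shape of the idea and are not the wrapper's actual code.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical C-style status object: a null pointer means success.
struct MockStatus {
  int code;
  std::string message;
};

// Hypothetical C-style function that reports failure through a status pointer.
inline MockStatus* MockDivide(int a, int b, int* out) {
  if (b == 0) return new MockStatus{1, "division by zero"};
  *out = a / b;
  return nullptr;
}

inline void MockReleaseStatus(MockStatus* s) { delete s; }

// Exception type that carries the original error code.
class MockException : public std::runtime_error {
 public:
  MockException(std::string msg, int code)
      : std::runtime_error(std::move(msg)), code_(code) {}
  int code() const noexcept { return code_; }

 private:
  int code_;
};

// Copy the details out, release the status, then throw.
inline void ThrowOnError(MockStatus* status) {
  if (status == nullptr) return;
  std::string msg = status->message;
  int code = status->code;
  MockReleaseStatus(status);  // released before throwing, so nothing leaks
  throw MockException(std::move(msg), code);
}

// C++-style wrapper: returns the value directly and throws on failure.
inline int Divide(int a, int b) {
  int result = 0;
  ThrowOnError(MockDivide(a, b, &result));
  return result;
}
```

With this shape, Divide(6, 3) returns the quotient directly, while Divide(1, 0) throws a MockException whose what() text and code() come from the status object. The caller never sees or frees the status.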

Memory Management and Resource Ownership

| Aspect | C API Approach | C++ API Approach |
| --- | --- | --- |
| Acquisition | Explicit Create functions with handle output parameters | Constructors (e.g., Ort::Session{env, model_path, opts}) |
| Release | Manual calls to ReleaseSession, ReleaseValue, etc. through the OrtApi table | Automatic destructor calls via Base<T>::~Base() |
| Ownership | Raw pointers; ownership tracked by convention only | Move-only Base<T>; Unowned<T> tag for non-owning references |
| Safety | Developer responsible for matching every Create with a Release | RAII destructors run automatically; resources are freed even if exceptions occur |

The Base<T> template (lines 997-1035 of onnxruntime_cxx_api.h in the microsoft/onnxruntime repository) implements move semantics and deleted copy constructors, ensuring that resources like OrtSession or OrtEnv have clear, unique owners.
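A minimal sketch of the unique-ownership idea follows. FakeSession, FakeCreateSession, FakeReleaseSession, SessionOwner, and SessionView are hypothetical names chosen for this illustration; the real Base<T> is a template with more features, so treat this as the general technique and not as the header's implementation.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical C-style handle plus create/release functions.
struct FakeSession {
  int id;
};

inline int& ReleaseCount() {
  static int count = 0;  // counts releases so the behavior can be observed
  return count;
}

inline FakeSession* FakeCreateSession(int id) { return new FakeSession{id}; }

inline void FakeReleaseSession(FakeSession* p) {
  if (p != nullptr) {
    ++ReleaseCount();
    delete p;
  }
}

// Move-only owner: exactly one object is responsible for releasing the handle.
class SessionOwner {
 public:
  explicit SessionOwner(FakeSession* p = nullptr) noexcept : p_(p) {}
  ~SessionOwner() { FakeReleaseSession(p_); }

  SessionOwner(const SessionOwner&) = delete;             // no copies,
  SessionOwner& operator=(const SessionOwner&) = delete;  // so no double free

  SessionOwner(SessionOwner&& other) noexcept : p_(other.release()) {}
  SessionOwner& operator=(SessionOwner&& other) noexcept {
    if (this != &other) {
      FakeReleaseSession(p_);
      p_ = other.release();
    }
    return *this;
  }

  FakeSession* get() const noexcept { return p_; }
  // Give up ownership and hand the raw pointer back to the caller.
  FakeSession* release() noexcept { return std::exchange(p_, nullptr); }

 private:
  FakeSession* p_;
};

// Non-owning view: refers to a handle but never releases it.
class SessionView {
 public:
  explicit SessionView(FakeSession* p) noexcept : p_(p) {}
  FakeSession* get() const noexcept { return p_; }

 private:
  FakeSession* p_;
};

static_assert(!std::is_copy_constructible<SessionOwner>::value,
              "owners must not be copyable");
```

Moving a SessionOwner transfers the handle and leaves the source empty, so the release function runs once no matter how many times ownership changes hands. The view type plays the role a non-owning tag would: it can be passed around freely because it has no destructor side effects.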

Type Safety and Convenience Features

C API operates on raw C types (int64_t, float*, char*). Creating tensors requires either allocating through an OrtAllocator or wrapping caller-owned memory described by an OrtMemoryInfo struct, and each of those objects must be managed by hand.

C++ API provides rich type wrappers including Ort::Float16_t, Ort::BFloat16_t, and Ort::Float8E4M3FN_t. Helper classes like Ort::MemoryInfo and Ort::Value offer factory functions such as Ort::MemoryInfo::CreateCpu() and Ort::Value::CreateTensor<T>(), as well as templates like GetTensorMutableData<T>() for typed data access. The C++ API also bundles Ort::AllocatorWithDefaultOptions to simplify tensor creation without manual allocator retrieval.
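One way a C++ layer can map element types onto a C-style type tag is a small traits template. The sketch below uses hypothetical names throughout (ElemType, TypeToElem, RawTensor, MakeTensor, TensorData), and the runtime type check in TensorData is an addition of this sketch; it should not be read as a statement about what GetTensorMutableData<T>() verifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical element-type tag, standing in for a C enum of tensor types.
enum class ElemType { Float, Int64 };

// Compile-time map from a C++ type to its tag; unsupported types fail to compile.
template <typename T>
struct TypeToElem;

template <>
struct TypeToElem<float> {
  static constexpr ElemType value = ElemType::Float;
};

template <>
struct TypeToElem<std::int64_t> {
  static constexpr ElemType value = ElemType::Int64;
};

// Untyped view of caller-owned memory, similar in spirit to a C tensor handle.
struct RawTensor {
  ElemType type;
  void* data;
  std::size_t count;
};

// The tag is deduced from T, so callers cannot pass a mismatched enum by hand.
template <typename T>
RawTensor MakeTensor(T* data, std::size_t count) {
  return RawTensor{TypeToElem<T>::value, data, count};
}

// Typed accessor; the mismatch check is an assumption of this sketch.
template <typename T>
T* TensorData(const RawTensor& t) {
  if (t.type != TypeToElem<T>::value) {
    throw std::invalid_argument("element type mismatch");
  }
  return static_cast<T*>(t.data);
}
```

Given a std::vector<float> v, MakeTensor(v.data(), v.size()) records the Float tag automatically, TensorData<float>(t) returns the original buffer, and TensorData<std::int64_t>(t) throws in this sketch.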

Code Comparison: Loading and Running a Model

Below are equivalent implementations demonstrating the practical differences between the APIs.

C API Implementation

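#include <stdio.h>
#include <stdlib.h>
#include "onnxruntime_c_api.h"

static const OrtApi* api = NULL;

// Print the error, release the status, and exit if a call failed.
// (A real program would also release already-created handles here.)
static void check(OrtStatus* status) {
  if (status != NULL) {
    fprintf(stderr, "ONNX Runtime error: %s\n", api->GetErrorMessage(status));
    api->ReleaseStatus(status);
    exit(EXIT_FAILURE);
  }
}

int main(void) {
  api = OrtGetApiBase()->GetApi(ORT_API_VERSION);
  if (api == NULL) {
    fprintf(stderr, "Requested ONNX Runtime API version is not available\n");
    return EXIT_FAILURE;
  }

  OrtEnv* env = NULL;
  check(api->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "example", &env));

  OrtSessionOptions* sess_opts = NULL;
  check(api->CreateSessionOptions(&sess_opts));
  check(api->SetIntraOpNumThreads(sess_opts, 1));

  // ORT_TSTR yields a wide string on Windows, where ORTCHAR_T is wchar_t
  OrtSession* session = NULL;
  check(api->CreateSession(env, ORT_TSTR("model.onnx"), sess_opts, &session));

  // Wrap caller-owned memory in a tensor; this takes an OrtMemoryInfo, not an allocator
  OrtMemoryInfo* mem_info = NULL;
  check(api->CreateCpuMemoryInfo(OrtArenaAllocator, OrtMemTypeDefault, &mem_info));
  float input_val = 1.0f;
  int64_t dims[1] = {1};
  OrtValue* input_tensor = NULL;
  check(api->CreateTensorWithDataAsOrtValue(mem_info, &input_val, sizeof(float),
                                            dims, 1, ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,
                                            &input_tensor));

  const char* input_names[] = {"input"};
  const char* output_names[] = {"output"};
  OrtValue* output_tensor = NULL;  // must be NULL so the runtime allocates the output

  check(api->Run(session, NULL, input_names,
                 (const OrtValue* const*)&input_tensor, 1,
                 output_names, 1, &output_tensor));

  // Extract result
  float* out = NULL;
  check(api->GetTensorMutableData(output_tensor, (void**)&out));
  printf("Result: %f\n", out[0]);

  // Manual cleanup required
  api->ReleaseValue(output_tensor);
  api->ReleaseValue(input_tensor);
  api->ReleaseMemoryInfo(mem_info);
  api->ReleaseSession(session);
  api->ReleaseSessionOptions(sess_opts);
  api->ReleaseEnv(env);
  return 0;
}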

C++ API Implementation

#include "onnxruntime_cxx_api.h"
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  try {
    Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "example"};
    Ort::SessionOptions opts;
    opts.SetIntraOpNumThreads(1);

    // ORT_TSTR yields a wide string on Windows, where ORTCHAR_T is wchar_t
    Ort::Session session{env, ORT_TSTR("model.onnx"), opts};

    // Wrap the vector's buffer in a tensor (no copy; input_data must outlive the tensor)
    std::vector<float> input_data = {1.0f};
    std::array<int64_t, 1> shape = {1};
    Ort::MemoryInfo mem_info = Ort::MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);
    Ort::Value input_tensor = Ort::Value::CreateTensor<float>(
        mem_info, input_data.data(), input_data.size(),
        shape.data(), shape.size());

    const char* input_names[] = {"input"};
    const char* output_names[] = {"output"};

    // Run throws on error; returns vector of Ort::Value
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &input_tensor, 1,
                               output_names, 1);

    float* out = outputs.front().GetTensorMutableData<float>();
    std::cout << "Result: " << out[0] << std::endl;
  } catch (const Ort::Exception& e) {
    // Any failed C API call inside the wrapper surfaces here
    std::cerr << "ONNX Runtime error: " << e.what() << std::endl;
    return 1;
  }
  // Automatic cleanup: every Ort object was released when it left scope
  return 0;
}

Key distinctions: The C++ version eliminates explicit Release calls, uses std::vector for tensor data, and handles errors via exceptions rather than OrtStatus* checks.

Key Source Files in the Repository

| File Path | API Layer | Key Components |
| --- | --- | --- |
| include/onnxruntime/core/session/onnxruntime_c_api.h | C API | OrtApi function table, OrtStatus*, OrtEnv*, OrtSession* opaque handles |
| include/onnxruntime/core/session/onnxruntime_cxx_api.h | C++ API | Ort::Env, Ort::Session, Base<T> RAII wrapper, exception handling macros |
| include/onnxruntime/core/session/onnxruntime_float16.h | C++ API (support) | Half-precision helpers underlying the Ort::Float16_t and Ort::BFloat16_t types used by the C++ API |
| include/onnxruntime/core/session/experimental_onnxruntime_cxx_api.h | C++ API (Extended) | Experimental C++ convenience extensions |

Summary

  • The C API (onnxruntime_c_api.h) provides ABI-stable, procedural access with manual memory management via the Release* functions of the OrtApi table and explicit OrtStatus* error checking. Use it for language bindings or when C compatibility is required.
  • The C++ API (onnxruntime_cxx_api.h) is a header-only wrapper offering RAII resource management through Ort::Base<T>, exception-based error handling with Ort::Exception, and type-safe helpers for tensor manipulation.
  • Memory safety differs dramatically: C requires manual tracking of handles, while C++ uses move-only semantics and deterministic destructors.
  • Error handling shifts from manual OrtStatus* checks in C to automatic exception translation in C++.
  • Both APIs ultimately call the same underlying runtime functions; the C++ layer is a thin, compile-time abstraction over the C interface.

Frequently Asked Questions

Can I mix C and C++ API calls in the same project?

Yes. The C++ wrapper classes convert implicitly to their underlying C handles (an Ort::Session can be passed where an OrtSession* is expected), and Ort::GetApi() returns the same OrtApi function table that C code uses. You can therefore pass C++ objects straight into C API functions, or wrap C handles you do not own in the non-owning wrapper variants built on the Unowned<T> tag. This interoperability allows gradual migration or use of C-specific utilities within C++ codebases.
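The mixing pattern can be sketched with a toy handle. CCounter, the c_counter_* functions, Counter, and IncrementTwice are hypothetical names for illustration only; the point is simply that a wrapper exposing a conversion to its raw handle can be handed to C-style functions unchanged.

```cpp
// Hypothetical C-style handle and functions; not ONNX Runtime code.
struct CCounter {
  int value;
};

inline CCounter* c_counter_create() { return new CCounter{0}; }
inline void c_counter_release(CCounter* c) { delete c; }
inline void c_counter_increment(CCounter* c) { ++c->value; }

// C++ wrapper that owns the handle and converts implicitly to the raw pointer.
class Counter {
 public:
  Counter() : p_(c_counter_create()) {}
  ~Counter() { c_counter_release(p_); }

  Counter(const Counter&) = delete;
  Counter& operator=(const Counter&) = delete;

  // Lets a Counter be passed wherever a CCounter* is expected.
  operator CCounter*() const noexcept { return p_; }

  int value() const noexcept { return p_->value; }

 private:
  CCounter* p_;
};

// Calls the C-style function directly on the C++ object.
inline int IncrementTwice(Counter& c) {
  c_counter_increment(c);  // implicit conversion to CCounter*
  c_counter_increment(c);
  return c.value();
}
```

The wrapper keeps ownership throughout, so the C-style calls borrow the handle and the destructor still performs the single release.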

Is the C++ API header-only and does it require separate linking?

The C++ wrapper is entirely header-only (onnxruntime_cxx_api.h). You only need to link against the ONNX Runtime shared library that provides the C API symbols. No additional libraries are required for the C++ convenience classes, as they are templates and inline functions that compile directly into your binary.

Which API offers better performance?

Both APIs deliver essentially identical runtime performance because the C++ wrapper generates thin inline calls to the underlying C API. For example, Ort::Session::Run (lines 1246-1263 of onnxruntime_cxx_api.h) simply forwards to the C function pointer stored in the OrtApi table. The wrapper adds little beyond inline forwarding and small conveniences such as returning a std::vector of outputs, which is negligible next to the cost of inference in optimized builds.

When should I prefer the C API over the C++ API?

Choose the C API when building language bindings (Python, Java, Go), working in embedded environments with limited C++ runtime support, or requiring strict ABI stability across compiler versions. Choose the C++ API for native C++ applications where RAII safety, exception handling, and type safety are priorities, or when using modern C++ features like move semantics and smart pointers.
