Building Supply Chain Copilots with Databricks MCP Servers: A Complete Implementation Guide

You can build a supply chain copilot that queries live enterprise data by combining the OpenAI Agents SDK with Databricks Managed MCP servers, exposing vector search and Unity Catalog functions as tools that the LLM invokes automatically.

This guide walks through the reference implementation in the openai/openai-cookbook repository, demonstrating how to create a conversational interface that retrieves real-time inventory levels, supplier risk assessments, and delivery forecasts from your Databricks workspace. The architecture uses the Model Context Protocol (MCP) to bridge OpenAI agents with structured Spark SQL and unstructured vector search, so that responses are grounded in current enterprise data.

Architecture Overview

The solution layers an OpenAI Agent on top of Databricks infrastructure through two specialized MCP tool wrappers. According to the source code, the system consists of five core components that handle authentication, tool execution, and streaming delivery.

Core Components

  • DatabricksOAuthClientProvider – Centralizes OAuth token acquisition from the Databricks workspace client, caching credentials for subsequent MCP requests. Located in examples/mcp/building-a-supply-chain-copilot-with-agent-sdk-and-databricks-mcp/databricks_mcp.py.
  • MCP Tool Wrappers – Two Python functions (vector_search and uc_function) that convert agent tool calls into HTTP POST requests against Databricks endpoints. These wrappers live in the same databricks_mcp.py file.
  • OpenAI Agents SDK – Orchestrates tool selection and LLM reasoning. The Agent class is instantiated fresh for each request in main.py and api_server.py.
  • FastAPI Backend – Exposes a streaming /chat endpoint that constructs the agent, injects MCP tools, and returns Server-Sent Events. Implementation found in api_server.py.
  • React Frontend – Consumes the streaming endpoint and renders responses in real-time. Component defined in ui/src/components/ChatUI.jsx.

Authentication and MCP Server Setup

Before the agent can query your supply chain data, the system must authenticate with your Databricks workspace and construct the tool definitions.

Initializing the OAuth Provider

The DatabricksOAuthClientProvider class reads workspace credentials from ~/.databrickscfg and manages token refresh. This pattern ensures the agent always presents a valid bearer token when calling MCP endpoints.

from databricks_mcp import DatabricksOAuthClientProvider
from databricks.sdk import WorkspaceClient

# WorkspaceClient() reads ~/.databrickscfg automatically
ws = WorkspaceClient()
token = DatabricksOAuthClientProvider(ws).get_token()

Source: databricks_mcp.py, lines 270–285.
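
The caching behaviour described above can be illustrated with a small, self-contained sketch. The names CachedTokenProvider and fetch_token, the injectable clock, and the 60-second refresh margin are hypothetical choices made for illustration; they are not taken from databricks_mcp.py.

```python
import time
from typing import Callable, Optional, Tuple


class CachedTokenProvider:
    """Hypothetical helper: caches a bearer token until shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        margin_s: float = 60.0,
        clock: Callable[[], float] = time.time,
    ):
        # fetch_token returns (token, expires_at_epoch_seconds); this is an assumed interface.
        self._fetch_token = fetch_token
        self._margin_s = margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get_token(self) -> str:
        # Refresh when there is no token yet or it is within the safety margin of expiry.
        if self._token is None or self._clock() >= self._expires_at - self._margin_s:
            self._token, self._expires_at = self._fetch_token()
        return self._token
```

Refreshing slightly before expiry avoids sending a token that lapses while a request is in flight.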

Defining Vector Search and Unity Catalog Tools

The MCP wrappers translate agent tool calls into Databricks REST API requests. The vector_search function queries similarity indexes for unstructured data (e.g., supplier contracts), while uc_function executes arbitrary Spark SQL or Python logic defined as Unity Catalog functions.

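import httpx

# `host` (workspace URL) and `token` (bearer token from the previous step)
# are assumed to be defined at module level.

def vector_search(query: str, top_k: int = 5) -> dict:
    """Query Databricks MCP Vector Search index."""
    url = f"{host}/api/2.0/vector-search/queries"
    payload = {"query": query, "top_k": top_k}
    resp = httpx.post(url, json=payload, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return resp.json()


def uc_function(func_path: str, args: dict) -> dict:
    """Invoke a Databricks Unity Catalog function."""
    url = f"{host}/api/2.0/unity-catalog/functions/{func_path}"
    resp = httpx.post(url, json=args, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    return resp.json()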

Source: databricks_mcp.py, lines 292–311.
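
To inspect what such a wrapper would send without touching the network, the request can be assembled first and examined before dispatch. This is a minimal sketch using only the standard library; build_tool_request, the example host, and the dummy token are hypothetical, and the URL path simply mirrors the vector_search snippet above.

```python
import json
import urllib.request


def build_tool_request(host: str, token: str, path: str, payload: dict) -> urllib.request.Request:
    """Hypothetical helper: assemble (but do not send) an authenticated JSON POST."""
    return urllib.request.Request(
        url=f"{host.rstrip('/')}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token; nothing is sent over the network here.
req = build_tool_request(
    "https://example.cloud.databricks.com/",
    "dummy-token",
    "/api/2.0/vector-search/queries",
    {"query": "supplier risk clause", "top_k": 5},
)
print(req.get_method(), req.full_url)
```

Separating construction from dispatch makes the authentication header and payload easy to check in unit tests.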

Implementing the FastAPI Chat Backend

The api_server.py file defines the runtime layer that instantiates an Agent per request and streams the conversation back to the client. This design ensures isolation between conversations and allows dynamic tool injection.

Streaming Endpoint Construction

For each incoming POST request, the server builds a fresh Agent instance configured with the MCP tools and a restrictive system prompt that limits queries to supply-chain topics.

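from agents import Agent, Runner
from fastapi.responses import StreamingResponse
from openai.types.responses import ResponseTextDeltaEvent

@app.post("/chat")
async def chat_endpoint(request: ChatRequest):
    # A fresh Agent per request keeps conversations isolated.
    agent = Agent(
        name="Supply chain copilot",
        model="gpt-4o-mini",
        mcp_servers=[vector_search_tool, uc_function_tool],
        instructions=INSTRUCTIONS,  # Restricts to supply-chain topics only
    )

    async def stream():
        # Agents are executed through Runner; run_streamed exposes incremental events.
        result = Runner.run_streamed(agent, input=request.message)
        async for event in result.stream_events():
            # Forward only text deltas from the model to the client.
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                yield f"data: {event.data.delta}\n\n"

    return StreamingResponse(stream(), media_type="text/event-stream")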

Source: api_server.py, lines 46–112.
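
One detail worth handling when yielding SSE chunks is that a payload containing a newline must be split across several data: fields, otherwise the message terminates early. The helper below is a minimal sketch of that framing; format_sse is a hypothetical name and is not part of api_server.py.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Frame a text payload as one Server-Sent Events message."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field; a bare newline
    # inside a single field would end the message early.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

A streaming generator could then yield format_sse(chunk) instead of building the string inline.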

Safety Guardrails

The system prompt referenced in main.py (line 61) explicitly restricts the assistant to supply-chain domains. If a user asks an out-of-scope question, the agent returns a fixed refusal message: "Sorry, I can only help with supply-chain questions."
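
A prompt-level guardrail of this kind might be assembled as follows. Only the refusal sentence comes from this guide; build_instructions and the rest of the prompt wording are illustrative assumptions, not the prompt used in the repository.

```python
REFUSAL_MESSAGE = "Sorry, I can only help with supply-chain questions."


def build_instructions(refusal: str = REFUSAL_MESSAGE) -> str:
    """Hypothetical prompt builder; the wording is illustrative, not the repo's prompt."""
    return (
        "You are a supply-chain copilot. Answer only questions about inventory, "
        "suppliers, logistics, and delivery forecasts, using the available tools.\n"
        f'If a question is outside that scope, reply exactly: "{refusal}"'
    )


INSTRUCTIONS = build_instructions()
```

Keeping the refusal text in a single constant lets the backend and tests refer to the same string.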

Building the React Frontend

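The frontend component in ChatUI.jsx sends a POST request to the FastAPI backend, reads the Server-Sent Events (SSE) stream from the response body, and appends streaming chunks to the conversation history. The browser's EventSource API only issues GET requests and cannot carry a request body, so the stream is read with fetch instead.

// Runs inside an async handler. EventSource only supports GET, so POST with fetch.
const response = await fetch(`${backendUrl}/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ message: userInput })
});
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();

for (;;) {
  const { value, done } = await reader.read();
  if (done) break;
  // Simplification: assumes each chunk contains whole "data: ..." lines.
  for (const line of value.split("\n").filter((l) => l.startsWith("data: "))) {
    setChatLog((log) => [...log, { role: "assistant", content: line.slice(6) }]);
  }
}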

Source: ui/src/components/ChatUI.jsx, lines 52–68.

The UI applies Databricks brand styling and handles real-time message rendering without blocking the main thread.
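
For testing the streaming endpoint outside the browser, the same SSE framing can be parsed in a few lines of Python. This is a sketch under simplifying assumptions: iter_sse_data is a hypothetical helper, it reads from any iterable of text lines, and it ignores every field except data:.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE message from an iterable of text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line: the current message is complete.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and comments are ignored in this sketch.
```

A message that is not terminated by a blank line is dropped, which matches how SSE clients treat an incomplete trailing event.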

Complete Conversation Flow

When a user asks "What is the current inventory level of SKU 123?", the following sequence executes:

  1. Tool Selection – The Agent analyzes the prompt and selects the uc_function tool, mapping the SKU to a Unity Catalog SQL function that queries the inventory fact table.
  2. Execution – The uc_function wrapper POSTs to the Databricks Unity Catalog REST API with the cached OAuth token.
  3. Data Retrieval – Spark SQL executes against the latest Delta Lake table, returning structured JSON.
  4. Response Synthesis – The LLM receives the query results and generates a natural language answer.
  5. Streaming Delivery – FastAPI yields chunks via SSE, and the React UI renders tokens as they arrive.

This pattern repeats for unstructured queries (e.g., "Summarize the risk clause in Supplier X's contract"), where the agent instead invokes vector_search against the Databricks Vector Search index.
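
The five steps can be traced end to end with stub functions. Everything in this sketch is a stand-in: the stub tools return made-up values, and the keyword rule in select_tool is only a placeholder for the LLM's tool selection, which in the real system is performed by the agent.

```python
from typing import Callable, Dict, Iterator


def uc_function_stub(question: str) -> dict:
    # Stand-in for a Unity Catalog function call; the figures are made up.
    return {"sku": "123", "on_hand": 42}


def vector_search_stub(question: str) -> dict:
    # Stand-in for a Vector Search query; the passage is made up.
    return {"passages": ["Supplier X may terminate with 30 days notice."]}


TOOLS: Dict[str, Callable[[str], dict]] = {
    "uc_function": uc_function_stub,
    "vector_search": vector_search_stub,
}


def select_tool(question: str) -> str:
    # Step 1. The LLM picks the tool in the real system; this keyword rule is a placeholder.
    return "vector_search" if "contract" in question.lower() else "uc_function"


def answer(question: str) -> Iterator[str]:
    tool_name = select_tool(question)
    result = TOOLS[tool_name](question)  # Steps 2-3: execute the tool and retrieve data
    text = f"[{tool_name}] {result}"     # Step 4: placeholder for LLM response synthesis
    for word in text.split(" "):         # Step 5: deliver the answer incrementally
        yield word
```

Iterating over answer("What is the current inventory level of SKU 123?") yields the stubbed result piece by piece, mirroring how the backend streams chunks to the UI.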

Reference File Structure

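The openai-cookbook repository contains the following key files for this implementation:

  • databricks_mcp.py – OAuth provider and the vector_search and uc_function tool wrappers.
  • main.py – Agent instantiation and the system prompt.
  • api_server.py – FastAPI backend with the streaming /chat endpoint.
  • ui/src/components/ChatUI.jsx – React chat component that consumes the stream.
  • databricks_mcp_cookbook.ipynb – End-to-end setup notebook.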

Summary

  • Databricks MCP servers expose enterprise data through two primary interfaces: Vector Search for unstructured similarity queries and Unity Catalog functions for structured SQL/Python execution.
  • Authentication is handled centrally via DatabricksOAuthClientProvider, which caches workspace tokens from ~/.databrickscfg and injects them into every MCP request.
  • The OpenAI Agents SDK treats MCP endpoints as forced tools, ensuring every conversation turn retrieves fresh data rather than relying on stale training knowledge.
  • FastAPI and React provide a production-ready streaming interface, with the backend constructing isolated Agent instances per request for security and state management.
  • Guardrails in the system prompt prevent scope creep, restricting the copilot to supply-chain domains only.

Frequently Asked Questions

How does the copilot maintain data security when querying Databricks?

The implementation relies on Databricks OAuth tokens managed by the DatabricksOAuthClientProvider class. Tokens are sourced from the standard ~/.databrickscfg file and passed as Bearer tokens in every HTTP request header. This ensures all data access respects the workspace's existing IAM policies, Unity Catalog governance rules, and row-level security configurations.

What types of supply chain questions can this architecture answer?

The copilot handles both structured and unstructured queries. For structured data, it invokes Unity Catalog functions to run Spark SQL against inventory, logistics, or procurement tables. For unstructured data, it uses Vector Search to retrieve relevant passages from supplier contracts, shipping documentation, or compliance PDFs stored in Databricks.

Can I use a different LLM model instead of gpt-4o-mini?

Yes. The model parameter in api_server.py and main.py accepts any model identifier supported by the OpenAI Agents SDK. You can substitute gpt-4o, gpt-4-turbo, or other compatible models by modifying the Agent instantiation line while retaining the same MCP server configuration.

Where can I find the complete setup instructions?

The databricks_mcp_cookbook.ipynb notebook in examples/mcp/ provides end-to-end setup guidance, including Databricks workspace configuration, Vector Search index creation, Unity Catalog function definitions, and local environment variables required to run the FastAPI server and React UI.
