# How Linkis ContextService Manages Unified Variables and UDFs Across Engine Types

> Discover how Linkis ContextService unifies variables and UDFs across engines. Learn about its key-value store, ContextType discrimination, MySQL storage, and in-memory caching for seamless cross-engine sharing.

- Repository: [The Apache Software Foundation/linkis](https://github.com/apache/linkis)
- Tags: deep-dive
- Published: 2026-02-25

---

**The Linkis ContextService treats variables and UDFs as typed context entries, using a unified key-value store with ContextType discrimination to enable cross-engine sharing through persistent MySQL storage and in-memory caching.**

The Apache Linkis project provides a computation middleware layer that standardizes access to diverse big data engines. Its ContextService (CS) component solves the fragmentation problem of variable and UDF management by abstracting these resources as first-class context objects. This article examines how Linkis ContextService manages unified variables and UDFs across different engine types through a type-safe, scope-aware architecture.

## Unified Data Model for Context Entries

### ContextKey and ContextValue Abstraction

Every piece of shared data is wrapped as a context entry composed of a **ContextKey** and a **ContextValue**. In [`linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/source/ContextKey.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/source/ContextKey.java), the key encapsulates the identifier string, a `ContextType` (for example `VARIABLE` or `UDF`), and a `ContextScope` (global or engine-specific). The corresponding `ContextValue`, defined in [`linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/source/ContextValue.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/source/ContextValue.java), holds the serialized payload plus optional keyword metadata for search indexing.

### Type Safety with ContextType Enumeration

The **ContextType** enum defined in [`linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/enumeration/ContextType.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/enumeration/ContextType.java) categorizes entries as `VARIABLE`, `UDF`, `METADATA`, or other types. This classification allows the service to apply uniform storage logic while preserving type-specific handling requirements.
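To make the idea of a type-tagged key concrete, here is a minimal, self-contained sketch. Every name in it (`TypedContextStoreSketch`, `Key`, `EntryType`, `EntryScope`) is a hypothetical stand-in rather than a Linkis class, and the enum values are only a subset chosen for illustration. It shows how one map can hold several entry types while the tag keeps them distinguishable.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative stand-ins only; these are NOT the Linkis classes.
final class TypedContextStoreSketch {

    // Hypothetical subsets of the type and scope enumerations.
    enum EntryType { VARIABLE, UDF, METADATA }
    enum EntryScope { GLOBAL, ENGINE }

    // A key is the identifier plus its type and scope tags.
    static final class Key {
        final String name;
        final EntryType type;
        final EntryScope scope;

        Key(String name, EntryType type, EntryScope scope) {
            this.name = Objects.requireNonNull(name);
            this.type = Objects.requireNonNull(type);
            this.scope = Objects.requireNonNull(scope);
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Key that = (Key) other;
            return name.equals(that.name) && type == that.type && scope == that.scope;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type, scope);
        }
    }

    // One map serves every entry type; the value is the serialized payload.
    private final Map<Key, String> entries = new LinkedHashMap<>();

    void put(Key key, String serializedValue) {
        entries.put(key, serializedValue);
    }

    String get(Key key) {
        return entries.get(key);
    }

    // Type-specific handling starts by filtering on the tag.
    long countByType(EntryType type) {
        long count = 0;
        for (Key key : entries.keySet()) {
            if (key.type == type) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        TypedContextStoreSketch store = new TypedContextStoreSketch();
        store.put(new Key("partitions", EntryType.VARIABLE, EntryScope.GLOBAL), "200");
        store.put(new Key("myUdf", EntryType.UDF, EntryScope.GLOBAL), "resource-1");
        System.out.println("UDF entries: " + store.countByType(EntryType.UDF));
    }
}
```

Because the tag is part of the key's identity in this sketch, a variable and a UDF that share a name do not overwrite each other.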

### Variable and UDF Storage Strategies

For **variables**, Linkis uses the `LinkisVariable` class (in [`linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/object/LinkisVariable.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/object/LinkisVariable.java)) to wrap configuration parameters or SQL settings. For **UDFs**, the system stores metadata in `UDFInfo` and `UDFVersion` entities (in `linkis-public-enhancements/linkis-udf-service`), while the binaries reside in BML (BigData Material Library). The CS holds only a reference (the `resourceId`) as the `ContextValue` of an entry with `ContextType.UDF`.

## Core Workflow and API Operations

### Storing and Retrieving Context Entries

The `ContextServiceImpl` class in [`linkis-public-enhancements/linkis-cs-server/src/main/java/org/apache/linkis/cs/server/service/impl/ContextServiceImpl.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-cs-server/src/main/java/org/apache/linkis/cs/server/service/impl/ContextServiceImpl.java) handles persistence through `ContextMapPersistenceImpl` and caching via `ContextCacheService`. When `setValueByKey` is invoked, the service serializes the value, updates the cache, and writes to MySQL. Retrieval via `getContextValue` checks the cache first and falls back to the database on a miss.
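The write and read paths described above can be sketched in a few lines. This is a simplified model, not the `ContextServiceImpl` code: the method names mirror the ones mentioned in this section, but the signatures are assumptions, and two in-memory maps stand in for the cache and for MySQL. The `evict` and `databaseReads` helpers are hypothetical additions that make the fallback observable.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: both maps are in-memory stand-ins, not Linkis components.
final class WriteThroughStoreSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database = new HashMap<>(); // stands in for MySQL
    private int databaseReads = 0;

    // Write path: update the cache, then persist the same serialized value.
    void setValueByKey(String key, String serializedValue) {
        cache.put(key, serializedValue);
        database.put(key, serializedValue);
    }

    // Read path: cache first, database on a miss, then repopulate the cache.
    String getContextValue(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached;
        databaseReads++;
        String stored = database.get(key);
        if (stored != null) cache.put(key, stored);
        return stored;
    }

    // Simulates eviction or a restart of the caching layer.
    void evict(String key) {
        cache.remove(key);
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        WriteThroughStoreSketch store = new WriteThroughStoreSketch();
        store.setValueByKey("my.spark.sql.conf", "spark.sql.shuffle.partitions=200");
        store.evict("my.spark.sql.conf");
        System.out.println(store.getContextValue("my.spark.sql.conf"));
        System.out.println("Database reads: " + store.databaseReads());
    }
}
```

In this model a value written once survives cache eviction, and only the first read after eviction touches the backing store.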

### Upstream Variable Search

To collect variables from ancestor nodes in a workflow DAG, `CSVariableService` uses `DefaultSearchService` (in [`linkis-public-enhancements/linkis-pes-client/src/main/java/org/apache/linkis/cs/client/service/DefaultSearchService.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-client/src/main/java/org/apache/linkis/cs/client/service/DefaultSearchService.java)). The search walks the node lineage and aggregates all entries where `ContextKey.contextType == VARIABLE`, making upstream Spark SQL configurations available to downstream Flink or Hive engines.
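The lineage walk can be illustrated with a plain breadth-first traversal over parent links. The sketch below is an assumption about the general shape of such a search, not the `DefaultSearchService` implementation. The class, its methods, and the node names are hypothetical, and variables are modeled as simple strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative model of an upstream search; not the Linkis search service.
final class UpstreamVariableSearchSketch {
    // node name -> names of its direct parents
    private final Map<String, List<String>> parents = new HashMap<>();
    // node name -> variables defined on that node
    private final Map<String, List<String>> variables = new HashMap<>();

    void addNode(String node, List<String> parentNodes, List<String> nodeVariables) {
        parents.put(node, new ArrayList<>(parentNodes));
        variables.put(node, new ArrayList<>(nodeVariables));
    }

    private List<String> parentsOf(String node) {
        return parents.getOrDefault(node, Collections.<String>emptyList());
    }

    // Collects variables from every ancestor of the node, nearest ancestors first.
    // The node's own variables are not included.
    List<String> upstreamVariables(String node) {
        List<String> found = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(parentsOf(node));
        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (!visited.add(current)) continue; // skip ancestors reached twice
            found.addAll(variables.getOrDefault(current, Collections.<String>emptyList()));
            queue.addAll(parentsOf(current));
        }
        return found;
    }

    public static void main(String[] args) {
        UpstreamVariableSearchSketch dag = new UpstreamVariableSearchSketch();
        dag.addNode("sparkNode", Collections.<String>emptyList(),
                Arrays.asList("shuffle.partitions=200"));
        dag.addNode("hiveNode", Arrays.asList("sparkNode"),
                Arrays.asList("run.date=2026-02-25"));
        dag.addNode("flinkNode", Arrays.asList("hiveNode"),
                Collections.<String>emptyList());
        System.out.println(dag.upstreamVariables("flinkNode"));
    }
}
```

The visited set matters in diamond-shaped workflows, where one ancestor is reachable through several parents and would otherwise contribute its variables more than once.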

### UDF Registration and Resolution

The `UDFServiceImpl` (in [`linkis-public-enhancements/linkis-udf-service/src/main/java/org/apache/linkis/udf/service/impl/UDFServiceImpl.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-udf-service/src/main/java/org/apache/linkis/udf/service/impl/UDFServiceImpl.java)) manages UDF lifecycle, uploading JARs to BML and recording metadata. Engines resolve UDFs by querying the CS for entries with `ContextType.UDF`, then loading the associated binaries from BML using the stored resource references.
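The split between a small reference in the context store and a large binary elsewhere can be modeled as two lookups. The following sketch is hypothetical throughout: `UdfReferenceSketch`, its methods, and the `res-N` identifier format are assumptions made for illustration, and in-memory maps stand in for both the context store and BML.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: models "reference in the context store, binary elsewhere".
final class UdfReferenceSketch {
    private final Map<String, byte[]> materialLibrary = new HashMap<>(); // stands in for BML
    private final Map<String, String> udfEntries = new HashMap<>();      // UDF name -> resource id
    private int nextId = 1;

    // Stores the binary once and records only its reference under the UDF name.
    String register(String udfName, byte[] jarBytes) {
        String resourceId = "res-" + nextId++; // hypothetical id format
        materialLibrary.put(resourceId, jarBytes.clone());
        udfEntries.put(udfName, resourceId);
        return resourceId;
    }

    // The lightweight lookup an engine would perform first.
    String referenceOf(String udfName) {
        return udfEntries.get(udfName);
    }

    // Follows the reference to fetch the binary.
    byte[] resolve(String udfName) {
        String resourceId = udfEntries.get(udfName);
        if (resourceId == null) {
            throw new IllegalArgumentException("Unknown UDF: " + udfName);
        }
        return materialLibrary.get(resourceId).clone();
    }

    public static void main(String[] args) {
        UdfReferenceSketch registry = new UdfReferenceSketch();
        byte[] fakeJar = "jar-bytes".getBytes(StandardCharsets.UTF_8);
        String resourceId = registry.register("myUdf", fakeJar);
        System.out.println("Reference: " + resourceId);
        System.out.println("Resolved bytes: " + registry.resolve("myUdf").length);
    }
}
```

Keeping only the identifier in the key-value entry means the entry stays small regardless of how large the JAR is.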

## Cross-Engine Propagation Mechanism

### ContextID and Scope Management

Each engine instance operates within a logical **ContextID**. Entries marked with `ContextScope.GLOBAL` are visible to any engine sharing that ID, while scoped entries remain engine-specific. This allows fine-grained control over resource visibility across heterogeneous environments.
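A visibility rule of this kind reduces to a filter over entries. The sketch below assumes a two-level scope (`GLOBAL` and `ENGINE`) and an `ownerEngine` field. Both are hypothetical simplifications introduced for illustration and do not reflect how Linkis stores or evaluates scope.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative visibility filter; names and fields are assumptions.
final class ScopeVisibilitySketch {
    enum Scope { GLOBAL, ENGINE }

    static final class Entry {
        final String contextId;
        final String key;
        final Scope scope;
        final String ownerEngine;

        Entry(String contextId, String key, Scope scope, String ownerEngine) {
            this.contextId = contextId;
            this.key = key;
            this.scope = scope;
            this.ownerEngine = ownerEngine;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String contextId, String key, Scope scope, String ownerEngine) {
        entries.add(new Entry(contextId, key, scope, ownerEngine));
    }

    // An entry is visible if it belongs to the same context and is either
    // global or owned by the requesting engine.
    List<String> visibleKeys(String contextId, String engine) {
        List<String> keys = new ArrayList<>();
        for (Entry entry : entries) {
            if (!entry.contextId.equals(contextId)) continue;
            if (entry.scope == Scope.GLOBAL || entry.ownerEngine.equals(engine)) {
                keys.add(entry.key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        ScopeVisibilitySketch store = new ScopeVisibilitySketch();
        store.add("ctx-1", "shared.var", Scope.GLOBAL, "spark");
        store.add("ctx-1", "spark.only", Scope.ENGINE, "spark");
        System.out.println("hive sees: " + store.visibleKeys("ctx-1", "hive"));
        System.out.println("spark sees: " + store.visibleKeys("ctx-1", "spark"));
    }
}
```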

### DAG-Aware Value Inheritance

The upstream search logic automatically merges values from parent workflow nodes. An engine receives the latest variable definitions regardless of its concrete implementation (Spark, Flink, Hive, Trino), ensuring consistency across the execution graph.

## Practical Implementation Examples

### Setting a Global Variable

```java
// Assumes `contextID` (the ContextID of the current workflow) is already in scope.
// Build a key tagged as a globally scoped variable.
ContextKey ctxKey = ContextKeyBuilder.builder()
        .key("my.spark.sql.conf")
        .contextType(ContextType.VARIABLE)
        .contextScope(ContextScope.GLOBAL)
        .build();

// Wrap the setting in a LinkisVariable payload.
LinkisVariable variable = new LinkisVariable();
variable.setValue("spark.sql.shuffle.partitions=200");

// Write the entry through the context client.
ContextClient client = ContextClientFactory.getOrCreateContextClient();
client.update(contextID, ctxKey, new CommonContextValue(variable));
```

### Retrieving Upstream Variables

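```java
// Assumes `contextIDStr` (the serialized ContextID) and `nodeName`
// (the current workflow node) are already in scope.
List<LinkisVariable> vars = CSVariableService.getInstance()
        .getUpstreamVariables(contextIDStr, nodeName);

// Print every variable inherited from ancestor nodes.
for (LinkisVariable v : vars) {
    System.out.println("Upstream var: " + v.getValue());
}
```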

### Registering a Cross-Engine UDF

```java
// Describe the UDF: name, type, JAR location, and the registration statement.
UDFAddVo addVo = new UDFAddVo();
addVo.setUdfName("myUdf");
addVo.setUdfType(ConstantVar.UDF_JAR);
addVo.setPath("hdfs:///udfs/myUdf.jar");
addVo.setRegisterFormat("CREATE FUNCTION myUdf AS 'com.example.MyUdf' USING JAR 'myUdf.jar'");
addVo.setLoad(true);

// Register the UDF on behalf of user "alice".
UDFService udfService = SpringContextUtil.getBean(UDFService.class);
Long udfId = udfService.addUDF(addVo, "alice");

// Create a context entry for cross-engine visibility.
// Assumes `contextClient` and `contextID` are already in scope.
ContextKey udfKey = new DefaultContextKey("myUdf", ContextType.UDF, ContextScope.GLOBAL);
ContextValue udfValue = new CommonContextValue();
udfValue.setValue(udfId); // the entry stores a reference, not the JAR itself
contextClient.update(contextID, udfKey, udfValue);
```

### Removing Typed Entries by Prefix

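```java
// Assumes `contextService` and `contextID` are already in scope.
// Removes every UDF entry in this context whose key starts with "myUdf".
contextService.removeAllValueByKeyPrefixAndContextType(
        contextID, ContextType.UDF, "myUdf");
```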
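The effect of a removal that is constrained by both prefix and type can be shown with a small model. This is a sketch under stated assumptions, not the service's implementation: `PrefixRemovalSketch` and its methods are hypothetical, and each key name maps to a single type for simplicity.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: models removal filtered by key prefix and entry type.
final class PrefixRemovalSketch {
    enum EntryType { VARIABLE, UDF }

    // key name -> entry type (simplification: one type per key name)
    private final Map<String, EntryType> entries = new LinkedHashMap<>();

    void put(String key, EntryType type) {
        entries.put(key, type);
    }

    boolean contains(String key) {
        return entries.containsKey(key);
    }

    // Removes entries that match BOTH the type and the key prefix;
    // returns how many were removed.
    int removeByPrefixAndType(EntryType type, String prefix) {
        int removed = 0;
        Iterator<Map.Entry<String, EntryType>> it = entries.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, EntryType> entry = it.next();
            if (entry.getValue() == type && entry.getKey().startsWith(prefix)) {
                it.remove(); // safe removal during iteration
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        PrefixRemovalSketch store = new PrefixRemovalSketch();
        store.put("myUdf", EntryType.UDF);
        store.put("myUdf_v2", EntryType.UDF);
        store.put("myUdfTimeout", EntryType.VARIABLE);
        System.out.println("Removed: " + store.removeByPrefixAndType(EntryType.UDF, "myUdf"));
        System.out.println("Variable kept: " + store.contains("myUdfTimeout"));
    }
}
```

The type filter is what protects the `myUdfTimeout` variable here: it shares the prefix but not the type, so it is left in place.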

## Summary

- The **ContextService** abstracts variables and UDFs as unified context entries using `ContextKey` and `ContextValue` pairs distinguished by `ContextType`.
- **Cross-engine sharing** works through `ContextID` scoping and `ContextScope.GLOBAL` visibility, supported by DAG-aware upstream search in `DefaultSearchService`.
- The architecture combines **MySQL persistence** (`ContextMapPersistenceImpl`) with **in-memory caching** (`ContextCacheService`), so reads are served from the cache and fall back to the database only on a miss.
- UDF binaries reside in **BML** while the CS stores metadata references, enabling any engine to load functions on demand using `UDFServiceImpl`.
- Operations are exposed via REST (`ContextRestfulApi`) and Java client (`ContextClient`) interfaces.

## Frequently Asked Questions

### How does Linkis distinguish between variables and UDFs in the ContextService?

The service uses the `ContextType` enum defined in [`linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/enumeration/ContextType.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/cs/common/entity/enumeration/ContextType.java). When creating a `ContextKey`, developers specify `ContextType.VARIABLE` for configuration parameters or `ContextType.UDF` for user-defined functions. This type tag determines serialization logic and search filters while both entry types share the same underlying key-value infrastructure in `ContextServiceImpl`.

### Can a UDF registered in Spark be used in Hive or Flink through the ContextService?

Yes. `UDFServiceImpl` uploads the JAR to BML and records its metadata, and a context entry with `ContextType.UDF` and `ContextScope.GLOBAL` stores the reference. Any engine operating under the same `ContextID` can query this entry, retrieve the BML resourceId, and load the JAR locally. This design enables cross-engine UDF reuse without duplicate registrations.

### What happens when upstream variables change during workflow execution?

The `CSVariableService.getUpstreamVariables` method triggers `DefaultSearchService` to perform a DAG traversal. It aggregates all `ContextValue` objects from ancestor nodes where `ContextType` equals `VARIABLE`. Engines receive the latest values at runtime, ensuring downstream tasks use current configurations regardless of when they were originally set.

### Where does the ContextService store the actual UDF JAR files?

The binary artifacts are stored in Linkis BML (BigData Material Library), not directly in the ContextService. The CS entry (`ContextValue`) contains a resourceId pointer to the BML location. This separation allows the ContextService to remain lightweight while BML handles large binary storage and distribution, as implemented in the `UDFServiceImpl` workflow.