Multi-Level Label Architectures for Task Routing in Apache Linkis: Load Balancing and Traffic Control
Apache Linkis employs a hierarchical six-level label system that combines user identity, engine specifications, and traffic-control metadata to route tasks intelligently, enabling precise load balancing and resource isolation across distributed compute clusters.
Apache Linkis uses a multi-level label architecture to turn job metadata into a structured routing language. Contextual labels are attached to every request through the GatewayContext or MarkReq, and the gateway router and engine connection manager use them to decide on service discovery, engine allocation, and resource quotas. By composing labels ranging from UserCreatorLabel to LoadBalanceLabel, Linkis controls traffic at fine granularity and distributes load across heterogeneous computational workloads.
The Six-Level Label Hierarchy
Linkis organizes routing metadata into six distinct levels, each represented by specific Java classes in the org.apache.linkis.manager.label.entity package. These labels collectively form a composite routing key stored in the linkis_cg_manager_label database table.
Level 1: User and Creator Identification
The UserCreatorLabel captures the submitting user and the creator service that generated the request. Located at org.apache.linkis.manager.label.entity.engine.UserCreatorLabel, this label ensures tasks are associated with specific principals and enables user-based resource quotas. It serves as the foundation for the routing key, distinguishing requests from different sources even when targeting identical engine types.
Level 2: Engine Type Specification
The EngineTypeLabel determines which engine implementation executes the task, specifying both the engine type and version (e.g., Spark-2.4.3, Hive-3.1.3, or Trino). Found in org.apache.linkis.manager.label.entity.engine.EngineTypeLabel, this label directs requests to compatible engine instances and prevents version mismatches during task dispatch.
Level 3: Combined Route Label
Linkis synthesizes the user and engine levels into a combined_userCreator_engineType label stored in the linkis_cg_manager_label table. This route label acts as the primary lookup key for service discovery, enabling the gateway to locate the correct service instance among potentially hundreds of nodes. The RouteLabel interface defines this combined identifier used by DefaultLabelGatewayRouter during instance selection.
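As a rough illustration of how the two lower levels compose into one route key, the sketch below models the labels as plain case classes. These types, the "-" join inside each part, and the "," separator between parts are assumptions made for the example; they are not the Linkis label classes or a confirmed value format.

```scala
// Hypothetical, simplified stand-ins for the user/creator and engine-type labels.
final case class UserCreator(user: String, creator: String) {
  // Assumed rendering: "<user>-<creator>"
  def stringValue: String = s"$user-$creator"
}

final case class EngineType(engineType: String, version: String) {
  // Assumed rendering: "<engineType>-<version>"
  def stringValue: String = s"$engineType-$version"
}

object CombinedRouteKey {
  // Label key named in the text above.
  val LabelKey: String = "combined_userCreator_engineType"

  // Assumed separator between the two parts of the combined value.
  val Separator: String = ","

  /** Joins both parts into one lookup value for the combined label. */
  def valueOf(uc: UserCreator, et: EngineType): String =
    uc.stringValue + Separator + et.stringValue
}
```

Under these assumptions, two users requesting the same engine type produce different values, which is what lets a lookup tell them apart.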
Level 4: Load Balancing Configuration
The LoadBalanceLabel controls parallel instance creation through its capacity and groupId fields. Defined in org.apache.linkis.manager.label.entity.entrance.LoadBalanceLabel, this label specifies the maximum number of concurrent engine instances permitted for a particular label combination. The LoadBalanceLabelEngineConnManager interprets this capacity to determine how many marks (engine slots) to create and cache for incoming requests.
Level 5: Job-Group Binding
BindEngineLabel ensures stateful pipeline execution by binding all tasks in a job group to the same engine instance. Located in org.apache.linkis.manager.label.entity.entrance.BindEngineLabel, this label uses isJobGroupHead, isJobGroupEnd, and jobGroupId fields to cache the selected engine mark for the entire job duration. This prevents race conditions in multi-stage workflows where subsequent tasks must reuse the same computational context.
Level 6: Runtime and Isolation Controls
Fine-grained execution parameters are managed through labels like EngineConnRuntimeModeLabel and ReuseExclusionLabel. These control engine reuse policies, YARN deployment modes (cluster vs. local), and isolation boundaries. When ReuseExclusionLabel is present, the system prevents oversubscription by excluding already-allocated engines from the candidate pool during load balancing decisions.
The Task Routing Flow
The routing process transforms HTTP requests into engine allocations through a multi-stage pipeline involving the gateway, metadata database, and orchestrator.
Gateway Label Parsing
When the gateway receives an HTTP request, it constructs a GatewayContext containing the label list via gatewayContext.getGatewayRoute.getLabels. The DefaultLabelGatewayRouter (located at linkis-spring-cloud-services/linkis-service-gateway/linkis-gateway-server-support/src/main/scala/org/apache/linkis/gateway/ujes/route/DefaultLabelGatewayRouter.scala) parses these labels through a chain of RouteLabelParser implementations:
// DefaultLabelGatewayRouter.selectInstance (excerpt)
val routeLabels = parseToRouteLabels(gatewayContext) // ← uses parsers
Service Instance Selection
The router queries the metadata database for service instances carrying the matching combined_userCreator_engineType label:
SELECT * FROM linkis_cg_manager_label
WHERE label_key='combined_userCreator_engineType';
After retrieving eligible instances, the router performs a roulette (random) selection among healthy nodes, which spreads traffic roughly evenly across the cluster over many requests.
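The selection step can be sketched as a uniform random pick over the healthy candidates. The ServiceInstance type, its healthy flag, and the RouletteSelector object are hypothetical names for this example; the actual router may weight or filter instances differently.

```scala
import scala.util.Random

// Hypothetical model of a service instance returned by the label lookup.
final case class ServiceInstance(name: String, address: String, healthy: Boolean)

object RouletteSelector {
  /** Picks one healthy instance uniformly at random, or None if no candidate is healthy. */
  def select(
      candidates: Seq[ServiceInstance],
      rng: Random = new Random()
  ): Option[ServiceInstance] = {
    val healthy = candidates.filter(_.healthy)
    if (healthy.isEmpty) None
    else Some(healthy(rng.nextInt(healthy.size)))
  }
}
```

Returning an Option makes the "no healthy instance" case explicit, so the caller decides whether to fail the request or fall back to another route.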
Engine Mark Allocation
The LoadBalanceLabelEngineConnManager (found in linkis-orchestrator/linkis-computation-orchestrator/src/main/scala/org/apache/linkis/orchestrator/ecm/LoadBalanceLabelEngineConnManager.scala) processes the MarkReq to allocate engine connections. It extracts the LoadBalanceLabel from the request's label map:
val loadBalanceLabel = MarkReq.getLabelBuilderFactory.createLabel[LoadBalanceLabel](
  LabelKeyConstant.LOAD_BALANCE_KEY,
  markReq.getLabels.get(LabelKeyConstant.LOAD_BALANCE_KEY)
)
Load Balancing Mechanisms
Linkis distributes load through three complementary mechanisms that operate at different stages of the task lifecycle.
Capacity-Based Mark Creation
The LoadBalanceLabel.getCapacity() value determines the maximum number of concurrent engine instances created for a specific label combination. When applyMark is invoked, the LoadBalanceLabelEngineConnManager creates and caches up to N marks based on this capacity, effectively throttling parallel execution for that workload type. This prevents any single user or engine type from monopolizing cluster resources.
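A minimal sketch of capacity-bounded mark creation follows, assuming marks are grouped by a group identifier and created lazily up to the capacity. Mark and CapacityMarkPool are hypothetical types for illustration, not the classes used by LoadBalanceLabelEngineConnManager.

```scala
import scala.collection.mutable

// Hypothetical engine slot; the real mark type carries more state.
final case class Mark(id: Int, groupId: String)

class CapacityMarkPool {
  private val marksByGroup = mutable.Map.empty[String, mutable.ArrayBuffer[Mark]]
  private var nextId = 0

  /** Returns the cached marks for the group, creating new ones until `capacity` is reached. */
  def ensureMarks(groupId: String, capacity: Int): Seq[Mark] = synchronized {
    val marks = marksByGroup.getOrElseUpdate(groupId, mutable.ArrayBuffer.empty[Mark])
    while (marks.size < capacity) {
      marks += Mark(nextId, groupId)
      nextId += 1
    }
    marks.toList // immutable snapshot for the caller
  }
}
```

In this sketch the pool only grows: a later call with a smaller capacity returns the marks that already exist rather than discarding any.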
Job-Group Affinity
When BindEngineLabel.isJobGroupHead is true, the system selects a random available mark and caches it for the entire job group. Subsequent tasks with the same jobGroupId retrieve this cached mark rather than requesting a new engine, ensuring pipeline consistency. The mark is removed from cache when isJobGroupEnd signals the final task completion, releasing the engine for reuse.
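The head/end protocol can be sketched as a small cache keyed by job group. BindInfo and JobGroupBinder are hypothetical names, and the sketch assumes a non-head task whose group has no cached mark simply gets nothing; the real manager may handle that case differently.

```scala
import scala.collection.mutable
import scala.util.Random

// Hypothetical view of the three fields carried by the bind label.
final case class BindInfo(jobGroupId: String, isHead: Boolean, isEnd: Boolean)

class JobGroupBinder[M](rng: Random = new Random()) {
  private val bound = mutable.Map.empty[String, M]

  /** Head tasks pick and cache a random mark; later tasks reuse it; end tasks evict it. */
  def markFor(bind: BindInfo, available: Seq[M]): Option[M] = synchronized {
    val chosen: Option[M] =
      if (bind.isHead) {
        val pick =
          if (available.isEmpty) None
          else Some(available(rng.nextInt(available.size)))
        pick.foreach(m => bound.update(bind.jobGroupId, m))
        pick
      } else {
        bound.get(bind.jobGroupId)
      }
    // The final task still runs on the bound mark, then the binding is released.
    if (bind.isEnd) bound.remove(bind.jobGroupId)
    chosen
  }
}
```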
Reuse Exclusion Policies
The ReuseExclusionLabel prevents oversubscription by marking engines already allocated to other active marks. During the applyMark phase, the engine connection manager filters out excluded instances, ensuring load balancing distributes work across truly available resources rather than overloading busy engines.
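The filtering idea can be shown with plain collections. In this sketch, engines and marks are identified by strings and the engineByMark map is a hypothetical record of which engine each active mark currently holds; none of these names come from Linkis.

```scala
object ReuseExclusion {
  /**
   * Returns the engines the requesting mark may reuse: every engine
   * except those currently held by a different active mark.
   */
  def reusable(
      allEngines: Seq[String],
      engineByMark: Map[String, String],
      requestingMark: String
  ): Seq[String] = {
    val heldByOthers: Set[String] =
      engineByMark.collect { case (mark, engine) if mark != requestingMark => engine }.toSet
    allEngines.filterNot(heldByOthers.contains)
  }
}
```

A mark keeps access to the engine it already holds, so only engines owned by other marks leave the candidate pool.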
Traffic Control Implementation
Beyond load distribution, Linkis enforces hard limits on concurrent execution through the UserRunningNumber tracker located in linkis-orchestrator/linkis-orchestrator-core/src/main/scala/org/apache/linkis/orchestrator/execution/impl/UserRunningNumber.scala.
Per-User and Per-Engine Quotas
The system constructs a unique key combining UserCreatorLabel and EngineTypeLabel values:
val key = userCreatorLabel.getStringValue + SPLIT + engineTypeLabel.getStringValue
The counter for this key is incremented for every submitted task and decremented on completion. When the counter reaches the configured threshold for that key, subsequent requests are queued or rejected, preventing individual users or engine types from exhausting cluster capacity. This mechanism operates independently of the load-balancing capacity controls and acts as a second safeguard against resource starvation.
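A keyed counter with a hard limit can be sketched as follows. RunningCounter, its limit parameter, and the "_" separator are assumptions for illustration; the sketch rejects when the limit is reached, whereas the real tracker may queue instead.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

class RunningCounter(limit: Int, split: String = "_") {
  private val counters = new ConcurrentHashMap[String, AtomicInteger]()

  /** Builds the quota key from the two label string values. */
  def key(userCreator: String, engineType: String): String =
    userCreator + split + engineType

  private def counter(k: String): AtomicInteger = {
    val fresh = new AtomicInteger(0)
    val existing = counters.putIfAbsent(k, fresh)
    if (existing == null) fresh else existing
  }

  /** Increments the counter for `k` unless the limit is already reached. */
  def tryAcquire(k: String): Boolean = {
    val c = counter(k)
    var acquired = false
    var done = false
    while (!done) {
      val current = c.get()
      if (current >= limit) done = true
      else if (c.compareAndSet(current, current + 1)) {
        acquired = true
        done = true
      }
    }
    acquired
  }

  /** Decrements the counter for `k`, never going below zero; returns the new value. */
  def release(k: String): Int = {
    val c = counter(k)
    var result = -1
    while (result < 0) {
      val current = c.get()
      if (current == 0) result = 0
      else if (c.compareAndSet(current, current - 1)) result = current - 1
    }
    result
  }
}
```

The compare-and-set loops keep the check and the update atomic, so two concurrent submissions cannot both take the last free slot.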
Practical Implementation: Submitting a Load-Balanced Spark Job
The following Scala example demonstrates how to construct a MarkReq with multi-level labels for a Spark-IDE workload with capacity limits and job-group binding:
import org.apache.linkis.manager.label.builder.factory.LabelBuilderFactoryContext
import org.apache.linkis.manager.label.constant.LabelKeyConstant
import org.apache.linkis.manager.label.entity.entrance.{LoadBalanceLabel, BindEngineLabel}
import org.apache.linkis.manager.label.entity.engine.{UserCreatorLabel, EngineTypeLabel}
import org.apache.linkis.orchestrator.ecm.LoadBalanceLabelEngineConnManager
import org.apache.linkis.orchestrator.ecm.entity.MarkReq

// 1. Build the required labels via the factory.
// Setters are called as separate statements so the snippet does not rely on them returning the label.
val labelFactory = LabelBuilderFactoryContext.getLabelBuilderFactory

val userCreator = labelFactory.createLabel(classOf[UserCreatorLabel])
userCreator.setUser("alice")
userCreator.setCreator("linkis")

val engineType = labelFactory.createLabel(classOf[EngineTypeLabel])
engineType.setEngineType("spark")
engineType.setVersion("2.4.3")

val loadBalance = labelFactory.createLabel(classOf[LoadBalanceLabel])
loadBalance.setCapacity("5") // Maximum 5 concurrent Spark-IDE engines
loadBalance.setGroupId("spark-ide-group")

val bindEngine = labelFactory.createLabel(classOf[BindEngineLabel])
bindEngine.setJobGroupId("group-001")
bindEngine.setIsJobGroupHead(true) // First task of the job group

// 2. Assemble the MarkReq with all labels
val markReq = new MarkReq()
markReq.getLabels.put(LabelKeyConstant.USERCREATOR_KEY, userCreator.getValue)
markReq.getLabels.put(LabelKeyConstant.ENGINETYPE_KEY, engineType.getValue)
markReq.getLabels.put(LabelKeyConstant.LOAD_BALANCE_KEY, loadBalance.getValue)
markReq.getLabels.put(LabelKeyConstant.BIND_ENGINE_KEY, bindEngine.getValue)

// 3. Apply a mark: this triggers load balancing and engine allocation
val engineConnManager = new LoadBalanceLabelEngineConnManager()
val mark = engineConnManager.applyMark(markReq) // Load balancing occurs here
This configuration ensures that up to five Spark-IDE engines can run concurrently for the specified user, while all tasks belonging to group-001 share the same engine instance for stateful execution.
Summary
Apache Linkis implements multi-level label architectures for task routing through a composable hierarchy of six label types. Key technical implementations include:
- Hierarchical composition: User and engine labels combine into combined_userCreator_engineType route labels for service discovery in linkis_cg_manager_label
- Gateway routing: DefaultLabelGatewayRouter parses labels and performs roulette selection across matching service instances
- Capacity management: LoadBalanceLabelEngineConnManager respects LoadBalanceLabel.capacity to limit concurrent engine instances
- Stateful pipelines: BindEngineLabel caches engine marks for job groups, ensuring consistent execution context across multi-stage workflows
- Quota enforcement: UserRunningNumber tracks concurrent tasks keyed by UserCreatorLabel and EngineTypeLabel combinations to prevent resource exhaustion
Frequently Asked Questions
What is the primary purpose of the combined_userCreator_engineType label?
The combined_userCreator_engineType label serves as the fundamental routing key that merges user identity with engine specifications. Stored in the linkis_cg_manager_label database table, this composite label enables the DefaultLabelGatewayRouter to map incoming requests to specific service instances capable of executing the workload. Without this combined label, the system could not distinguish between different users requesting the same engine type or route traffic appropriately across the cluster.
How does Linkis prevent a single user from overwhelming the cluster?
Linkis applies two layers of protection: the UserRunningNumber tracker and the LoadBalanceLabel capacity controls. The UserRunningNumber class maintains atomic counters keyed by the concatenation of UserCreatorLabel and EngineTypeLabel values, and new submissions are queued or rejected once the limit for a key is reached. In parallel, the LoadBalanceLabelEngineConnManager enforces a maximum engine instance count per label combination, so even legitimate requests cannot create engines beyond the configured capacity.
What is the difference between LoadBalanceLabel and BindEngineLabel?
LoadBalanceLabel controls horizontal scaling by specifying how many concurrent engine instances (capacity) may exist for a particular workload type, enabling parallel execution across multiple nodes. In contrast, BindEngineLabel controls vertical consistency by forcing all tasks within a specific jobGroupId to reuse the same engine instance. While LoadBalanceLabel promotes distribution and throughput, BindEngineLabel ensures stateful operations and data locality by preventing engine switching mid-pipeline.
Where does the actual load balancing decision occur in the Linkis codebase?
The core load balancing logic resides in LoadBalanceLabelEngineConnManager.scala within the linkis-computation-orchestrator module. When applyMark is called, this class reads the LoadBalanceLabel from the MarkReq, checks existing mark caches, and either reuses available engine connections or creates new ones up to the specified capacity. For gateway-level distribution, DefaultLabelGatewayRouter performs initial roulette selection among healthy service instances matching the route label before the orchestrator applies finer-grained engine-level balancing.