# How the InstanceLabel Service Enables Service Discovery in Linkis and InsLabelCacheConfiguration Options Explained

> Discover how Linkis' InstanceLabel service uses label-driven registration and caching for efficient service discovery. Explore InsLabelCacheConfiguration options.

- Repository: [The Apache Software Foundation/linkis](https://github.com/apache/linkis)
- Tags: internals
- Published: 2026-02-24

---

**The InstanceLabel service in Apache Linkis provides a database-backed, label-driven service discovery mechanism where components register themselves with key-value metadata labels, enabling dynamic lookup of service instances while caching hot data in configurable Guava caches to reduce database round-trips.**

The **InstanceLabel service** is the backbone of service discovery in Apache Linkis: engine executors, context servers, and public-service components advertise their capabilities through arbitrary key-value labels. The labels are persisted to relational tables, and an in-memory cache governed by `InsLabelCacheConfiguration` answers repeated lookups, so callers can locate services quickly without sending every query to the database.

## Architecture of the InstanceLabel Service

The service operates as a lightweight RPC server that manages the lifecycle of **ServiceInstance** metadata. Components interact with it through `InstanceLabelClient` to register labels, while discovery callers query the service to resolve instances matching specific criteria.

### Label Registration and Refresh Workflow

When a component such as the Context Service (CS) server starts, it constructs a `Map<String,Object>` of labels describing its role and invokes `InstanceLabelClient.refreshLabelsToInstance`. This RPC reaches `DefaultInsLabelService.refreshLabelsToInstance` in [`linkis-public-enhancements/linkis-instance-label-server/src/main/java/org/apache/linkis/instance/label/service/impl/DefaultInsLabelService.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-instance-label-server/src/main/java/org/apache/linkis/instance/label/service/impl/DefaultInsLabelService.java).

The implementation follows a transactional pattern:

1. **Remove existing relations** for the instance via `removeLabelsFromInstance`
2. **Convert labels** to `InsPersistenceLabel` objects using `toInsPersistenceLabels`
3. **Persist labels** to the `instance_label` table and instances to `instance_info` via `doInsertInsLabels` and `doInsertInstance`
4. **Create relations** in the `ins_label_relation` many-to-many join table using `insLabelRelationDao.insertRelations`

Batch operations respect the `InsLabelConf.DB_PERSIST_BATCH_SIZE` limit, which defaults to **100** rows per batch insert.
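
The four steps above can be sketched with plain collections. This is a minimal illustration under stated assumptions, not the Linkis implementation: the class `LabelRefreshSketch`, its maps, and the `partition` helper are hypothetical stand-ins for the DAO layer and the three tables, and labels are flattened to `key=value` strings for simplicity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for the three tables; this is not the Linkis DAO layer. */
final class LabelRefreshSketch {
    static final int BATCH_SIZE = 100; // mirrors the documented default

    final Map<String, Integer> labelTable = new HashMap<>();     // "key=value" -> label id
    final Set<String> instanceTable = new HashSet<>();           // known instances
    final Map<String, Set<Integer>> relations = new HashMap<>(); // instance -> label ids
    private int nextLabelId = 1;

    /** Splits a list into consecutive chunks of at most {@code size} elements. */
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += size) {
            int end = Math.min(start + size, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    void refreshLabelsToInstance(Map<String, Object> labels, String instance) {
        // Step 1: drop whatever relations the instance had before.
        relations.remove(instance);

        // Step 2: convert the raw map into persistable "key=value" strings.
        List<String> converted = new ArrayList<>();
        labels.forEach((key, value) -> converted.add(key + "=" + value));

        // Step 3: persist labels batch by batch, then the instance record.
        Set<Integer> labelIds = new HashSet<>();
        for (List<String> batch : partition(converted, BATCH_SIZE)) {
            for (String label : batch) {
                labelIds.add(labelTable.computeIfAbsent(label, k -> nextLabelId++));
            }
        }
        instanceTable.add(instance);

        // Step 4: record the new label-to-instance relations.
        relations.put(instance, labelIds);
    }

    public static void main(String[] args) {
        LabelRefreshSketch store = new LabelRefreshSketch();
        Map<String, Object> labels = new HashMap<>();
        labels.put("route", "cs_1_dev");
        store.refreshLabelsToInstance(labels, "localhost:9108");
        System.out.println(store.relations);
    }
}
```

Note that in this sketch a refreshed instance leaves its old label row behind in `labelTable`; rows like that are the orphans a separate cleanup pass would need to remove.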

### Database Schema and Persistence Model

The persistence layer uses three core tables defined in [`linkis-dist/package/db/module/linkis_instance_label.sql`](https://github.com/apache/linkis/blob/main/linkis-dist/package/db/module/linkis_instance_label.sql):

- **`instance_label`**: Stores label definitions (key, value, string value)
- **`instance_info`**: Stores service instance records (application name, instance identifier)
- **`ins_label_relation`**: Maps label IDs to instance IDs for many-to-many relationships

The `DefaultInsLabelService` uses asynchronous cleanup to manage orphan labels. An internal consumer queue configured via `InsLabelConf` periodically removes stale entries using a batch strategy.
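
A bounded queue drained in fixed-size batches is one common way to build this kind of cleanup. The sketch below assumes that shape; `OrphanLabelQueueSketch` and its methods are hypothetical names, and the "delete" is simulated by collecting ids in a list rather than issuing SQL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical orphan-label cleanup queue; names are illustrative only. */
final class OrphanLabelQueueSketch {
    private final BlockingQueue<Integer> queue;
    private final int batchSize;
    final List<Integer> removedLabelIds = new ArrayList<>();

    OrphanLabelQueueSketch(int capacity, int batchSize) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.batchSize = batchSize;
    }

    /** Non-blocking enqueue; returns false when the queue is already full. */
    boolean submit(int labelId) {
        return queue.offer(labelId);
    }

    /** One consumption cycle: drain up to batchSize ids and "delete" them. */
    int consumeOnce() {
        List<Integer> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        removedLabelIds.addAll(batch); // a real service would run a batched DELETE here
        return batch.size();
    }

    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        // Capacity 1000 and batch size 100 mirror the documented defaults.
        OrphanLabelQueueSketch cleanup = new OrphanLabelQueueSketch(1000, 100);
        for (int id = 1; id <= 250; id++) {
            cleanup.submit(id);
        }
        // A scheduler could call consumeOnce() every 10 seconds;
        // here the cycles are invoked directly.
        System.out.println(cleanup.consumeOnce()); // 100
        System.out.println(cleanup.consumeOnce()); // 100
        System.out.println(cleanup.consumeOnce()); // 50
    }
}
```

Using `offer` instead of `put` means a full queue rejects new work rather than blocking the caller, which keeps registration latency independent of cleanup progress. Whether Linkis makes the same trade-off is not stated here.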

### Service Discovery Query Mechanism

Discovery callers invoke `DefaultInsLabelService.searchInstancesByLabels`, which converts query labels into `InsPersistenceLabel` objects and performs a relational lookup against `ins_label_relation`. The method returns a `List<ServiceInstance>` containing all matching instances, enabling load balancing and routing decisions based on label criteria such as `engineType=spark` or route keys.
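
The lookup can be pictured as an intersection over the relation data. The following sketch is an assumption-laden illustration: `LabelSearchSketch` is a hypothetical class, instances are plain strings, and it assumes an instance must carry every queried label to match (the matching semantics used by Linkis may differ).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical in-memory view of the label-to-instance relation. */
final class LabelSearchSketch {
    // "key=value" -> instances carrying that label
    private final Map<String, Set<String>> instancesByLabel = new HashMap<>();

    void relate(String instance, String labelKey, String labelValue) {
        instancesByLabel
            .computeIfAbsent(labelKey + "=" + labelValue, k -> new TreeSet<>())
            .add(instance);
    }

    /** Returns, in sorted order, the instances that carry every queried label. */
    List<String> searchInstancesByLabels(Map<String, String> query) {
        Set<String> result = null;
        for (Map.Entry<String, String> label : query.entrySet()) {
            Set<String> matches = instancesByLabel.getOrDefault(
                label.getKey() + "=" + label.getValue(), Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(matches); // first label seeds the candidate set
            } else {
                result.retainAll(matches);       // later labels narrow it down
            }
        }
        if (result == null) {
            return new ArrayList<>();            // empty query matches nothing
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        LabelSearchSketch index = new LabelSearchSketch();
        index.relate("host-a:9101", "engineType", "spark");
        index.relate("host-b:9101", "engineType", "spark");
        index.relate("host-c:9101", "engineType", "hive");
        Map<String, String> query = new HashMap<>();
        query.put("engineType", "spark");
        System.out.println(index.searchInstancesByLabels(query)); // [host-a:9101, host-b:9101]
    }
}
```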

## InsLabelCacheConfiguration Options

The `InsLabelConf` class centralizes all tunable parameters for the InstanceLabel service, controlling both database persistence behavior and in-memory caching characteristics. These configurations allow administrators to balance data freshness against system throughput.

### Persistence and Async Processing Settings

| Configuration Key | Default | Description |
|-------------------|---------|-------------|
| `wds.linkis.instance.label.persist.batch.size` | **100** | Maximum rows per batch insert when persisting labels and instances to the database. |
| `wds.linkis.instance.label.async.queue.capacity` | **1000** | Capacity of the internal queue that buffers orphan label cleanup tasks. |
| `wds.linkis.instance.label.async.queue.batch.size` | **100** | Number of labels processed per asynchronous cleanup batch. |
| `wds.linkis.instance.label.async.queue.interval-in-seconds` | **10** | Interval between asynchronous queue consumption cycles. |

### Cache Tuning Parameters

To minimize database load during high-frequency discovery operations, the service maintains Guava-backed caches in three namespaces: **instance**, **label**, and **appInstance**.

| Configuration Key | Default | Description |
|-------------------|---------|-------------|
| `wds.linkis.instance.label.cache.expire.time-in-seconds` | **10** | Time-to-live for cached entries before automatic eviction. |
| `wds.linkis.instance.label.cache.maximum.size` | **1000** | Maximum number of entries retained across all cache namespaces. |
| `wds.linkis.instance.label.cache.names` | **instance,label,appInstance** | Comma-separated list of cache namespaces managed by the service. |
| `linkis.discovery.server-address` | **http://localhost:20303** | URL of the Linkis service registry used to resolve remote instance addresses. |

These properties are read at service startup and injected into the cache builder within `DefaultInsLabelService`.
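
To make the two main knobs concrete, here is a small stdlib-only cache that combines a time-to-live with a maximum size. It is a simplified stand-in written for illustration, not Guava and not the Linkis cache code; `TtlCacheSketch`, its injectable clock, and the `loads` counter are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Simplified TTL + max-size cache; a stand-in for illustration, not Guava. */
final class TtlCacheSketch<K, V> {
    private static final class Timed<T> {
        final T value;
        final long writtenAt;
        Timed(T value, long writtenAt) {
            this.value = value;
            this.writtenAt = writtenAt;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock;
    private final Map<K, Timed<V>> entries;
    int loads = 0; // how many times the loader (the "database") was consulted

    TtlCacheSketch(final int maximumSize, long ttlSeconds, LongSupplier clockMillis) {
        this.ttlMillis = ttlSeconds * 1000L;
        this.clock = clockMillis;
        // Insertion-ordered map that drops its oldest entry once maximumSize is exceeded.
        this.entries = new LinkedHashMap<K, Timed<V>>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maximumSize;
            }
        };
    }

    /** Returns the cached value if still fresh, otherwise reloads and stores it. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Timed<V> cached = entries.get(key);
        if (cached != null && now - cached.writtenAt < ttlMillis) {
            return cached.value;
        }
        V fresh = loader.apply(key);
        loads++;
        entries.remove(key); // re-insert so the refreshed entry counts as newest
        entries.put(key, new Timed<>(fresh, now));
        return fresh;
    }

    public static void main(String[] args) {
        // 1000 entries and a 10-second TTL mirror the documented defaults.
        TtlCacheSketch<String, String> cache =
            new TtlCacheSketch<>(1000, 10, System::currentTimeMillis);
        cache.get("engineType=spark", label -> "instances for " + label);
        cache.get("engineType=spark", label -> "instances for " + label);
        System.out.println(cache.loads); // 1, the second call is served from memory
    }
}
```

In this stand-in, a maximum size of 0 or a TTL of 0 makes every call reload, which is a convenient way to reason about the trade-off between freshness and database load.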

## Practical Implementation Examples

### Registering Labels from a Component

Components like the CS server register labels during initialization to advertise their availability:

```java
// Inside CSInstanceLabelClient.init(...)
Map<String, Object> labels = new HashMap<>(1);
labels.put(LabelKeyConstant.ROUTE_KEY, "cs_1_" + ContextServerConf.CS_LABEL_SUFFIX);

InsLabelRefreshRequest request = new InsLabelRefreshRequest();
request.setLabels(labels);
request.setServiceInstance(Sender.getThisServiceInstance());

// RPC call to InstanceLabel Server
InstanceLabelClient.getInstance().refreshLabelsToInstance(request);
```

This triggers `DefaultInsLabelService.refreshLabelsToInstance`, which atomically replaces existing labels and persists the new set to the database.

### Discovering Services by Label

Client applications query for specific service types using label-based filters:

```java
// Create a label filter
Label<String> engineLabel = new EngineInstanceLabel();
engineLabel.setLabelKey("engineType");
engineLabel.setStringValue("spark");

// Query matching instances
List<ServiceInstance> engines =
    InstanceLabelClient.getInstance().searchInstancesByLabels(
        Collections.singletonList(engineLabel));

// Process discovered Spark engine instances
engines.forEach(instance ->
    System.out.println("Found engine: " + instance.getInstance()));
```

Under the hood, `DefaultInsLabelService.searchInstancesByLabels` queries the `ins_label_relation` table and returns hydrated `ServiceInstance` objects.

### Configuring Cache Behavior

Override default cache settings in [`application.conf`](https://github.com/apache/linkis/blob/main/application.conf) or `linkis-instance-label.properties`:

```properties
# Extend cache lifetime for stable environments
wds.linkis.instance.label.cache.expire.time-in-seconds = 30

# Increase capacity for high-scale deployments
wds.linkis.instance.label.cache.maximum.size = 5000

# Maintain default namespaces
wds.linkis.instance.label.cache.names = instance,label,appInstance
```

These settings directly configure the Guava cache instances used by the service to store hot lookup data.
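
As a rough illustration of how overrides combine with the documented defaults, the sketch below parses a properties snippet with `java.util.Properties` and falls back to the default for any key that is absent. `CacheSettingsSketch` is a hypothetical helper; Linkis reads its configuration through its own mechanism, which is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical holder for the three cache settings and their documented defaults. */
final class CacheSettingsSketch {
    static final String PREFIX = "wds.linkis.instance.label.cache.";

    final int expireSeconds;
    final int maximumSize;
    final List<String> cacheNames;

    private CacheSettingsSketch(int expireSeconds, int maximumSize, List<String> cacheNames) {
        this.expireSeconds = expireSeconds;
        this.maximumSize = maximumSize;
        this.cacheNames = cacheNames;
    }

    /** Parses properties text, using the documented default for any missing key. */
    static CacheSettingsSketch parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int expire = Integer.parseInt(
            props.getProperty(PREFIX + "expire.time-in-seconds", "10").trim());
        int max = Integer.parseInt(
            props.getProperty(PREFIX + "maximum.size", "1000").trim());
        String names =
            props.getProperty(PREFIX + "names", "instance,label,appInstance").trim();
        return new CacheSettingsSketch(expire, max, Arrays.asList(names.split("\\s*,\\s*")));
    }

    public static void main(String[] args) {
        CacheSettingsSketch settings = parse(
            "# Extend cache lifetime for stable environments\n"
            + "wds.linkis.instance.label.cache.expire.time-in-seconds = 30\n");
        // Only the expiry is overridden; size and names keep their defaults.
        System.out.println(settings.expireSeconds + " " + settings.maximumSize
            + " " + settings.cacheNames); // 30 1000 [instance, label, appInstance]
    }
}
```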

## Summary

- **DefaultInsLabelService** provides the core implementation for label attachment, persistence, and discovery queries in `linkis-instance-label-server`.
- The service uses three relational tables (`instance_label`, `instance_info`, `ins_label_relation`) to maintain many-to-many relationships between labels and service instances.
- **InsLabelConf** exposes configuration keys for batch persistence sizes, asynchronous cleanup queues, and Guava cache parameters including expiration time and maximum size.
- Client components interact via `InstanceLabelClient` to register labels during startup and query instances during runtime.
- Default cache expiration is **10 seconds** with a **1000-entry** maximum, suitable for dynamic cloud environments but tunable for stable production clusters.

## Frequently Asked Questions

### How does the InstanceLabel service handle concurrent label updates from multiple instances?

The `DefaultInsLabelService.refreshLabelsToInstance` implementation atomically removes existing relations before inserting new ones within a transactional boundary. While individual instance updates are isolated, the service does not implement global locking across different instances, relying on the database's transactional consistency to maintain relation integrity.

### What happens when the cache expires during an active service discovery query?

Cache expiration in the InstanceLabel service uses Guava's time-based eviction, which is applied during cache access or maintenance rather than by a dedicated background thread. If a lookup finds that its entry has expired, the service transparently falls back to the database via `searchInstancesByLabels` and repopulates the cache with fresh data. With the default **10-second** TTL, a change in service state becomes visible to callers within roughly 10 seconds.

### Can I disable the in-memory caching entirely for debugging purposes?

While `InsLabelConf` does not provide an explicit "disable cache" flag, setting `wds.linkis.instance.label.cache.maximum.size` to **0** or `wds.linkis.instance.label.cache.expire.time-in-seconds` to **0** effectively prevents caching, forcing every `searchInstancesByLabels` call to hit the database. Note that this significantly impacts performance under load.

### Where are the database table schemas defined for the InstanceLabel service?

The DDL for `instance_label`, `instance_info`, and `ins_label_relation` tables is located in [`linkis-dist/package/db/module/linkis_instance_label.sql`](https://github.com/apache/linkis/blob/main/linkis-dist/package/db/module/linkis_instance_label.sql) within the Linkis repository. These tables store the persistent state of all registered labels and instance relationships.