Linkis CommonLockService Architecture: Distributed Locking for High-Concurrency Scenarios

The Linkis CommonLockService provides a database-backed distributed locking mechanism that uses MyBatis DAO with unique key constraints to coordinate exclusive operations across JVMs, featuring built-in retry logic and re-entrant lock support for high-concurrency environments.

Linkis CommonLockService is a lightweight, centralized locking solution shared across all Linkis components including the Application Manager (AM), EngineConn, and Scheduler. Implemented in the linkis-ps-common-lock module, this service enables distributed coordination by persisting lock state to a relational database, making it accessible across machine boundaries and data centers.

Core Architecture Components

The architecture consists of four components: an entity, a MyBatis mapper, a service interface, and a default implementation. Together they separate data storage, persistence logic, and the public API.

CommonLock Entity

The CommonLock class in CommonLock.java serves as the POJO representing a lock record. It encapsulates the lockObject (unique identifier), locker (host name of the acquiring instance), timestamps, and audit fields for creator/updater tracking. This entity maps directly to the t_common_lock table, which enforces uniqueness on the lock_object column (the lockObject property) to prevent duplicate acquisitions.

CommonLockMapper DAO

The CommonLockMapper interface provides the MyBatis data access layer for atomic database operations. Located in CommonLockMapper.java, it executes SQL statements using INSERT … ON DUPLICATE KEY UPDATE semantics (or database equivalent) to achieve atomic lock acquisition. This mapper handles the low-level interaction with the t_common_lock table, translating Java method calls into parameterized SQL that respects database transaction boundaries.

CommonLockService Interface

The CommonLockService interface defines the high-level contract for all locking operations. Exposed in CommonLockService.java, it declares the essential methods: lock(CommonLock lock, long timeout), reentrantLock(CommonLock lock, long timeout), unlock(CommonLock lock), and getAll(). Components depend only on this interface, enabling dependency injection and easy mocking during unit testing without requiring a live database connection.
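Because callers depend only on the interface, a unit test can swap in an in-memory fake. The sketch below is a minimal illustration, not the Linkis source: LockRecord, LockService, and InMemoryLockService are hypothetical, simplified stand-ins for CommonLock and CommonLockService, and the fake never waits, so the timeout argument is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the CommonLock entity.
final class LockRecord {
    final String lockObject;
    final String locker;

    LockRecord(String lockObject, String locker) {
        this.lockObject = lockObject;
        this.locker = locker;
    }
}

// Assumed shape of the contract, mirroring the four methods named above.
interface LockService {
    boolean lock(LockRecord lock, long timeoutMs);
    boolean reentrantLock(LockRecord lock, long timeoutMs);
    void unlock(LockRecord lock);
    List<LockRecord> getAll();
}

// In-memory fake for unit tests: no database and no waiting.
final class InMemoryLockService implements LockService {
    private final Map<String, LockRecord> rows = new ConcurrentHashMap<>();

    @Override
    public boolean lock(LockRecord lock, long timeoutMs) {
        // putIfAbsent plays the role of the unique key: the first writer wins.
        return rows.putIfAbsent(lock.lockObject, lock) == null;
    }

    @Override
    public boolean reentrantLock(LockRecord lock, long timeoutMs) {
        LockRecord held = rows.get(lock.lockObject);
        if (held != null && held.locker.equals(lock.locker)) {
            return true; // the same locker already owns the lock
        }
        return lock(lock, timeoutMs);
    }

    @Override
    public void unlock(LockRecord lock) {
        // Remove the row only when the caller is the recorded owner.
        rows.computeIfPresent(lock.lockObject,
            (key, held) -> held.locker.equals(lock.locker) ? null : held);
    }

    @Override
    public List<LockRecord> getAll() {
        return new ArrayList<>(rows.values());
    }
}
```

A component under test would receive InMemoryLockService wherever it expects the service interface, which keeps lock-dependent logic testable without a database.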

DefaultCommonLockService Implementation

The DefaultCommonLockService class in DefaultCommonLockService.java provides the concrete implementation of the service interface. This class contains the critical retry and re-entrance logic: it attempts to acquire the lock once via the DAO, catches DataAccessException when another process holds the row, and retries with Thread.sleep(1000) until the caller-specified timeout expires. For re-entrance, it first checks if the same host already holds the lock before attempting insertion.

How Distributed Locking Works Internally

Understanding the internal flow helps optimize usage patterns for high-concurrency scenarios where multiple Linkis instances compete for shared resources.

Atomic Acquisition via Unique Constraints

When a component calls lock(), the service delegates to commonLockMapper.lock(), which attempts to insert a row into t_common_lock. The database unique index on lock_object ensures that only one insertion succeeds; subsequent attempts from other hosts trigger a constraint violation. The violation reaches the service as a Spring DataAccessException (MyBatis-Spring translates the underlying SQL error), which the service interprets as "lock already held" rather than a fatal error.
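The "only one insertion succeeds" property can be illustrated without a database. The following is a sketch under stated assumptions: UniqueKeyTable and UniqueKeyRace are hypothetical names, and ConcurrentHashMap.putIfAbsent stands in for the unique index, returning false where the real insert would fail with a duplicate-key error.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory model of a table with a unique key on lock_object.
final class UniqueKeyTable {
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    // Returns true when the "row" was inserted, false on a duplicate key.
    boolean tryInsert(String lockObject, String locker) {
        return rows.putIfAbsent(lockObject, locker) == null;
    }

    String ownerOf(String lockObject) {
        return rows.get(lockObject);
    }
}

final class UniqueKeyRace {
    // Starts `contenders` threads that all insert the same key at once
    // and returns how many of them succeeded.
    static int countWinners(UniqueKeyTable table, String lockObject, int contenders) {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(contenders);
        AtomicInteger winners = new AtomicInteger();
        for (int i = 0; i < contenders; i++) {
            String locker = "host-" + i;
            new Thread(() -> {
                try {
                    start.await(); // line every contender up behind the same gate
                    if (table.tryInsert(lockObject, locker)) {
                        winners.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown(); // release all contenders together
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winners.get();
    }
}
```

However many contenders race, countWinners reports a single winner, which is the guarantee the unique index provides across JVMs.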

Retry Mechanism with Configurable Timeouts

The DefaultCommonLockService implements optimistic polling with a fixed one-second interval. If initial acquisition fails, the service enters a retry loop that sleeps for 1000 milliseconds between attempts. This continues until either the lock becomes available (insertion succeeds) or the total elapsed time exceeds the timeout parameter passed to the method. Passing -1 as the timeout value configures the service to retry indefinitely.
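The polling loop described above can be sketched as follows. This is an illustration rather than the Linkis implementation: LockAttempt and PollingLock are hypothetical names, the single attempt stands in for the mapper call, and the sleep interval is a parameter here only so the example runs quickly (the service described above sleeps 1000 ms). Treating a negative timeout as "retry indefinitely" follows the description in this section.

```java
// Hypothetical stand-in for one acquisition attempt through the DAO:
// returns true when the insert succeeded, false when the row already exists.
interface LockAttempt {
    boolean tryOnce();
}

final class PollingLock {
    // Polls until the attempt succeeds or timeoutMs has elapsed.
    // A negative timeout means "retry indefinitely", as described above.
    static boolean acquire(LockAttempt attempt, long timeoutMs, long intervalMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (attempt.tryOnce()) {
                return true;
            }
            if (timeoutMs >= 0 && System.currentTimeMillis() >= deadline) {
                return false; // time budget exhausted
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // stop waiting when the thread is interrupted
            }
        }
    }
}
```

With a fixed interval, the worst-case number of attempts is roughly the timeout divided by the interval plus one, which is worth keeping in mind when many instances wait on the same lock.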

Re-entrant Lock Support

The reentrantLock() method prevents self-deadlock by checking existing ownership before acquisition. It queries commonLockMapper.getLockByLocker(lockObject, locker) to verify if the current host already holds the lock. If a matching row exists, the method immediately returns true without database modification, allowing the same instance to safely enter critical sections multiple times without blocking on its own previous acquisition.
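A minimal sketch of the ownership check, assuming an in-memory map in place of t_common_lock. ReentrantLockSketch and writeCount are hypothetical and exist only to make the "no database modification" point observable; the lookup stands in for getLockByLocker.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model: lock_object -> locker (host name).
final class ReentrantLockSketch {
    private final Map<String, String> rows = new HashMap<>();
    private int writes = 0; // counts simulated INSERTs, for illustration only

    // Plain acquisition: succeeds only when no row exists for lockObject.
    synchronized boolean lock(String lockObject, String locker) {
        if (rows.containsKey(lockObject)) {
            return false;
        }
        rows.put(lockObject, locker);
        writes++;
        return true;
    }

    // Re-entrant acquisition: the ownership lookup comes before any insert.
    synchronized boolean reentrantLock(String lockObject, String locker) {
        if (locker.equals(rows.get(lockObject))) {
            return true; // already held by this locker, nothing is written
        }
        return lock(lockObject, locker);
    }

    synchronized int writeCount() {
        return writes;
    }
}
```

Calling reentrantLock twice from the same locker leaves the write count at one, while a plain lock from that same locker is refused, which is exactly the self-blocking case the re-entrant variant avoids.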

Implementation Guide

Integrating CommonLockService into Linkis components requires dependency injection and proper lifecycle management of lock objects.

Acquiring a Re-entrant Lock

The following Scala example from AMHeartbeatService.scala demonstrates production usage within the Application Manager:

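import org.apache.linkis.common.utils.Utils
import org.apache.linkis.publicservice.common.lock.entity.CommonLock
import org.apache.linkis.publicservice.common.lock.service.CommonLockService
import org.springframework.beans.factory.annotation.Autowired

// Injected by Spring; declared as var so the container can assign the field
@Autowired private var commonLockService: CommonLockService = _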

val masterLock = new CommonLock()
masterLock.setLockObject("_MASTER_AM_HEARTBEAT_MONITOR_LOCK")
masterLock.setCreateTime(new java.util.Date())
masterLock.setUpdateTime(new java.util.Date())
masterLock.setCreator(Utils.getJvmUser)
masterLock.setUpdator(Utils.getJvmUser)
masterLock.setLocker(Utils.getLocalHostname)

// Acquire re-entrant lock with indefinite timeout (-1)
val acquired = commonLockService.reentrantLock(masterLock, -1L)
if (acquired) {
  // Execute critical section: singleton heartbeat monitoring
}

This pattern ensures that only one AM instance performs heartbeat monitoring across the cluster, while allowing the same instance to re-enter the monitoring logic safely during recursive calls.

Explicit Lock with Timeout

For Java components requiring bounded wait times, use the standard lock() method:

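import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Date;

@Autowired
private CommonLockService commonLockService;

public boolean tryAcquireResource(String resourceId, long timeoutMs) throws UnknownHostException {
    String user = System.getProperty("user.name");
    Date now = new Date();

    CommonLock lock = new CommonLock();
    lock.setLockObject(resourceId);
    // getLocalHost() throws the checked UnknownHostException, hence the throws clause
    lock.setLocker(InetAddress.getLocalHost().getHostName());
    lock.setCreator(user);
    lock.setUpdator(user);
    lock.setCreateTime(now);
    lock.setUpdateTime(now);

    // Blocks until acquired or timeoutMs expires
    return commonLockService.lock(lock, timeoutMs);
}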

Releasing Locks

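Always release locks in finally blocks so that an exception cannot leave the lock row behind, and release only a lock that was actually acquired:

CommonLock lock = new CommonLock();
lock.setLockObject("CRITICAL_RESOURCE");
lock.setLocker(hostname);

boolean acquired = false;
try {
    acquired = commonLockService.lock(lock, 5000);
    if (acquired) {
        // Process shared resource
    }
} finally {
    // Skip unlock after a timed-out attempt; the lock belongs to someone else
    if (acquired) {
        commonLockService.unlock(lock);
    }
}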
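The release rule can also be packaged as a small AutoCloseable guard so that try-with-resources handles the cleanup. This is a sketch, not part of Linkis: SimpleLockService is a hypothetical, reduced view of the service, and LockGuard is an illustrative helper name.

```java
// Hypothetical minimal view of the service: only what the guard needs.
interface SimpleLockService {
    boolean lock(String lockObject, long timeoutMs);
    void unlock(String lockObject);
}

// Releases the lock on close(), but only if it was actually acquired.
final class LockGuard implements AutoCloseable {
    private final SimpleLockService service;
    private final String lockObject;
    private final boolean acquired;

    LockGuard(SimpleLockService service, String lockObject, long timeoutMs) {
        this.service = service;
        this.lockObject = lockObject;
        this.acquired = service.lock(lockObject, timeoutMs);
    }

    boolean isAcquired() {
        return acquired;
    }

    @Override
    public void close() {
        if (acquired) {
            service.unlock(lockObject);
        }
    }
}
```

A caller would write try (LockGuard guard = new LockGuard(service, "CRITICAL_RESOURCE", 5000)) { ... } and check guard.isAcquired() before touching the shared resource; a timed-out attempt then never triggers an unlock.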

Database Schema and SQL

While normally accessed through the DAO, the underlying MyBatis XML in CommonLockMapper.xml reveals the atomic operation:

<insert id="lock" parameterType="CommonLock">
  INSERT INTO t_common_lock(
    lock_object, locker, create_time, update_time, creator, updator
  ) VALUES (
    #{lockObject}, #{locker}, #{createTime}, #{updateTime}, #{creator}, #{updator}
  )
  ON DUPLICATE KEY UPDATE
    update_time = VALUES(update_time),
    updator = VALUES(updator)
</insert>

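With ON DUPLICATE KEY UPDATE, the statement succeeds even if the lock row already exists. In MySQL, the affected-row count then distinguishes the two cases: 1 for a newly inserted row and 2 for an existing row that was updated. Because a duplicate key does not raise an error in this form, acquisition that depends on catching DataAccessException requires a plain INSERT or a check of the returned row count, so confirm which variant the mapper XML in your Linkis version uses.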

High-Concurrency Optimization Strategies

When deploying in scenarios with hundreds of concurrent Linkis instances, consider these implementation patterns.

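Configure Aggressive Timeouts for Non-Critical Operations. Every waiting instance polls the database once per second, so with many contenders the fixed interval can produce synchronized bursts of failed inserts (a thundering herd). For non-essential background tasks, pass a short timeout (for example 5000 to 10000, in milliseconds as in the examples above) so the call fails fast and the task can be rescheduled via the Linkis event bus instead of keeping a thread and a database connection busy.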
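When a failed task is rescheduled, adding random jitter to the delay is one common way to keep many instances from retrying in lockstep. Jitter is a general mitigation brought in here for illustration; the source does not state that Linkis applies it. RetryJitter and nextDelayMs are hypothetical names.

```java
import java.util.Random;

final class RetryJitter {
    // Returns a delay in [baseMs, baseMs + spreadMs) so that instances that
    // failed at the same moment do not all come back at the same moment.
    static long nextDelayMs(long baseMs, long spreadMs, Random random) {
        if (baseMs < 0 || spreadMs <= 0) {
            throw new IllegalArgumentException("baseMs must be >= 0 and spreadMs > 0");
        }
        return baseMs + (long) (random.nextDouble() * spreadMs);
    }
}
```

A caller that failed to get the lock within its timeout could reschedule itself after nextDelayMs(5000, 2000, new Random()) milliseconds, spreading retries over a two-second window.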

Use Host-Based Locker Identification. Always populate the locker field with Utils.getLocalHostname() or InetAddress.getLocalHost().getHostName(). This enables the re-entrant logic to function correctly across instance restarts and prevents false lock conflicts when the same physical machine recovers from a JVM crash. Because ownership is keyed on the host name, two JVMs running on the same host are treated as the same locker.

Handle DataAccessException Explicitly. While DefaultCommonLockService catches DataAccessException internally for retry logic, application code should catch org.springframework.dao.DataAccessException when calling unlock() to handle scenarios where the lock expired or was administratively cleared from the database.

Summary

  • CommonLockService provides database-backed distributed locking across Linkis components using the t_common_lock table with unique constraints on lockObject.
  • DefaultCommonLockService implements optimistic retry with one-second intervals and supports indefinite blocking (timeout = -1) or bounded waits.
  • Re-entrant locking prevents self-deadlock by checking existing ownership via getLockByLocker() before attempting insertion.
  • The MyBatis DAO uses INSERT … ON DUPLICATE KEY UPDATE for atomic acquisition; the service treats a DataAccessException from the mapper as "lock held by another host" and retries.
  • Source files reside in linkis-public-enhancements/linkis-ps-common-lock, with production usage examples in AMHeartbeatService.scala.

Frequently Asked Questions

How does CommonLockService handle simultaneous lock requests from multiple instances?

When multiple instances attempt to acquire the same lock simultaneously, the database unique index on lock_object ensures only one INSERT succeeds. The failed attempts surface as DataAccessException, triggering the retry loop in DefaultCommonLockService that polls every 1000 milliseconds until the lock becomes available or the timeout expires.

What is the difference between lock() and reentrantLock() methods?

The lock() method attempts blind insertion and retries on failure, suitable for first-time acquisition. The reentrantLock() method first queries commonLockMapper.getLockByLocker() to check if the current host already owns the lock; if so, it returns immediately without database write, preventing the current instance from deadlocking itself during nested critical sections.

Can I configure the retry interval and sleep duration?

As implemented in DefaultCommonLockService.java, the retry sleep duration is hardcoded to Thread.sleep(1000) (one second). The only configurable parameter is the timeout argument passed to the lock methods, which controls the total duration of the retry loop but not the interval between attempts. For custom intervals, you would need to extend DefaultCommonLockService or implement CommonLockService directly.

Is CommonLockService suitable for cross-data-center deployments?

Yes, because the lock state persists in a centralized relational database accessible to all Linkis nodes. As long as all data centers connect to the same database instance or a replicated cluster with strong consistency guarantees, the CommonLockService coordinates correctly across geographic boundaries, making it suitable for multi-DC high-availability architectures.
