Linkis KerberosUtils Implementation for Hadoop Secure Authentication: Configuration and Usage Guide
Apache Linkis provides the KerberosUtils class in linkis-hadoop-common to handle Kerberos authentication for Hadoop services, encapsulating keytab-based login, automatic ticket refresh, and configurable security settings via HadoopConf.
Linkis simplifies secure Hadoop cluster integration through a centralized utility that manages the complete Kerberos authentication lifecycle. The org.apache.linkis.hadoop.common.utils.KerberosUtils class abstracts the complexity of credential management, enabling engine plugins like JDBC and HBase to authenticate seamlessly without implementing custom security logic.
Core KerberosUtils Implementation
Creating Kerberos-Enabled Configurations
The createKerberosSecurityConfiguration() method builds a secure Hadoop Configuration object by retrieving the root user settings via HDFSUtils.getConfiguration(HadoopConf.HADOOP_ROOT_USER().getValue()) and explicitly setting hadoop.security.authentication to KERBEROS. This implementation resides in linkis-commons/linkis-hadoop-common/src/main/java/org/apache/linkis/hadoop/common/utils/KerberosUtils.java at lines 44-48.
// From KerberosUtils.java:44-48
Configuration conf = HDFSUtils.getConfiguration(HadoopConf.HADOOP_ROOT_USER().getValue());
conf.set("hadoop.security.authentication", "KERBEROS");
Keytab-Based User Authentication
The createKerberosSecureConfiguration(String keytab, String principal) method performs the actual Kerberos login by invoking UserGroupInformation.loginUserFromKeytab(principal, keytab), but only when the current user is not already authenticated via keytab. This prevents duplicate login attempts and potential authentication conflicts. This logic is implemented at lines 50-60 of KerberosUtils.java.
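The guard described above can be sketched without a Hadoop dependency. This is a minimal illustration, not the Linkis source: the LoginBackend interface and FakeBackend class are hypothetical stand-ins for the static UserGroupInformation calls (which declare IOException in Hadoop; that is left out here to keep the sketch short).

```java
final class KeytabLoginGuardSketch {

    /** Hypothetical stand-in for the static UserGroupInformation login calls. */
    interface LoginBackend {
        boolean isLoginKeytabBased();
        void loginUserFromKeytab(String principal, String keytab);
    }

    /**
     * Logs in from the keytab only when the current login is not already
     * keytab-based. Returns true if a login was performed, false if skipped.
     */
    static boolean loginIfNeeded(LoginBackend backend, String principal, String keytab) {
        if (backend.isLoginKeytabBased()) {
            return false; // already authenticated via keytab; skip the duplicate login
        }
        backend.loginUserFromKeytab(principal, keytab);
        return true;
    }

    /** In-memory backend used only to exercise the guard. */
    static final class FakeBackend implements LoginBackend {
        boolean keytabBased = false;
        int loginCalls = 0;

        @Override
        public boolean isLoginKeytabBased() {
            return keytabBased;
        }

        @Override
        public void loginUserFromKeytab(String principal, String keytab) {
            loginCalls++;
            keytabBased = true;
        }
    }

    public static void main(String[] args) {
        FakeBackend backend = new FakeBackend();
        // Placeholder principal and keytab path, for illustration only
        System.out.println(loginIfNeeded(backend, "user@EXAMPLE.COM", "/tmp/user.keytab")); // true
        System.out.println(loginIfNeeded(backend, "user@EXAMPLE.COM", "/tmp/user.keytab")); // false
        System.out.println(backend.loginCalls); // 1
    }
}
```

The second call is a no-op, which is the property that lets several engine instances in one JVM request authentication without triggering repeated logins.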
Automatic Ticket Refresh Mechanism
To prevent credential expiration during long-running tasks, runRefreshKerberosLogin() (lines 67-84) detects whether the current user authenticated via keytab or ticket cache. The method calls reloginFromKeytab() for keytab-based credentials or reloginFromTicketCache() for cached tickets. When no Kerberos credentials are present, the method returns false and logs the error without throwing exceptions.
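The branching described above might look roughly like the following. It is a sketch under stated assumptions: LoginUser is a hypothetical interface that mirrors the relogin calls named in the text, FakeUser exists only to exercise each branch, and java.util.logging stands in for whatever logger Linkis uses.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

final class RefreshDecisionSketch {
    private static final Logger LOG = Logger.getLogger(RefreshDecisionSketch.class.getName());

    /** Hypothetical view of the login user; mirrors the relogin calls named above. */
    interface LoginUser {
        boolean isFromKeytab();
        boolean isFromTicketCache();
        void reloginFromKeytab() throws IOException;
        void reloginFromTicketCache() throws IOException;
    }

    /** Returns true when a relogin succeeded, false otherwise; never throws. */
    static boolean refresh(LoginUser user) {
        try {
            if (user.isFromKeytab()) {
                user.reloginFromKeytab();
                return true;
            }
            if (user.isFromTicketCache()) {
                user.reloginFromTicketCache();
                return true;
            }
            LOG.warning("No Kerberos credentials found; nothing to refresh");
            return false;
        } catch (IOException e) {
            // Log and report failure instead of propagating the exception
            LOG.log(Level.WARNING, "Kerberos relogin failed", e);
            return false;
        }
    }

    /** Configurable fake used to exercise each branch. */
    static final class FakeUser implements LoginUser {
        private final boolean keytab;
        private final boolean ticketCache;
        private final boolean failRelogin;
        String lastCall = "none";

        FakeUser(boolean keytab, boolean ticketCache, boolean failRelogin) {
            this.keytab = keytab;
            this.ticketCache = ticketCache;
            this.failRelogin = failRelogin;
        }

        @Override
        public boolean isFromKeytab() {
            return keytab;
        }

        @Override
        public boolean isFromTicketCache() {
            return ticketCache;
        }

        @Override
        public void reloginFromKeytab() throws IOException {
            if (failRelogin) throw new IOException("simulated KDC failure");
            lastCall = "keytab";
        }

        @Override
        public void reloginFromTicketCache() throws IOException {
            if (failRelogin) throw new IOException("simulated KDC failure");
            lastCall = "ticketCache";
        }
    }

    public static void main(String[] args) {
        System.out.println(refresh(new FakeUser(true, false, false)));  // true
        System.out.println(refresh(new FakeUser(false, true, false)));  // true
        System.out.println(refresh(new FakeUser(false, false, false))); // false
        System.out.println(refresh(new FakeUser(true, false, true)));   // false
    }
}
```

Returning a boolean rather than throwing lets a periodic caller count failures without wrapping every invocation in its own try/catch.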
Background Refresh Thread Management
The startKerberosRefreshThread() method (lines 156-190 in KerberosUtils.java) initializes a single-threaded scheduler that runs for the JVM's lifetime. This thread periodically invokes runRefreshKerberosLogin() at the interval specified by the LINKIS_KERBEROS_REFRESH_INTERVAL environment variable (default: 43200 seconds, i.e. 12 hours). The thread respects the LINKIS_KERBEROS_KINIT_FAIL_THRESHOLD setting (default: 5), terminating after too many consecutive failures to prevent excessive error logging.
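One way to picture the scheduler and its failure threshold is the sketch below. It is not the Linkis implementation: the class names are hypothetical, a BooleanSupplier stands in for runRefreshKerberosLogin(), the once-per-JVM guard is omitted, and stopping when the count reaches the threshold (rather than when it exceeds it) is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

final class RefreshThreadSketch {

    /** Counts consecutive failures; any success resets the count. */
    static final class FailureTracker {
        private final int threshold;
        private final AtomicInteger consecutiveFailures = new AtomicInteger();

        FailureTracker(int threshold) {
            this.threshold = threshold;
        }

        /** Records one refresh result and returns true when the loop should stop. */
        boolean recordAndShouldStop(boolean success) {
            if (success) {
                consecutiveFailures.set(0);
                return false;
            }
            // Assumption: stop as soon as the count reaches the threshold
            return consecutiveFailures.incrementAndGet() >= threshold;
        }
    }

    /** Runs `refresh` every `intervalSeconds` on one daemon thread until the tracker says stop. */
    static ScheduledExecutorService start(BooleanSupplier refresh, long intervalSeconds, int threshold) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "kerberos-refresh-sketch");
            t.setDaemon(true); // do not keep the JVM alive just for refreshes
            return t;
        });
        FailureTracker tracker = new FailureTracker(threshold);
        scheduler.scheduleAtFixedRate(() -> {
            boolean ok;
            try {
                ok = refresh.getAsBoolean();
            } catch (RuntimeException e) {
                ok = false; // treat an unexpected error as a failed refresh
            }
            if (tracker.recordAndShouldStop(ok)) {
                scheduler.shutdown(); // no further refresh attempts are scheduled
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        FailureTracker tracker = new FailureTracker(2);
        System.out.println(tracker.recordAndShouldStop(false)); // false
        System.out.println(tracker.recordAndShouldStop(true));  // false (count reset)
        System.out.println(tracker.recordAndShouldStop(false)); // false
        System.out.println(tracker.recordAndShouldStop(false)); // true
    }
}
```

Keeping the counting logic in a small class separate from the scheduler makes the threshold behavior checkable without waiting for real timer ticks.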
Configuration Properties in HadoopConf
Centralized Kerberos configuration is managed in HadoopConf.scala (linkis-commons/linkis-hadoop-common/src/main/scala/org/apache/linkis/hadoop/common/conf/HadoopConf.scala lines 24-46). All Kerberos operations validate HadoopConf.KERBEROS_ENABLE() before executing security logic.
Key configuration properties:
- wds.linkis.keytab.enable: Global toggle enabling Kerberos support (default: false)
- wds.linkis.keytab.file: Directory path for storing keytab files (default: /appcom/keytab/)
- wds.linkis.keytab.host: Hostname used in keytab generation for multi-cluster deployments (default: 127.0.0.1)
- wds.linkis.keytab.proxyuser.enable: Enables proxy user impersonation when Kerberos tickets are present (default: false)
- LINKIS_KERBEROS_REFRESH_INTERVAL: Environment variable defining refresh frequency in seconds (default: 43200)
- LINKIS_KERBEROS_KINIT_FAIL_THRESHOLD: Maximum consecutive refresh failures before thread termination (default: 5)
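The two environment variables and their defaults could be resolved along these lines. This is a hypothetical helper, not code from HadoopConf or KerberosUtils; in particular, falling back to the default when a value is missing or malformed is an assumption of this sketch.

```java
import java.util.Map;

final class KerberosRefreshSettings {
    static final long DEFAULT_REFRESH_INTERVAL_SECONDS = 43200L; // 12 hours
    static final int DEFAULT_KINIT_FAIL_THRESHOLD = 5;

    final long refreshIntervalSeconds;
    final int kinitFailThreshold;

    private KerberosRefreshSettings(long refreshIntervalSeconds, int kinitFailThreshold) {
        this.refreshIntervalSeconds = refreshIntervalSeconds;
        this.kinitFailThreshold = kinitFailThreshold;
    }

    /** Builds settings from an environment-style map, applying the documented defaults. */
    static KerberosRefreshSettings fromEnv(Map<String, String> env) {
        return new KerberosRefreshSettings(
                parseLong(env.get("LINKIS_KERBEROS_REFRESH_INTERVAL"), DEFAULT_REFRESH_INTERVAL_SECONDS),
                parseInt(env.get("LINKIS_KERBEROS_KINIT_FAIL_THRESHOLD"), DEFAULT_KINIT_FAIL_THRESHOLD));
    }

    // Assumption: missing or malformed values fall back to the default
    private static long parseLong(String raw, long fallback) {
        if (raw == null || raw.trim().isEmpty()) return fallback;
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    private static int parseInt(String raw, int fallback) {
        if (raw == null || raw.trim().isEmpty()) return fallback;
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        KerberosRefreshSettings settings = fromEnv(System.getenv());
        System.out.println("refresh interval (s): " + settings.refreshIntervalSeconds);
        System.out.println("kinit fail threshold: " + settings.kinitFailThreshold);
    }
}
```

Passing the environment in as a Map instead of calling System.getenv() inside the parser keeps the defaults easy to verify in isolation.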
Implementation in Linkis Engine Plugins
JDBC Engine Authentication Pattern
The JDBC engine plugin demonstrates practical usage in ConnectionManager.java (lines 262-274). Before establishing database connections, the code invokes KerberosUtils.createKerberosSecureConfiguration(keytab, principal) followed by KerberosUtils.startKerberosRefreshThread() to maintain authentication throughout the query lifecycle.
HBase Engine with Proxy User Support
HBaseConnectionManager.java (around line 111) implements similar authentication with extended proxy user capabilities. When HBaseEngineConnConstant.KERBEROS_PROXY_USER is enabled, the code creates a UserGroupInformation proxy to execute HBase operations on behalf of other users while preserving the underlying Kerberos credentials.
How to Configure Kerberos in Linkis
Enable Hadoop secure authentication by completing these configuration steps:
- Enable the Kerberos switch: Set wds.linkis.keytab.enable=true in linkis.properties or your configuration management system.
- Deploy keytab files: Place keytab files in the directory specified by wds.linkis.keytab.file (default /appcom/keytab/), ensuring the Linkis service user has read permissions.
- Configure refresh behavior: Export environment variables to control ticket renewal:
  export LINKIS_KERBEROS_REFRESH_INTERVAL=43200
  export LINKIS_KERBEROS_KINIT_FAIL_THRESHOLD=5
- Engine-specific setup: For JDBC or HBase datasources, provide the keytab path and principal name in connection parameters. Engine plugins automatically invoke the utility methods.
- Enable proxy users (optional): Set wds.linkis.keytab.proxyuser.enable=true and configure the proxy user parameter in your engine settings when impersonation is required.
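Before relying on the keytab deployment step, it can help to confirm that the service account can actually read the file. The following is a small standalone check, offered as a sketch: the class and method names are hypothetical and nothing here is part of Linkis.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class KeytabPreflightSketch {

    /** Resolves fileName under keytabDir and reports whether it is a readable regular file. */
    static boolean isKeytabReadable(String keytabDir, String fileName) {
        Path keytab = Paths.get(keytabDir).resolve(fileName).normalize();
        return Files.isRegularFile(keytab) && Files.isReadable(keytab);
    }

    public static void main(String[] args) {
        // Defaults follow the article's examples; override via command-line arguments
        String dir = args.length > 0 ? args[0] : "/appcom/keytab/";
        String file = args.length > 1 ? args[1] : "hadoop.keytab";
        Path resolved = Paths.get(dir).resolve(file).normalize();
        System.out.println(resolved + " readable: " + isKeytabReadable(dir, file));
    }
}
```

Run it as the same operating system user that runs the Linkis services, since read permission is evaluated for the calling process.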
Code Examples
Basic Keytab Authentication
import org.apache.linkis.hadoop.common.utils.KerberosUtils;
String keytab = "/appcom/keytab/hadoop.keytab";
String principal = "hadoop/<host>@<REALM>"; // placeholder: substitute your service host and Kerberos realm
// Authenticates only if not already logged in via keytab
KerberosUtils.createKerberosSecureConfiguration(keytab, principal);
Starting the Background Refresh Thread
// Initializes singleton refresh thread if Kerberos is enabled
// Safe to call multiple times; only one thread starts per JVM
KerberosUtils.startKerberosRefreshThread();
JDBC Connection Implementation
// Excerpt from ConnectionManager.java:262-274
KerberosUtils.createKerberosSecureConfiguration(keytab, principal);
KerberosUtils.startKerberosRefreshThread();
Connection conn = dataSource.getConnection();
// Execute JDBC operations with maintained Kerberos credentials
HBase with Proxy User Configuration
// Simplified sketch based on HBaseConnectionManager.java
KerberosUtils.createKerberosSecureConfiguration(keytab, principal);
KerberosUtils.startKerberosRefreshThread();
// Assumes KERBEROS_PROXY_USER holds the name of the user to impersonate
String proxyUser = conf.get(HBaseEngineConnConstant.KERBEROS_PROXY_USER);
if (proxyUser != null && !proxyUser.isEmpty()) {
    UserGroupInformation ugi = UserGroupInformation.createProxyUser(
        proxyUser,
        UserGroupInformation.getCurrentUser()
    );
    // Execute operations within ugi.doAs(...)
}
Summary
- KerberosUtils provides centralized authentication management in linkis-commons/linkis-hadoop-common, handling the complete lifecycle from initial login to ticket refresh.
- Core methods include createKerberosSecureConfiguration() for authentication and startKerberosRefreshThread() for credential maintenance.
- Configuration combines HadoopConf properties (like wds.linkis.keytab.enable) with environment variables (LINKIS_KERBEROS_REFRESH_INTERVAL).
- Engine plugins (JDBC, HBase) embed these utilities automatically; custom implementations require only invoking the utility methods with valid keytab and principal parameters.
- Proxy user support enables secure multi-tenant access when wds.linkis.keytab.proxyuser.enable is configured.
Frequently Asked Questions
What is the default refresh interval for Kerberos tickets in Linkis?
The default refresh interval is 43200 seconds (12 hours), defined by the LINKIS_KERBEROS_REFRESH_INTERVAL environment variable. This interval balances security requirements with KDC load, ensuring tickets remain valid without excessive authentication requests.
How does Linkis handle Kerberos authentication failures?
Linkis tracks consecutive refresh failures using the LINKIS_KERBEROS_KINIT_FAIL_THRESHOLD environment variable (default: 5). When failures exceed this threshold, the background refresh thread stops attempting renewals, preventing log spam and allowing administrators to investigate connectivity or credential issues.
Can I use KerberosUtils for custom engine plugins?
Yes. Any Linkis engine plugin can call KerberosUtils.createKerberosSecureConfiguration(keytab, principal) followed by KerberosUtils.startKerberosRefreshThread(). The utility class implements singleton patterns and JVM-wide state management, ensuring thread-safe operation across multiple engine instances without duplicate login attempts.
Where should keytab files be located for Linkis to access them?
By default, Linkis expects keytab files in /appcom/keytab/, as specified by the wds.linkis.keytab.file property in HadoopConf.scala. Ensure the Linkis service account possesses read permissions for this directory and all contained keytab files, and configure the specific principal and keytab path in your datasource connection properties.