Supported Data Source Types in Linkis DataSourceManager and Custom Parameter Validators
Apache Linkis supports Kafka, MongoDB, and Elasticsearch out of the box, and you can implement custom validation by creating a class that implements ParameterValidateStrategy and registering it with DataSourceParameterValidator.
The DataSourceManager module in the apache/linkis repository provides a centralized way to manage metadata connections across the big-data middleware. Understanding the supported data source types and how to extend validation logic ensures your data source configurations remain robust and compliant with organizational policies.
Built-In Data Source Types in Linkis
Linkis defines its supported data sources through the DataSourceTypeEnum enumeration. As of the current implementation, the DataSourceManager recognizes three built-in types that it can query and generate Spark SQL for.
DataSourceTypeEnum Definition
The enum is located in linkis-public-enhancements/linkis-pes-common/src/main/java/org/apache/linkis/metadata/query/common/domain/DataSourceTypeEnum.java. It exposes the following constants:
| Data Source Type | Enum Constant | Configuration Value |
|---|---|---|
| Kafka | KAFKA | "kafka" |
| MongoDB | MONGODB | "mongodb" |
| Elasticsearch | ELASTICSEARCH | "elasticsearch" |
These values are used throughout the configuration layer to identify which connector and metadata query logic to invoke. If you attempt to register a data source with a type not listed in this enum, the core manager will not recognize it.
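As a rough illustration of how a configuration value might be resolved to one of these constants, the following sketch uses a hypothetical stand-in enum named SourceType that mirrors the table. It is not the actual DataSourceTypeEnum, and both the fromValue helper and its case-insensitive matching are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Optional;

/** Illustrative sketch only; SourceType is a stand-in, not the Linkis DataSourceTypeEnum. */
class DataSourceTypeLookupSketch {

    /** Hypothetical enum mirroring the three rows of the table. */
    enum SourceType {
        KAFKA("kafka"),
        MONGODB("mongodb"),
        ELASTICSEARCH("elasticsearch");

        private final String value;

        SourceType(String value) {
            this.value = value;
        }

        String getValue() {
            return value;
        }

        /** Resolves a configuration value to a constant (case-insensitive by assumption). */
        static Optional<SourceType> fromValue(String raw) {
            if (raw == null) {
                return Optional.empty();
            }
            return Arrays.stream(values())
                    .filter(t -> t.value.equalsIgnoreCase(raw.trim()))
                    .findFirst();
        }
    }

    public static void main(String[] args) {
        // A listed type resolves; an unlisted one yields an empty Optional.
        System.out.println(SourceType.fromValue("kafka").isPresent());
        System.out.println(SourceType.fromValue("mysql").isPresent());
    }
}
```

Returning an Optional rather than throwing leaves it to the caller to decide how to report an unrecognized type.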
How to Implement Custom DataSourceParameter Validators
Linkis validates data source parameters using a Strategy Pattern that decouples validation rules from the core manager logic. This allows you to inject custom business rules—such as range checks, credential complexity, or environment-specific constraints—without modifying the core codebase.
The ParameterValidateStrategy Interface
Every custom validator must implement the ParameterValidateStrategy interface located at:
linkis-public-enhancements/linkis-datasource/linkis-datasource-manager/server/src/main/java/org/apache/linkis/datasourcemanager/core/validate/ParameterValidateStrategy.java
The interface defines two methods:
- accept(DataSourceParamKeyDefinition.ValueType valueType) – Determines whether this strategy should process the parameter based on its value type or other criteria.
- validate(DataSourceParamKeyDefinition keyDef, Object actualValue) – Executes the validation logic and throws ParameterValidateException if the value is invalid.
Default Validation Strategies
The DataSourceParameterValidator class acts as the orchestrator. It maintains a list of strategies and invokes them during data source creation or updates. Located at:
linkis-public-enhancements/linkis-datasource/linkis-datasource-manager/server/src/main/java/org/apache/linkis/datasourcemanager/core/validate/DataSourceParameterValidator.java
On startup, it registers two built-in strategies:
@PostConstruct
public void initToRegister() {
    registerStrategy(new TypeParameterValidateStrategy());   // Generic type checks
    registerStrategy(new RegExpParameterValidateStrategy()); // Email/text regex checks
}
- TypeParameterValidateStrategy – Validates that the provided value matches the expected Java type (e.g., LONG, STRING).
- RegExpParameterValidateStrategy – Applies regex patterns defined in the parameter key definition (e.g., email format validation).
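To make the regex-based check concrete, here is a minimal, self-contained sketch of what such a rule could look like. It does not reproduce RegExpParameterValidateStrategy. The requireMatch helper, its parameters, and the use of IllegalArgumentException in place of ParameterValidateException are assumptions for illustration.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; not the Linkis RegExpParameterValidateStrategy. */
class RegexCheckSketch {

    /**
     * Returns the value unchanged when it fully matches the regex.
     * A null or empty regex is treated as "no pattern configured" (an assumption).
     */
    static String requireMatch(String key, String regex, String value) {
        if (regex == null || regex.isEmpty()) {
            return value;
        }
        if (!Pattern.matches(regex, value)) {
            throw new IllegalArgumentException(
                    "Parameter '" + key + "' does not match pattern " + regex);
        }
        return value;
    }

    public static void main(String[] args) {
        // Hypothetical key and pattern: a port made of one to five digits.
        System.out.println(requireMatch("port", "\\d{1,5}", "9092"));
    }
}
```

Pattern.matches requires the entire value to match, which is usually what a format rule intends; a partial match would need Matcher.find() instead.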
Creating a Custom Validator
To create a custom validator, implement ParameterValidateStrategy and annotate the class with @Component so Spring auto-discovers it. The following example validates that a numeric batchSize parameter falls within the range 1–1000:
package org.apache.linkis.datasourcemanager.core.validate.custom;

import org.apache.linkis.datasourcemanager.common.domain.DataSourceParamKeyDefinition;
import org.apache.linkis.datasourcemanager.core.validate.ParameterValidateException;
import org.apache.linkis.datasourcemanager.core.validate.ParameterValidateStrategy;
import org.springframework.stereotype.Component;

/**
 * Validates that the batchSize parameter is between 1 and 1000.
 */
@Component
public class BatchSizeRangeValidator implements ParameterValidateStrategy {

    /** Activate only for NUMBER type values. */
    @Override
    public boolean accept(DataSourceParamKeyDefinition.ValueType valueType) {
        return DataSourceParamKeyDefinition.ValueType.NUMBER.equals(valueType);
    }

    /** Perform range validation. */
    @Override
    public Object validate(DataSourceParamKeyDefinition keyDef, Object actualValue)
            throws ParameterValidateException {
        // Filter by specific key name
        if (!"batchSize".equalsIgnoreCase(keyDef.getKey())) {
            return actualValue; // Pass through if not our target key
        }
        // Avoid a ClassCastException when the value is not a Long (e.g., an Integer)
        if (!(actualValue instanceof Number)) {
            throw new ParameterValidateException(
                    "batchSize must be numeric, but got " + actualValue);
        }
        long val = ((Number) actualValue).longValue();
        if (val < 1L || val > 1000L) {
            throw new ParameterValidateException(
                    "batchSize must be between 1 and 1000, but got " + val);
        }
        return actualValue;
    }
}
Key implementation details:
- Targeting specific keys – Use keyDef.getKey() inside validate() to apply logic only to parameters you care about, even if accept() matches multiple types.
- Exception handling – Throw ParameterValidateException with a descriptive message to halt the validation pipeline and return an error to the user.
- Return value – Return the (potentially transformed) value to support chaining; subsequent validators receive this output.
Registering Your Custom Validator
Because the class carries the @Component annotation, Spring automatically instantiates it and adds it to the application context. Whether DataSourceParameterValidator then picks up the bean on its own depends on its initialization logic: the initToRegister() method shown above explicitly registers only the two built-in strategies, so confirm in your Linkis version that the custom validator actually runs, and register it manually if it does not.
For manual registration (e.g., conditional registration based on profiles), inject the validator and call registerStrategy():
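// Hypothetical holder class; any Spring-managed bean can perform the registration
@Component
public class CustomValidatorRegistrar {
    @Autowired
    private DataSourceParameterValidator parameterValidator;

    @PostConstruct
    public void registerCustomValidators() {
        parameterValidator.registerStrategy(new BatchSizeRangeValidator());
    }
}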
Validation Execution Flow
When a user creates or updates a data source, the service layer invokes:
dataSourceParameterValidator.validate(paramKeyDefinitions, parameters);
The method iterates over each DataSourceParamKeyDefinition and runs all registered strategies where accept() returns true. Your custom validator participates in this pipeline automatically, ensuring that constraints are enforced before the data source is persisted.
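The flow described above can be sketched with simplified stand-in types. Nothing here is the actual Linkis code: KeyDef, Strategy, and Validator are hypothetical reductions of DataSourceParamKeyDefinition, ParameterValidateStrategy, and DataSourceParameterValidator. The sketch also assumes that absent parameters are skipped, that results are returned in a new map, and that failures surface as an unchecked IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; every type is a simplified stand-in, not a Linkis class. */
class ValidationPipelineSketch {

    enum ValueType { TEXT, NUMBER }

    /** Stand-in for a parameter key definition: a key plus its value type. */
    static final class KeyDef {
        final String key;
        final ValueType valueType;

        KeyDef(String key, ValueType valueType) {
            this.key = key;
            this.valueType = valueType;
        }
    }

    /** Stand-in for the strategy interface. */
    interface Strategy {
        boolean accept(ValueType valueType);

        Object validate(KeyDef keyDef, Object actualValue);
    }

    /** Stand-in orchestrator: runs accepting strategies in registration order. */
    static final class Validator {
        private final List<Strategy> strategies = new ArrayList<>();

        void registerStrategy(Strategy strategy) {
            strategies.add(strategy);
        }

        Map<String, Object> validate(List<KeyDef> keyDefs, Map<String, Object> parameters) {
            Map<String, Object> result = new HashMap<>(parameters);
            for (KeyDef keyDef : keyDefs) {
                Object value = result.get(keyDef.key);
                if (value == null) {
                    continue; // Assumption: absent parameters are skipped
                }
                for (Strategy strategy : strategies) {
                    if (strategy.accept(keyDef.valueType)) {
                        // Each strategy receives the previous strategy's output
                        value = strategy.validate(keyDef, value);
                    }
                }
                result.put(keyDef.key, value);
            }
            return result;
        }
    }

    /** First strategy: normalizes TEXT values by trimming whitespace. */
    static final class TrimText implements Strategy {
        public boolean accept(ValueType valueType) {
            return valueType == ValueType.TEXT;
        }

        public Object validate(KeyDef keyDef, Object actualValue) {
            return actualValue.toString().trim();
        }
    }

    /** Second strategy: rejects TEXT values that are empty once trimmed. */
    static final class NonEmptyText implements Strategy {
        public boolean accept(ValueType valueType) {
            return valueType == ValueType.TEXT;
        }

        public Object validate(KeyDef keyDef, Object actualValue) {
            if (actualValue.toString().isEmpty()) {
                throw new IllegalArgumentException(keyDef.key + " must not be empty");
            }
            return actualValue;
        }
    }

    /** Validates a single hypothetical "host" parameter through both strategies. */
    static Map<String, Object> demo(String host) {
        Validator validator = new Validator();
        validator.registerStrategy(new TrimText());
        validator.registerStrategy(new NonEmptyText());
        Map<String, Object> params = new HashMap<>();
        params.put("host", host);
        return validator.validate(
                Collections.singletonList(new KeyDef("host", ValueType.TEXT)), params);
    }

    public static void main(String[] args) {
        System.out.println(demo("  localhost  "));
    }
}
```

Registration order matters in this sketch: because TrimText runs first, a whitespace-only value reaches NonEmptyText as an empty string and is rejected.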
Summary
- Supported types – Linkis DataSourceManager natively supports Kafka, MongoDB, and Elasticsearch as defined in DataSourceTypeEnum.java.
- Validation framework – The system uses a strategy-based validation framework centered on ParameterValidateStrategy and DataSourceParameterValidator.
- Implementation – Create a class implementing ParameterValidateStrategy, use accept() to filter by value type, and implement validate() to enforce rules.
- Error handling – Throw ParameterValidateException to reject invalid parameters with clear error messages.
- Registration – Annotate with @Component for automatic discovery, or manually register via DataSourceParameterValidator.registerStrategy().
- Execution – Validators run automatically during data source create/update operations, ensuring all parameters meet defined constraints before persistence.
Frequently Asked Questions
What data source types does Linkis support by default?
Linkis supports Kafka, MongoDB, and Elasticsearch. These are defined as enum constants in DataSourceTypeEnum.java and represent the only built-in types that the DataSourceManager can query and generate Spark SQL for without additional extensions.
How do I add a new data source type to Linkis?
Adding a new type requires extending DataSourceTypeEnum.java to include the new constant and value, then implementing the corresponding metadata query capabilities in the linkis-metadata-query modules. However, the parameter validation framework operates independently—you can validate parameters for custom types using the same ParameterValidateStrategy approach described above, provided your type identifier is passed through the configuration.
Can I validate parameters based on the specific data source type rather than just the value type?
Yes. While the accept() method receives the ValueType, the validate() method receives the full DataSourceParamKeyDefinition object. You can access keyDef.getKey() or store a reference to the data source type in your validator class to apply logic conditionally. For type-specific validation, check the key definition’s metadata or maintain a mapping of keys to allowed ranges within your validator.
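One way to picture the "mapping of keys to allowed ranges" idea is the self-contained sketch below. The key names, the ranges, and the static validate helper are hypothetical. In a real validator the key would come from keyDef.getKey() and the failure would be a ParameterValidateException rather than the IllegalArgumentException used here.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; key names and ranges are hypothetical examples. */
class KeyRangeValidatorSketch {

    /** Inclusive numeric range. */
    static final class Range {
        final long min;
        final long max;

        Range(long min, long max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(long value) {
            return value >= min && value <= max;
        }
    }

    private static final Map<String, Range> RANGES = new HashMap<>();

    static {
        RANGES.put("batchSize", new Range(1, 1000));
        RANGES.put("timeoutSeconds", new Range(1, 3600)); // hypothetical key
    }

    /** Checks the value against the range registered for the key, if any. */
    static Object validate(String key, Object actualValue) {
        Range range = RANGES.get(key);
        if (range == null) {
            return actualValue; // No rule for this key: pass through unchanged
        }
        if (!(actualValue instanceof Number)) {
            throw new IllegalArgumentException(key + " must be numeric, but got " + actualValue);
        }
        long value = ((Number) actualValue).longValue();
        if (!range.contains(value)) {
            throw new IllegalArgumentException(
                    key + " must be between " + range.min + " and " + range.max
                            + ", but got " + value);
        }
        return actualValue;
    }

    public static void main(String[] args) {
        System.out.println(validate("batchSize", 500));
        System.out.println(validate("topic", "events"));
    }
}
```

Keeping the ranges in a map means a new constraint is a single entry rather than another validator class, at the cost of matching keys exactly as written.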
What happens if multiple validators accept the same parameter?
All registered strategies that return true from accept() execute sequentially in registration order. Each validator receives the output of the previous one as the actualValue parameter. This chaining allows for cumulative validation or transformation, but ensure that your validators do not conflict—for example, one validator normalizing a value that another expects in raw form.