# Linkis BML Resource Versioning Mechanism: How to Implement Custom Cloud Storage Helpers

> Explore Apache Linkis BML resource versioning and learn to implement custom cloud storage helpers for services like Alibaba OSS or Google Cloud Storage. Master resource management with Linkis.

- Repository: [The Apache Software Foundation/linkis](https://github.com/apache/linkis)
- Tags: internals
- Published: 2026-02-24

---

**Apache Linkis BML (Base Material Library) uses monotonic versioning: every upload creates a unique `resourceId`, and every upload or update is assigned an incrementing `version` string. Storage backends are abstracted behind the `ResourceHelper` interface, so custom implementations can target cloud providers such as Alibaba OSS or Google Cloud Storage.**

The Base Material Library (BML) in Apache Linkis provides centralized resource management for scripts, datasets, and configuration files. Understanding the **BML resource versioning mechanism** is essential for building reliable data pipelines that require historical tracking and rollback capabilities. This article examines the version control implementation in [`BmlProtocol.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/scala/org/apache/linkis/bml/protocol/BmlProtocol.scala) and demonstrates how to extend BML with **custom resource helpers for cloud storage** by implementing the `ResourceHelper` interface.

## How BML Resource Versioning Works Internally

### Version-Aware Protocol Definitions

In [`linkis-public-enhancements/linkis-pes-common/src/main/scala/org/apache/linkis/bml/protocol/BmlProtocol.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/scala/org/apache/linkis/bml/protocol/BmlProtocol.scala), the versioning semantics are defined through case classes. Every operation returns either `BmlUploadResponse` or `BmlUpdateResponse`, both containing `resourceId` and `version` fields. The `Version` case class and `ResourceVersions` collection enable the server to return complete version histories via `BmlResourceVersionsResponse`, while `BmlRollbackVersionResponse` supports reverting to previous states.

The version string follows a monotonically increasing pattern, typically starting at `"0"` for the initial upload and incrementing by one for each subsequent update.

### The Version Lifecycle

The BML server manages four primary versioned operations:

1. **Upload**: `client.uploadResource(user, fileName, stream)` returns `BmlUploadResponse(resourceId, version="0")`, establishing the initial version.
2. **Update**: `client.updateShareResource(user, resourceId, newFileName, stream)` creates `BmlUpdateResponse(resourceId, version="1")`, generating a new immutable version while preserving history.
3. **Query**: `client.downloadShareResource(user, resourceId, version)` retrieves the byte stream for a specific version; passing `null` fetches the latest version.
4. **Rollback**: `client.rollbackVersion(user, resourceId, targetVersion)` returns `BmlRollbackVersionResponse`, reverting the resource to a specified historical version.

This flow ensures immutable version history where updates never overwrite existing data.
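The four operations can be modelled with a small in-memory store. This is a toy sketch rather than the BML server: the class name `VersionedStoreSketch`, the `res-N` id format, and the choice to model rollback as re-appending old content are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of the upload/update/query/rollback flow (hypothetical, not Linkis code). */
final class VersionedStoreSketch {
    // resourceId -> append-only list of contents; the list index doubles as the version number
    private final Map<String, List<byte[]>> history = new HashMap<>();
    private int nextId = 0;

    /** Upload: registers a new resource whose content becomes version "0"; returns the resource id. */
    String upload(byte[] content) {
        String resourceId = "res-" + (nextId++);
        List<byte[]> versions = new ArrayList<>();
        versions.add(content.clone());
        history.put(resourceId, versions);
        return resourceId;
    }

    /** Update: appends a new version and returns its version string; older entries are untouched. */
    String update(String resourceId, byte[] content) {
        List<byte[]> versions = versionsOf(resourceId);
        versions.add(content.clone());
        return String.valueOf(versions.size() - 1);
    }

    /** Query: a null version means "latest". */
    byte[] query(String resourceId, String version) {
        List<byte[]> versions = versionsOf(resourceId);
        int index = (version == null) ? versions.size() - 1 : Integer.parseInt(version);
        return versions.get(index).clone();
    }

    /** Rollback (assumed semantics): re-append the old content as a new version so history stays intact. */
    String rollback(String resourceId, String targetVersion) {
        return update(resourceId, query(resourceId, targetVersion));
    }

    private List<byte[]> versionsOf(String resourceId) {
        List<byte[]> versions = history.get(resourceId);
        if (versions == null) {
            throw new IllegalArgumentException("Unknown resource: " + resourceId);
        }
        return versions;
    }
}
```

Because every write appends to the list, an earlier version remains readable after any number of updates or rollbacks, which is the property the lifecycle depends on.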

### Client-Side Version Management

The `BMLHelper` class in [`linkis-public-enhancements/linkis-pes-publicservice/src/main/scala/org/apache/linkis/filesystem/bml/BMLHelper.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-publicservice/src/main/scala/org/apache/linkis/filesystem/bml/BMLHelper.scala) provides the public API for versioned operations. It constructs a `BmlClient` via `BmlClientFactory` and returns Java Maps containing `resourceId` and `version` strings. When querying, the helper accepts an explicit version parameter; if it is `null`, the server returns the most recent version.

## Implementing Custom Cloud Storage Resource Helpers

### The ResourceHelper Interface Contract

Storage abstraction in BML is handled by the `ResourceHelper` interface located in [`linkis-public-enhancements/linkis-bml-server/src/main/java/org/apache/linkis/bml/common/ResourceHelper.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-bml-server/src/main/java/org/apache/linkis/bml/common/ResourceHelper.java). This contract defines six critical methods that any custom implementation must satisfy:

- **`upload(String path, String user, InputStream inputStream, StringBuilder md5Holder, boolean overwrite)`**: Writes bytes to storage and returns the file size.
- **`generatePath(String user, String fileName, Map<String, Object> properties)`**: Constructs unique storage paths following schema-specific conventions.
- **`getSchema()`**: Returns the URI scheme (e.g., `"oss://"`, `"gcs://"`) for routing.
- **`checkIfExists(String path, String user)`**: Verifies resource availability.
- **`checkBmlResourceStoragePrefixPathIfChanged(String path)`**: Returns `true` when the given path no longer starts with the configured storage prefix, signalling that the prefix has changed.
- **`update(String path)`**: Handles overwrite semantics (often delegates to upload with `overwrite=true`).
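To make the contract concrete, here is a minimal in-memory sketch of the six methods. `StorageHelperSketch` is a hypothetical stand-in that mirrors the signatures listed above (it throws `IOException` rather than the Linkis exception types), and the `mem://` scheme and `bml-store` prefix are invented for the example. The part worth noting is `upload`, which computes the MD5 digest while copying the stream and returns the byte count.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in mirroring the six listed methods; not the Linkis interface itself. */
interface StorageHelperSketch {
    long upload(String path, String user, InputStream in, StringBuilder md5Holder, boolean overwrite)
            throws IOException;
    String generatePath(String user, String fileName, Map<String, Object> properties);
    String getSchema();
    boolean checkIfExists(String path, String user) throws IOException;
    boolean checkBmlResourceStoragePrefixPathIfChanged(String path);
    void update(String path);
}

final class InMemoryHelperSketch implements StorageHelperSketch {
    private static final String SCHEMA = "mem://";    // invented scheme for this sketch
    private static final String PREFIX = "bml-store"; // assumed prefix; normally read from configuration
    private final Map<String, byte[]> objects = new ConcurrentHashMap<>();

    @Override
    public long upload(String path, String user, InputStream in, StringBuilder md5Holder, boolean overwrite)
            throws IOException {
        if (!overwrite && objects.containsKey(path)) {
            throw new IOException("Refusing to overwrite " + path);
        }
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md5.update(buffer, 0, read); // digest and copy in a single pass
            out.write(buffer, 0, read);
        }
        objects.put(path, out.toByteArray());
        if (md5Holder != null) {
            for (byte b : md5.digest()) {
                md5Holder.append(String.format("%02x", b));
            }
        }
        return out.size();
    }

    @Override
    public String generatePath(String user, String fileName, Map<String, Object> properties) {
        return SCHEMA + PREFIX + "/" + user + "/" + fileName;
    }

    @Override
    public String getSchema() {
        return SCHEMA;
    }

    @Override
    public boolean checkIfExists(String path, String user) {
        return objects.containsKey(path);
    }

    @Override
    public boolean checkBmlResourceStoragePrefixPathIfChanged(String path) {
        return !path.startsWith(SCHEMA + PREFIX); // true means the prefix no longer matches
    }

    @Override
    public void update(String path) {
        // Nothing to prepare for an in-memory map.
    }
}
```

A real helper would replace the map with SDK calls, but the same single-pass digest loop applies to any backend that accepts a stream.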

### Step 1: Implementing the ResourceHelper Interface

To add support for a cloud provider like Alibaba Cloud OSS, create a class implementing `ResourceHelper`. The implementation must handle SDK initialization, path generation with date-based organization, and MD5 computation:

```java
package org.apache.linkis.bml.common;

import org.apache.linkis.bml.conf.BmlServerConfiguration;
import java.io.InputStream;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Map;

public class OssResourceHelper implements ResourceHelper {
    private static final String SCHEMA = "oss://";
    
    @Override
    public long upload(String path, String user, InputStream inputStream, 
                       StringBuilder md5Holder, boolean overwrite) 
                       throws UploadResourceException {
        // TODO: initialize the OSS client using the Alibaba Cloud SDK,
        // write the stream to the bucket, and append the hex MD5 digest
        // to md5Holder when it is non-null.
        return 0; // Placeholder: a real implementation returns the number of bytes written
    }
    
    @Override
    public String generatePath(String user, String fileName, 
                               Map<String, Object> properties) {
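        // Group objects by upload date (yyyyMMdd) under the configured prefix.
        // This sketch assumes a BML_OSS_PREFIX entry exists in BmlServerConfiguration;
        // add one if your Linkis version does not define it.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        String date = fmt.format(new Date());
        return SCHEMA + BmlServerConfiguration.BML_OSS_PREFIX().getValue() 
               + "/" + user + "/bml/" + date + "/" + fileName;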
    }
    
    @Override
    public String getSchema() {
        return SCHEMA;
    }
    
    @Override
    public boolean checkIfExists(String path, String user) {
        // TODO: ask the OSS SDK whether an object exists at this path
        return false; // Placeholder result
    }
    
    @Override
    public boolean checkBmlResourceStoragePrefixPathIfChanged(String path) {
        String prefix = SCHEMA + BmlServerConfiguration.BML_OSS_PREFIX().getValue();
        return !path.startsWith(prefix);
    }
    
    @Override
    public void update(String path) {
        // Delegate to upload with overwrite=true
    }
}
```
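The date-based path logic can be isolated and tested without any SDK. The sketch below is an alternative formulation, assuming a hypothetical `OssPathSketch` class: it uses `java.time` with an injectable `Clock` instead of `SimpleDateFormat`, so the generated path is deterministic in tests. The prefix is passed in directly rather than read from `BmlServerConfiguration`.

```java
import java.time.Clock;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that isolates path generation and the prefix check. */
final class OssPathSketch {
    private static final String SCHEMA = "oss://";
    private final String prefix;
    private final Clock clock;

    OssPathSketch(String prefix, Clock clock) {
        this.prefix = prefix;
        this.clock = clock;
    }

    /** Builds oss://<prefix>/<user>/bml/<yyyyMMdd>/<fileName> using the injected clock. */
    String generatePath(String user, String fileName) {
        String date = LocalDate.now(clock).format(DateTimeFormatter.BASIC_ISO_DATE);
        return SCHEMA + prefix + "/" + user + "/bml/" + date + "/" + fileName;
    }

    /** Returns true when the path no longer starts with the configured prefix. */
    boolean prefixChanged(String path) {
        return !path.startsWith(SCHEMA + prefix);
    }
}
```

In production code the constructor would receive `Clock.systemDefaultZone()`; a fixed clock is only needed for tests.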

### Step 2: Registering in ResourceHelperFactory

Register the helper in [`linkis-public-enhancements/linkis-bml-server/src/main/java/org/apache/linkis/bml/common/ResourceHelperFactory.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-bml-server/src/main/java/org/apache/linkis/bml/common/ResourceHelperFactory.java):

```java
public class ResourceHelperFactory {
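    // HDFS_RESOURCE_HELPER, S3_RESOURCE_HELPER, and LOCAL_RESOURCE_HELPER referenced
    // below are assumed to be declared elsewhere in this class (omitted here).
    private static final ResourceHelper OSS_RESOURCE_HELPER = new OssResourceHelper();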
    
    public static ResourceHelper getResourceHelper() {
        String fsType = BmlServerConfiguration.BML_FILESYSTEM_TYPE().getValue();
        if ("hdfs".equals(fsType)) {
            return HDFS_RESOURCE_HELPER;
        } else if ("s3".equals(fsType)) {
            return S3_RESOURCE_HELPER;
        } else if ("oss".equals(fsType)) {
            return OSS_RESOURCE_HELPER;  // Custom helper
        } else {
            return LOCAL_RESOURCE_HELPER;
        }
    }
}
```
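If the number of backends grows, the `if`/`else` chain can be replaced by a lookup table. The following is one possible design and not how `ResourceHelperFactory` is written: `HelperRegistrySketch` is a hypothetical generic registry that maps a filesystem type to a helper and falls back to a default for unknown types, mirroring the final `else` branch above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

/** Hypothetical registry keyed by filesystem type; T would be ResourceHelper in Linkis. */
final class HelperRegistrySketch<T> {
    private final Map<String, T> helpers = new HashMap<>();
    private final T fallback;

    HelperRegistrySketch(T fallback) {
        this.fallback = Objects.requireNonNull(fallback, "fallback");
    }

    /** Registers a helper under a normalized type name; returns this for chaining. */
    HelperRegistrySketch<T> register(String fsType, T helper) {
        helpers.put(normalize(fsType), Objects.requireNonNull(helper, "helper"));
        return this;
    }

    /** Returns the helper for fsType, or the fallback when the type is null or unregistered. */
    T resolve(String fsType) {
        if (fsType == null) {
            return fallback;
        }
        return helpers.getOrDefault(normalize(fsType), fallback);
    }

    private static String normalize(String fsType) {
        return fsType.trim().toLowerCase(Locale.ROOT);
    }
}
```

Unlike the exact `equals` comparisons in the factory above, this sketch trims and lower-cases the type, which is a deliberate leniency you may or may not want.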

### Step 3: Configuring the Storage Backend

Add configuration properties in `linkis-bml-server/conf/bml.properties`:

```properties
wds.linkis.bml.filesystem.type=oss
wds.linkis.bml.oss.prefix=linkis-bml
```

The factory reads `BML_FILESYSTEM_TYPE` from `BmlServerConfiguration` at runtime and returns the matching pre-built helper instance.
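A misconfigured backend is easier to diagnose if it fails at startup. The sketch below illustrates a fail-fast check on the two properties shown above, using plain `java.util.Properties`. Linkis itself resolves these values through `BmlServerConfiguration`, so `BmlStorageSettingsSketch` is a hypothetical illustration of the validation logic, not a replacement for that class.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical validator for the storage-related properties used in this article. */
final class BmlStorageSettingsSketch {
    static final String TYPE_KEY = "wds.linkis.bml.filesystem.type";
    static final String OSS_PREFIX_KEY = "wds.linkis.bml.oss.prefix";

    final String fsType;
    final String ossPrefix; // null unless fsType is "oss"

    private BmlStorageSettingsSketch(String fsType, String ossPrefix) {
        this.fsType = fsType;
        this.ossPrefix = ossPrefix;
    }

    /** Parses properties text and rejects an "oss" type that has no prefix configured. */
    static BmlStorageSettingsSketch parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String type = props.getProperty(TYPE_KEY);
        if (type == null || type.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing " + TYPE_KEY);
        }
        type = type.trim();
        if ("oss".equals(type)) {
            String prefix = props.getProperty(OSS_PREFIX_KEY);
            if (prefix == null || prefix.trim().isEmpty()) {
                throw new IllegalArgumentException(OSS_PREFIX_KEY + " is required when type is oss");
            }
            return new BmlStorageSettingsSketch(type, prefix.trim());
        }
        return new BmlStorageSettingsSketch(type, null);
    }
}
```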

## Practical Usage Examples

### Uploading and Versioning Resources via BMLHelper

Using the Scala API to manage versioned resources:

```scala
import org.apache.linkis.filesystem.bml.BMLHelper

val bml = new BMLHelper()

// Initial upload creates version "0"
val uploadResult = bml.upload("alice", """{"config":"production"}""", "app.json")
val resourceId = uploadResult.get("resourceId").asInstanceOf[String]  // e.g., "bml-12345"
val version0 = uploadResult.get("version").asInstanceOf[String]       // "0"

// Update creates version "1"
val updateResult = bml.update("alice", resourceId, """{"config":"staging"}""")
val version1 = updateResult.get("version").asInstanceOf[String]       // "1"
```

### Retrieving Specific Versions and Rolling Back

Access historical versions or revert changes:

```scala
// Query specific version
val v0Data = bml.query("alice", resourceId, "0")
val inputStream = v0Data.get("stream").asInstanceOf[java.io.InputStream]

// Emulate a rollback through BMLHelper: re-upload the version 0 content,
// which creates a new version holding the old content
val rollbackResult = bml.update("alice", resourceId, """{"config":"production"}""")
```

All operations transparently use the configured `ResourceHelper` (OSS, HDFS, or S3) while maintaining consistent versioning semantics.

## Summary

- **BML resource versioning** assigns monotonically increasing version strings (starting at "0") to every upload and update operation; these values are carried in the protocol case classes defined in [`BmlProtocol.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/scala/org/apache/linkis/bml/protocol/BmlProtocol.scala).
- The **version lifecycle** includes upload, update, query (with optional version parameter), and rollback operations, all managed server-side and exposed through [`BMLHelper.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-publicservice/src/main/scala/org/apache/linkis/filesystem/bml/BMLHelper.scala).
- **Storage abstraction** is achieved via the `ResourceHelper` interface in [`ResourceHelper.java`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-bml-server/src/main/java/org/apache/linkis/bml/common/ResourceHelper.java), which decouples versioning logic from physical storage.
- **Custom implementations** require implementing six methods (upload, generatePath, getSchema, checkIfExists, checkBmlResourceStoragePrefixPathIfChanged, update), registering the instance in `ResourceHelperFactory`, and setting `wds.linkis.bml.filesystem.type`.
- The versioning mechanism remains identical regardless of storage backend (HDFS, S3, or a custom cloud provider) because version metadata is managed independently of byte storage.

## Frequently Asked Questions

### How does BML handle version conflicts during concurrent updates?

The BML server generates versions atomically using internal counters or timestamps, ensuring that concurrent `update` operations receive unique, sequential version strings without collisions. Clients receive the specific version assigned to their transaction in the `BmlUpdateResponse`.
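The counter-based approach can be illustrated with a per-resource atomic counter. This is a sketch of the general technique only; the article does not specify which mechanism the BML server uses, and `VersionAllocatorSketch` is a hypothetical name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical allocator: one atomic counter per resourceId. */
final class VersionAllocatorSketch {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Returns "0" on the first call for a resource, then "1", "2", ... with no duplicates across threads. */
    String nextVersion(String resourceId) {
        AtomicLong counter = counters.computeIfAbsent(resourceId, id -> new AtomicLong());
        return String.valueOf(counter.getAndIncrement());
    }
}
```

`computeIfAbsent` guarantees a single counter per resource, and `getAndIncrement` hands each caller a distinct value, so two concurrent updates can never receive the same version string. A multi-node server would need the same guarantee from shared storage, for example a database sequence or a unique constraint.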

### Can I migrate existing resources from HDFS to a custom cloud storage helper?

Yes. Since the versioning metadata (resourceId and version strings) is stored separately from the physical bytes, you can implement a migration script that reads existing resources via the HDFS helper and re-uploads them using your custom helper. The resources will receive new version histories in the target storage while maintaining the same logical resourceIds.

### What happens to historical versions when I delete a resource?

Deletion behavior depends on the server and storage implementation. The BML protocol supports version-specific deletion, but deleting a resource typically removes all of its versions from storage. Note that the six-method `ResourceHelper` contract described in this article does not include a delete method, so if you need to preserve historical versions after deletion, add archival logic wherever your deployment performs the physical delete.

### Is the version string format configurable or strictly numeric?

While the default implementation uses numeric strings (0, 1, 2...), the version field in [`BmlProtocol.scala`](https://github.com/apache/linkis/blob/main/linkis-public-enhancements/linkis-pes-common/src/main/scala/org/apache/linkis/bml/protocol/BmlProtocol.scala) is a String type, allowing custom server implementations to use timestamp-based or semantic versioning formats. However, the standard `BMLHelper` assumes monotonic ordering, so custom formats require corresponding client adjustments.
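One practical consequence of string-typed versions is that plain string comparison misorders numeric values once they reach two digits. The sketch below, using a hypothetical `VersionOrderSketch` class, compares numeric version strings by value; it assumes every version is a non-negative decimal integer.

```java
import java.math.BigInteger;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper for ordering numeric version strings by value. */
final class VersionOrderSketch {
    /** Orders "2" before "10", unlike String's natural (lexicographic) ordering. */
    static final Comparator<String> NUMERIC =
            (a, b) -> new BigInteger(a).compareTo(new BigInteger(b));

    /** Returns the highest version in a non-empty list of numeric version strings. */
    static String latest(List<String> versions) {
        return Collections.max(versions, NUMERIC);
    }
}
```

For the list `["2", "10", "9"]`, `latest` returns `"10"`, whereas `Collections.max` with natural string ordering would return `"9"`.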