# How to Use NLP Encoders and Pre-trained Models in TensorFlow Models

> Discover how to use NLP encoders and pre-trained models in TensorFlow Models. Leverage the `build_encoder` factory for BERT, ALBERT, and more with pre-trained weights.

- Repository: [tensorflow/models](https://github.com/tensorflow/models)
- Tags: tutorial
- Published: 2026-02-28

---

**The TensorFlow Models repository provides a unified `build_encoder` factory in [`official/nlp/configs/encoders.py`](https://github.com/tensorflow/models/blob/main/official/nlp/configs/encoders.py) that constructs any transformer-based encoder (BERT, ALBERT, BigBird, etc.) from a configuration object, with pre-trained weights loadable via TF-Hub or converted checkpoints.**

The `tensorflow/models` official NLP package simplifies working with **NLP encoders and pre-trained models** through a consistent configuration-driven API. Whether you need BERT for classification or BigBird for long-document processing, the repository offers a single entry point to construct, load, and fine-tune transformer architectures without rewriting boilerplate instantiation code.

## Configuring Encoder Architectures

All supported encoders are defined through **encoder configuration dataclasses** in [`official/nlp/configs/encoders.py`](https://github.com/tensorflow/models/blob/main/official/nlp/configs/encoders.py). Each architecture has its own dataclass, such as `BertEncoderConfig`, `AlbertEncoderConfig`, or `BigBirdEncoderConfig`, that exposes hyperparameters including `hidden_size`, `num_layers`, `num_attention_heads`, and `dropout_rate`.
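To get a feel for how these hyperparameters determine model size, the following back-of-the-envelope sketch counts the trainable parameters of a BERT-style encoder. The helper `bert_param_count` is hypothetical and not part of `tensorflow/models`. It assumes the standard BERT layout: token, position, and type embeddings followed by one layer norm; per-layer self-attention and feed-forward blocks, each followed by a layer norm; and a dense pooler. The defaults `vocab_size=30522` and `type_vocab_size=2` are assumptions that match the usual uncased BERT vocabulary.

```python
def bert_param_count(hidden_size, num_layers, intermediate_size,
                     max_position_embeddings, vocab_size=30522,
                     type_vocab_size=2):
    """Estimate trainable parameters of a BERT-style encoder (hypothetical helper)."""
    h, i = hidden_size, intermediate_size
    layer_norm = 2 * h  # scale and bias vectors
    # Token, position, and type embedding tables plus the embedding layer norm.
    embeddings = (vocab_size + max_position_embeddings + type_vocab_size) * h + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (h * h + h)
    # Two dense layers: hidden -> intermediate -> hidden.
    feed_forward = (h * i + i) + (i * h + h)
    per_layer = attention + layer_norm + feed_forward + layer_norm
    # Dense layer that produces the pooled output.
    pooler = h * h + h
    return embeddings + num_layers * per_layer + pooler


# BERT-base sized configuration
print(bert_param_count(hidden_size=768, num_layers=12,
                       intermediate_size=3072, max_position_embeddings=512))
```

Under these assumptions, `num_attention_heads` does not change the count, because the heads split the same `hidden_size`-wide projections.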

To select an encoder type, wrap the specific config inside the **`EncoderConfig`** (OneOfConfig) wrapper:

```python
from official.nlp.configs import encoders
from official.nlp.configs.encoders import EncoderConfig

my_cfg = EncoderConfig(
    type="bigbird",
    bigbird=encoders.BigBirdEncoderConfig(
        hidden_size=1024,
        num_layers=12,
        num_attention_heads=16,
        max_position_embeddings=4096,
        dropout_rate=0.1,
        norm_first=True,
    )
)
```

The `type` field determines which sub-config the factory reads. All other sub-configs are ignored, keeping the API simple while exposing every encoder’s full parameter set.
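The one-of selection described above can be illustrated with plain dataclasses. This is a minimal sketch of the pattern only, not the library's implementation: `BertLikeConfig`, `BigBirdLikeConfig`, and `OneOfEncoderConfig` are hypothetical stand-ins, and the `get()` method here simply mimics the idea of returning the sub-config named by `type`.

```python
from dataclasses import dataclass, field, fields


@dataclass
class BertLikeConfig:  # hypothetical stand-in for BertEncoderConfig
    num_layers: int = 12


@dataclass
class BigBirdLikeConfig:  # hypothetical stand-in for BigBirdEncoderConfig
    num_layers: int = 12
    block_size: int = 64


@dataclass
class OneOfEncoderConfig:  # hypothetical stand-in for EncoderConfig
    type: str = "bert"
    bert: BertLikeConfig = field(default_factory=BertLikeConfig)
    bigbird: BigBirdLikeConfig = field(default_factory=BigBirdLikeConfig)

    def get(self):
        """Return only the sub-config named by `type`; the others are ignored."""
        choices = {f.name for f in fields(self)} - {"type"}
        if self.type not in choices:
            raise ValueError(f"Unknown encoder type: {self.type!r}")
        return getattr(self, self.type)


cfg = OneOfEncoderConfig(type="bigbird", bigbird=BigBirdLikeConfig(num_layers=6))
selected = cfg.get()
print(type(selected).__name__, selected.num_layers)
```

The unselected `bert` sub-config still exists with its defaults, but a factory that reads only `cfg.get()` never looks at it.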

## Building Encoders with the Factory Pattern

The **`build_encoder`** function serves as the Gin-configurable factory that transforms configuration objects into ready-to-use `tf.keras.layers.Layer` instances:

```python
from official.nlp.configs.encoders import build_encoder

encoder = build_encoder(my_cfg)
```

When invoked, `build_encoder` does three things, as the source in [`official/nlp/configs/encoders.py`](https://github.com/tensorflow/models/blob/main/official/nlp/configs/encoders.py) shows:

- Resolves the network to build from `official.nlp.modeling.networks` (e.g., `networks.BertEncoder` for `type="bert"`, or an `EncoderScaffold` for `type="bigbird"`)
- Builds an embedding layer automatically or reuses one passed via the `embedding_layer=` argument
- Wires encoder-specific attention and mask classes (such as `layers.BigBirdAttention` and `layers.BigBirdMasks` for BigBird)

The returned layer produces a dictionary of outputs containing `sequence_output` and `pooled_output`, compatible with downstream task heads.

## Loading Pre-trained Weights

The repository supports two primary methods for loading **pre-trained weights** into constructed encoders: TensorFlow Hub modules and converted legacy checkpoints.

### Loading from TensorFlow Hub

For models available on TF-Hub (such as BERT-base), use the **`get_encoder_from_hub`** utility in [`official/nlp/tasks/utils.py`](https://github.com/tensorflow/models/blob/main/official/nlp/tasks/utils.py):

```python
from official.nlp.tasks import utils as task_utils

hub_path = "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4"
hub_encoder = task_utils.get_encoder_from_hub(hub_path)
```

This function constructs the three required input placeholders (`input_word_ids`, `input_mask`, `input_type_ids`), feeds them to a Hub KerasLayer, and returns a `tf.keras.Model` whose output dictionary format matches native encoders.
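To make the three inputs concrete, here is a minimal pure-Python sketch of how one sentence pair could be packed into them. The helper `pack_bert_inputs` is hypothetical, the token IDs are arbitrary, and the special-token IDs (`101` for `[CLS]`, `102` for `[SEP]`, `0` for padding) are assumptions that match the standard uncased BERT vocabulary. In practice a tokenizer or preprocessing model would produce these values.

```python
def pack_bert_inputs(tokens_a, tokens_b=None, seq_length=16,
                     cls_id=101, sep_id=102, pad_id=0):
    """Build padded encoder inputs from token-id lists (hypothetical helper)."""
    # [CLS] A [SEP] belongs to segment 0.
    ids = [cls_id] + list(tokens_a) + [sep_id]
    type_ids = [0] * len(ids)
    # Optional second segment: B [SEP] belongs to segment 1.
    if tokens_b:
        ids += list(tokens_b) + [sep_id]
        type_ids += [1] * (len(tokens_b) + 1)
    if len(ids) > seq_length:
        raise ValueError("sequence longer than seq_length; truncate first")
    pad = seq_length - len(ids)
    return {
        "input_word_ids": ids + [pad_id] * pad,
        "input_mask": [1] * len(ids) + [0] * pad,  # 1 = real token, 0 = padding
        "input_type_ids": type_ids + [0] * pad,
    }


example = pack_bert_inputs([7592, 2088], [2129, 2024, 2017], seq_length=10)
for name, values in example.items():
    print(name, values)
```

Each list would then be stacked into a batch and converted to an `int32` tensor before being passed to the encoder as a dictionary keyed by these three names.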

### Restoring from Legacy Checkpoints

When working with original TensorFlow 1 checkpoints, use the converter scripts in `official/nlp/tools/` to produce TF-2 compatible formats:

```bash
python -m official.nlp.tools.tf2_bert_encoder_checkpoint_converter \
    --bert_config_file=/tmp/bert_ckpt/bert_config.json \
    --checkpoint_to_convert=/tmp/bert_ckpt/bert_model.ckpt \
    --converted_checkpoint_path=/tmp/bert_tf2_ckpt
```

After conversion, restore weights into your built encoder:

```python
import tensorflow as tf

ckpt = tf.train.Checkpoint(encoder=encoder)
ckpt.restore("/tmp/bert_tf2_ckpt").expect_partial()
```

Similar converters exist for ALBERT ([`tf2_albert_encoder_checkpoint_converter.py`](https://github.com/tensorflow/models/blob/main/official/nlp/tools/tf2_albert_encoder_checkpoint_converter.py)) and other architectures.

## Integrating Encoders into Downstream Tasks

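Official NLP task configs embed an `EncoderConfig` in their model config, and the task calls `build_encoder` internally when it builds the model, so you rarely need to instantiate the encoder manually. The **Sentence Prediction Task** demonstrates this pattern:

```python
import tensorflow as tf
from official.nlp.tasks import sentence_prediction

task = sentence_prediction.SentencePredictionTask(
    sentence_prediction.SentencePredictionConfig(
        model=sentence_prediction.ModelConfig(encoder=my_cfg, num_classes=3),
        metric_type="accuracy",
    )
)

# build_model() calls build_encoder(my_cfg) and returns a Keras classifier.
model = task.build_model()

# The classifier outputs logits, hence from_logits=True.
# On TF >= 2.16, where Model Garden uses the `tf_keras` package,
# take the optimizer and loss from `tf_keras` instead of `tf.keras`.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```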

This approach ensures the encoder configuration remains centralized while the task handles input preprocessing, model assembly, and metric computation.

## Complete End-to-End Example

The following script demonstrates fine-tuning BERT on a classification task using the configuration-driven API:

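```python
import tensorflow as tf
from official.nlp.configs import encoders
from official.nlp.tasks import sentence_prediction

# Assumption: must match the sequence length used when the TFRecords were written.
SEQ_LEN = 128

# 1. Configure BERT-base
cfg = encoders.EncoderConfig(
    type="bert",
    bert=encoders.BertEncoderConfig(
        hidden_size=768,
        num_layers=12,
        num_attention_heads=12,
        intermediate_size=3072,
        dropout_rate=0.1,
        max_position_embeddings=512,
    )
)

# 2. Describe the downstream task; it calls build_encoder(cfg) internally
task_cfg = sentence_prediction.SentencePredictionConfig(
    # Optional: init_checkpoint="/path/to/bert_tf2_ckpt",
    model=sentence_prediction.ModelConfig(encoder=cfg, num_classes=3),
)
task = sentence_prediction.SentencePredictionTask(task_cfg)

# 3. Build the Keras classifier and restore init_checkpoint if one is set
model = task.build_model()
task.initialize(model)

# 4. Compile; the classifier outputs logits, hence from_logits=True.
# On TF >= 2.16, where Model Garden uses the `tf_keras` package,
# take the optimizer and loss from `tf_keras` instead of `tf.keras`.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# 5. Parse the TFRecords into (features, label) pairs.
# Assumption: records use the usual BERT fine-tuning feature names.
def parse(record):
    spec = {name: tf.io.FixedLenFeature([SEQ_LEN], tf.int64)
            for name in ("input_ids", "input_mask", "segment_ids")}
    spec["label_ids"] = tf.io.FixedLenFeature([], tf.int64)
    ex = tf.io.parse_single_example(record, spec)
    features = {
        "input_word_ids": tf.cast(ex["input_ids"], tf.int32),
        "input_mask": tf.cast(ex["input_mask"], tf.int32),
        "input_type_ids": tf.cast(ex["segment_ids"], tf.int32),
    }
    return features, ex["label_ids"]

# 6. Train
train_ds = tf.data.TFRecordDataset("train.tfrecord").map(parse).shuffle(1000).batch(32)
val_ds = tf.data.TFRecordDataset("dev.tfrecord").map(parse).batch(32)
model.fit(train_ds, epochs=3, validation_data=val_ds)
```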

## Summary

- **`build_encoder`** in [`official/nlp/configs/encoders.py`](https://github.com/tensorflow/models/blob/main/official/nlp/configs/encoders.py) is the central factory for constructing transformer encoders from configuration objects
- **EncoderConfig** uses a `type` field to select between architectures (BERT, ALBERT, BigBird) while ignoring unused sub-configs
- **Pre-trained weights** load via `get_encoder_from_hub` for TF-Hub models or checkpoint converters for legacy TF-1 weights
- **Task APIs** automatically invoke `build_encoder`, streamlining the path from configuration to training loop
- All encoder dataclasses expose full hyperparameter control including hidden sizes, attention heads, and normalization ordering

## Frequently Asked Questions

### How do I switch between different encoder architectures?

Change the `type` parameter in `EncoderConfig` and provide the corresponding sub-config. For example, set `type="albert"` and populate the `albert=` field with `AlbertEncoderConfig`, or use `type="bert"` with `BertEncoderConfig`. The factory automatically instantiates the correct network class from `official.nlp.modeling.networks` based on this selection.

### Can I load pre-trained weights from TensorFlow Hub?

Yes. Use `task_utils.get_encoder_from_hub(hub_url)` from [`official/nlp/tasks/utils.py`](https://github.com/tensorflow/models/blob/main/official/nlp/tasks/utils.py) to wrap a Hub module. This returns a Keras Model whose inputs and outputs match the native encoders. When a task should use a Hub encoder, set the task config's `hub_module_url` field to the Hub URL instead of relying on the encoder config; the task then calls `get_encoder_from_hub` itself. Specify at most one of `hub_module_url` and `init_checkpoint`.

### How do I convert legacy TensorFlow 1 checkpoints for TF2 encoders?

Run the architecture-specific converter scripts located in `official/nlp/tools/`. For BERT, execute `python -m official.nlp.tools.tf2_bert_encoder_checkpoint_converter` with the `--bert_config_file`, `--checkpoint_to_convert`, and `--converted_checkpoint_path` arguments. ALBERT and other encoders have equivalent converters. The output checkpoint restores into TF2 encoder instances using `tf.train.Checkpoint`.

### Should I use `build_encoder` directly or the Task API?

Use the **Task API** for standard fine-tuning workflows, as it handles the `build_encoder` call, model assembly, input pipelines, losses, and metrics for you. Call **`build_encoder`** directly when you need custom embedding layers, specialized weight restoration logic, or when integrating the encoder into non-standard architectures outside the official task framework.