# How to Implement Object Detection with RetinaNet Using TensorFlow Models

> Implement object detection with RetinaNet using TensorFlow Models. Configure the RetinaNet dataclass, build the model, and train with RetinaNetTask for automatic focal loss, anchor generation, and NMS.

- Repository: [tensorflow/models](https://github.com/tensorflow/models)
- Tags: tutorial
- Published: 2026-02-28

---

**You can implement object detection with RetinaNet by configuring the `RetinaNet` dataclass in [`official/vision/configs/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/configs/retinanet.py), assembling the model via `factory.build_retinanet`, and training through the `RetinaNetTask` class which handles focal loss, anchor generation, and NMS automatically.**

RetinaNet is a one-stage dense object detector that combines a backbone and FPN feature pyramid with focal-loss training, which lets it approach the accuracy of two-stage detectors while keeping single-pass inference. The TensorFlow Models repository provides a modular implementation that lets you build, train, and deploy RetinaNet without writing boilerplate code for anchor generation or post-processing. This guide walks through the architecture and provides code examples you can adapt to run RetinaNet on your own dataset.

## RetinaNet Architecture Components

The implementation in `tensorflow/models` follows a modular design where each component is instantiated through factory functions and wired together by the `RetinaNetModel` class.

- **Backbone**: Extracts multi-scale feature maps using networks like ResNet-50. Built by `backbones.factory.build_backbone` in [`official/vision/modeling/backbones/factory.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/backbones/factory.py).
- **FPN Decoder**: Merges backbone levels into a feature pyramid (P3-P7) via `decoders.factory.build_decoder` in [`official/vision/modeling/decoders/factory.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/decoders/factory.py).
- **RetinaNet Head**: Two parallel sub-heads for classification (num_classes × num_anchors scores) and box regression (4 × num_anchors offsets). Implemented in `dense_prediction_heads.RetinaNetHead` at [`official/vision/modeling/heads/dense_prediction_heads.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/heads/dense_prediction_heads.py).
- **Anchor Generator**: Generates multiscale anchor boxes for each pyramid level. Logic resides in `anchor.Anchor` within [`official/vision/ops/anchor.py`](https://github.com/tensorflow/models/blob/main/official/vision/ops/anchor.py), invoked automatically by `RetinaNetModel` when `anchor_boxes` are not supplied.
- **Detection Generator**: Performs box decoding, score thresholding, and Non-Maximum Suppression (NMS) via `detection_generator.MultilevelDetectionGenerator` in [`official/vision/modeling/layers/detection_generator.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/layers/detection_generator.py).
- **Task Controller**: `RetinaNetTask` in [`official/vision/tasks/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/tasks/retinanet.py) orchestrates the training loop, data loading, loss computation (focal + Huber), and metric tracking.
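
To get a feel for how dense the predictions are, the sketch below counts the anchors that the head scores for one image. It is a plain-Python illustration, not code from the repository, and it assumes that each pyramid level `l` has stride `2**l`, that the input size is a multiple of the largest stride, and that every location carries `num_scales × len(aspect_ratios)` anchors. The function names are hypothetical.

```python
def feature_map_sizes(height, width, min_level=3, max_level=7):
    """Return {level: (h, w)}, assuming level l has stride 2**l."""
    sizes = {}
    for level in range(min_level, max_level + 1):
        stride = 2 ** level
        # Assumes height and width are multiples of the stride.
        sizes[level] = (height // stride, width // stride)
    return sizes


def total_anchors(height, width, num_scales=3, num_aspect_ratios=3,
                  min_level=3, max_level=7):
    """Count anchors over all pyramid levels for one image."""
    anchors_per_location = num_scales * num_aspect_ratios
    sizes = feature_map_sizes(height, width, min_level, max_level)
    return sum(h * w * anchors_per_location for h, w in sizes.values())


sizes = feature_map_sizes(640, 640)
print(sizes[3], sizes[7])        # (80, 80) (5, 5)
print(total_anchors(640, 640))   # 76725
```

For a 640×640 input, the classification sub-head therefore produces `num_classes` scores for each of these anchors, and the box sub-head produces four offsets for each.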

## Building a RetinaNet Model

Start by defining a configuration object and invoking the factory. The `build_retinanet` function in [`official/vision/modeling/factory.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/factory.py) automatically constructs the backbone, decoder, head, and detection generator.

```python
from official.vision.configs import retinanet as retinanet_cfg
from official.vision.modeling import factory
import tensorflow as tf

# Configure the model
cfg = retinanet_cfg.RetinaNet()
cfg.num_classes = 91  # COCO category IDs run up to 90; index 0 is background
cfg.input_size = [640, 640, 3]
cfg.backbone.type = 'resnet'
cfg.backbone.resnet.model_id = 50  # ResNet-50; the ResNet config field is `model_id`
cfg.head.num_convs = 4
cfg.head.num_filters = 256
cfg.anchor.num_scales = 3
cfg.anchor.aspect_ratios = [0.5, 1.0, 2.0]

# Build the Keras model
input_spec = tf.keras.layers.InputSpec(shape=[None] + cfg.input_size)
model = factory.build_retinanet(input_spec, cfg)
```

## Preparing the Input Pipeline

RetinaNet expects TF-Example records containing bounding boxes and class IDs. The `retinanet_input.Parser` class in [`official/vision/dataloaders/retinanet_input.py`](https://github.com/tensorflow/models/blob/main/official/vision/dataloaders/retinanet_input.py) handles augmentation and anchor matching.

```python
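from official.common import dataset_fn
from official.vision.configs import retinanet as retinanet_cfg
from official.vision.dataloaders import input_reader
from official.vision.dataloaders import input_reader_factory
from official.vision.dataloaders import retinanet_input
from official.vision.dataloaders import tf_example_decoder

# Task config wrapping the model config built in the previous step
task_cfg = retinanet_cfg.RetinaNetTask(model=cfg)
task_cfg.train_data.input_path = '/path/to/train-*.tfrecord'  # placeholder path
task_cfg.train_data.global_batch_size = 64  # example value

# Decoder that turns serialized TF-Examples into tensors
decoder = tf_example_decoder.TfExampleDecoder()

# Initialize the parser with the same anchor config as the model
parser = retinanet_input.Parser(
    output_size=cfg.input_size[:2],
    min_level=cfg.min_level,
    max_level=cfg.max_level,
    num_scales=cfg.anchor.num_scales,
    aspect_ratios=cfg.anchor.aspect_ratios,
    anchor_size=cfg.anchor.anchor_size,
    dtype='float32',  # use 'bfloat16' only with a mixed_bfloat16 policy (e.g. on TPU)
    match_threshold=0.5,
    unmatched_threshold=0.5,
)

# Create the dataset reader
reader = input_reader_factory.input_reader_generator(
    params=task_cfg.train_data,
    dataset_fn=dataset_fn.pick_dataset_fn('tfrecord'),
    decoder_fn=decoder.decode,
    combine_fn=input_reader.create_combine_fn(task_cfg.train_data),
    parser_fn=parser.parse_fn(is_training=True)
)
train_dataset = reader.read()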
```

The parser and reader instantiation logic mirrors the implementation in `RetinaNetTask.build_inputs` in [`official/vision/tasks/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/tasks/retinanet.py).
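
The `match_threshold` and `unmatched_threshold` arguments control how anchors are labeled against ground-truth boxes. The following is a minimal pure-Python sketch of IoU-based matching under common conventions, not the parser's actual code: it assumes boxes in `[ymin, xmin, ymax, xmax]` format, and the return codes `-1` (background) and `-2` (ignored) are illustrative choices.

```python
def iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_anchors(anchors, gt_boxes, match_threshold=0.5,
                  unmatched_threshold=0.5):
    """Label each anchor with a ground-truth index, -1 (background) or -2 (ignored)."""
    matches = []
    for anchor in anchors:
        ious = [iou(anchor, gt) for gt in gt_boxes]
        if not ious:
            matches.append(-1)
            continue
        best = max(range(len(ious)), key=ious.__getitem__)
        if ious[best] >= match_threshold:
            matches.append(best)       # positive anchor
        elif ious[best] < unmatched_threshold:
            matches.append(-1)         # background anchor
        else:
            matches.append(-2)         # between thresholds: ignored
    return matches


anchors = [[0, 0, 10, 10], [0, 0, 10, 22], [50, 50, 60, 60]]
gt_boxes = [[0, 0, 10, 10]]
print(match_anchors(anchors, gt_boxes, 0.5, 0.4))  # [0, -2, -1]
```

With both thresholds set to 0.5, as in the parser call above, no anchor falls into the ignored band: every anchor is either positive or background.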

## Training Implementation

The `RetinaNetTask` class manages the training loop, aggregating focal loss for classification and Huber loss for box regression. Loss functions are defined in `official/vision/losses` and wired together in `RetinaNetTask.build_losses` in [`official/vision/tasks/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/tasks/retinanet.py).

```python
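from official.vision.tasks import retinanet as retinanet_task

# Initialize task and model
task = retinanet_task.RetinaNetTask(task_cfg)
task.initialize(model)  # Restores `init_checkpoint` (e.g. a pretrained backbone) if configured

# Build optimizer with the learning rate scaled linearly by batch size
optimizer = tf.keras.optimizers.SGD(
    learning_rate=0.32 * task_cfg.train_data.global_batch_size / 256.0,
    momentum=0.9
)

# Build metrics once, outside tf.function, so no variables are created while tracing
metrics = task.build_metrics(training=True)

# Training step reuses the task's logic
@tf.function
def train_step(batch):
    return task.train_step(batch, model, optimizer, metrics=metrics)

# Run training; training datasets typically repeat indefinitely, so bound each epoch
steps_per_epoch = 1000  # placeholder: roughly num_examples // global_batch_size
for epoch in range(12):
    for batch in train_dataset.take(steps_per_epoch):
        logs = train_step(batch)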
```
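
To make the box-regression term concrete, here is a minimal pure-Python sketch of the Huber loss and of how a classification loss and a weighted box loss can be summed into one training objective. It is not the repository's implementation: the `delta` and `box_loss_weight` defaults are illustrative assumptions, and the normalization by the number of positive anchors that detectors usually apply is left out.

```python
def huber(error, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error * error
    return delta * (abs_err - 0.5 * delta)


def box_loss(pred, target, delta=1.0):
    """Sum of Huber losses over the four encoded box coordinates."""
    return sum(huber(p - t, delta) for p, t in zip(pred, target))


def total_loss(cls_loss, box_loss_value, box_loss_weight=1.0):
    """Weighted sum of the classification and box-regression losses."""
    return cls_loss + box_loss_weight * box_loss_value


loss = box_loss([0.0, 0.0, 0.0, 0.0], [0.5, 2.0, 0.0, 0.0])
print(loss)                        # 1.625
print(total_loss(1.0, loss, 2.0))  # 4.25
```

The linear tail is what keeps a few badly localized anchors from dominating the gradient, which is the usual reason for preferring Huber over a plain squared error here.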

## Running Inference

For inference, the model accepts images and returns post-processed detections including NMS. The forward pass in `RetinaNetModel.call` in [`official/vision/modeling/retinanet_model.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/retinanet_model.py) handles anchor generation, head inference, and detection generation.

```python
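# Build the model for inference (optionally pass precomputed anchors);
# restore trained weights before running predictions
model = factory.build_retinanet(input_spec, cfg)

# Run inference on a batch of images [batch, H, W, 3] already resized to cfg.input_size.
# image_shape ([batch, 2], height and width) is passed so decoded boxes can be clipped.
image_shape = tf.tile(
    tf.constant([cfg.input_size[:2]], dtype=tf.float32), [tf.shape(images)[0], 1])
outputs = model(images, image_shape=image_shape, training=False)

# Extract results
boxes = outputs['detection_boxes']         # [batch, max_detections, 4]
scores = outputs['detection_scores']       # [batch, max_detections]
classes = outputs['detection_classes']     # [batch, max_detections]
num_detections = outputs['num_detections'] # [batch]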
```
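
The NMS step that the detection generator applies can be pictured with the following minimal sketch of greedy, class-agnostic NMS in plain Python. It is an illustration of the general technique only: the library's generator runs on batched tensors and offers several NMS variants, and the function names and the `[ymin, xmin, ymax, xmax]` box format here are assumptions.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5, max_detections=100):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
            if len(keep) == max_detections:
                break
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box survives.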

## Exporting to SavedModel and TFLite

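For production deployment, configure `ExportConfig` flags such as `output_normalized_coordinates=True` and `output_intermediate_features=False` in your task config. These flags are read by the export modules under `official/vision/serving`, so a plain `model.save` call like the one below does not apply them. TFLite-compatible post-processing is configured on the detection generator when the model is built in [`official/vision/modeling/factory.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/factory.py).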

```python
export_dir = '/tmp/retinanet_savedmodel'
model.save(export_dir, include_optimizer=False, signatures=None)

```
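
As a rough picture of what normalized output coordinates look like, the sketch below divides absolute pixel boxes by the image height and width. This is an assumption about the effect of `output_normalized_coordinates`, written in plain Python for illustration, and the helper name is hypothetical; check the serving module for the exact behavior.

```python
def normalize_boxes(boxes, image_height, image_width):
    """Scale [ymin, xmin, ymax, xmax] pixel boxes into the [0, 1] range."""
    return [
        [ymin / image_height, xmin / image_width,
         ymax / image_height, xmax / image_width]
        for ymin, xmin, ymax, xmax in boxes
    ]


print(normalize_boxes([[64, 128, 320, 640]], 640, 640))  # [[0.1, 0.2, 0.5, 1.0]]
```

Normalized boxes are convenient when the client resizes images before sending them, because the coordinates can be mapped back onto the original image without knowing the model's input size.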

## Summary

- **RetinaNet** in TensorFlow Models is assembled via `factory.build_retinanet` using configuration dataclasses defined in [`official/vision/configs/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/configs/retinanet.py).
- The architecture combines a **backbone**, **FPN decoder**, **classification and box-regression head**, **anchor generator**, and **detection generator** into a single `tf.keras.Model`.
- **Training** is managed by `RetinaNetTask`, which automatically handles focal loss, Huber loss, and dataset parsing via `retinanet_input.Parser`.
- **Inference** performs automatic anchor generation, box decoding, and NMS, returning ready-to-use detection boxes, scores, and class IDs.
- The implementation supports **export** to standard SavedModel and optimized TFLite formats for edge deployment.

## Frequently Asked Questions

### How does RetinaNet differ from Faster R-CNN in the TensorFlow Models repository?

RetinaNet is a one-stage detector that processes dense anchor boxes across pyramid levels in a single forward pass, while Faster R-CNN is a two-stage detector requiring a separate Region Proposal Network. According to the source code in [`official/vision/modeling/retinanet_model.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/retinanet_model.py), RetinaNet uses the `MultilevelDetectionGenerator` for post-processing rather than the RPN-based proposal mechanism found in Faster R-CNN implementations.

### Where is the focal loss implemented for RetinaNet training?

The focal loss implementation resides in `official/vision/losses`. The `RetinaNetTask` class aggregates this with Huber box regression loss in its `build_losses` method in [`official/vision/tasks/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/tasks/retinanet.py). The task computes these losses automatically inside `train_step`, requiring no manual loss configuration in `model.compile()`.
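
For intuition about what the focal loss computes, here is a minimal pure-Python sketch of the binary focal loss for a single anchor and class, following the formula from the RetinaNet paper, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). It is not the repository's implementation: the function name is hypothetical, and the defaults `alpha=0.25` and `gamma=2.0` are the paper's values, which may differ from the task config defaults.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss; p is the predicted probability of the positive class."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # confident, correct prediction
hard = focal_loss(0.1, 1)   # confident, wrong prediction
print(easy < hard)          # True
```

The `(1 - p_t) ** gamma` factor shrinks the loss of well-classified anchors, so the many easy background anchors contribute little and training focuses on the hard examples.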

### Can I modify anchor scales and aspect ratios without changing the source code?

Yes. Anchor parameters are controlled through the configuration dataclass in [`official/vision/configs/retinanet.py`](https://github.com/tensorflow/models/blob/main/official/vision/configs/retinanet.py). Adjust `cfg.anchor.num_scales`, `cfg.anchor.aspect_ratios`, and `cfg.anchor.anchor_size` before passing the config to `factory.build_retinanet`. The `Anchor` class in [`official/vision/ops/anchor.py`](https://github.com/tensorflow/models/blob/main/official/vision/ops/anchor.py) consumes these parameters to generate multiscale anchors for each FPN level.
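
The effect of these three parameters can be sketched in plain Python. The example below follows the common RetinaNet convention and is an assumption about the geometry, not a copy of the `Anchor` class: the base size at level `l` is `anchor_size * 2**l`, each scale multiplies it by `2**(scale / num_scales)`, and an aspect ratio is interpreted as width divided by height. The `anchor_size=4.0` default is likewise an assumed value.

```python
def anchor_shapes(level, num_scales=3, aspect_ratios=(0.5, 1.0, 2.0),
                  anchor_size=4.0):
    """Return (height, width) of every anchor placed at one location of a level."""
    stride = 2 ** level
    shapes = []
    for scale in range(num_scales):
        base = anchor_size * stride * 2 ** (scale / num_scales)
        for ratio in aspect_ratios:
            # Width grows and height shrinks by sqrt(ratio), keeping the area fixed.
            width = base * ratio ** 0.5
            height = base / ratio ** 0.5
            shapes.append((height, width))
    return shapes


shapes = anchor_shapes(level=3)
print(len(shapes))  # 9
print(shapes[1])    # (32.0, 32.0)
```

Under these assumptions, raising `anchor_size` enlarges every anchor, raising `num_scales` adds intermediate sizes between pyramid levels, and extending `aspect_ratios` adds more elongated shapes at each location.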

### What backbones are supported for RetinaNet in this implementation?

The implementation supports multiple backbones including ResNet, SpineNet, and MobileNet. `build_retinanet` in [`official/vision/modeling/factory.py`](https://github.com/tensorflow/models/blob/main/official/vision/modeling/factory.py) calls `backbones.factory.build_backbone`, which instantiates the backbone based on `cfg.backbone.type`. You can select the ResNet variant (e.g., ResNet-50 vs ResNet-101) via `cfg.backbone.resnet.model_id` in your configuration object.