Terraform Data vs Resource: When to Use Data Sources and Managed Resources

Use a Terraform data source to read existing infrastructure you do not manage, and use a resource when Terraform must create, update, or destroy infrastructure that it owns.

The distinction between data sources and resources is fundamental to how HashiCorp Terraform models infrastructure. While both appear as blocks in your configuration, they serve opposite purposes in the Terraform state lifecycle and dependency graph. Understanding the difference lets you use Terraform's read-only queries without accidentally triggering destructive operations on existing cloud objects.

Core Architectural Differences

Terraform's provider protocol defines separate RPCs for these constructs. Managed resources go through PlanResourceChange and ApplyResourceChange (plus ReadResource for refresh), while data sources are served by a single ReadDataSource call. Their schemas are exposed separately, under resource_schemas and data_source_schemas in the provider's schema response, as defined in internal/tfplugin6/tfplugin6.pb.go【tfplugin6.pb.go†L25-L30】.

The fundamental differences break down as follows:

  • Management Goal: Resources create, update, or delete infrastructure that Terraform owns and manages. Data sources read information about objects outside Terraform's control, such as existing cloud objects or computed values.

  • Lifecycle: Resources maintain a full lifecycle (create → read → update → delete). Terraform stores an ID, tracks drift, and performs destroy actions. Data sources have no lifecycle—they are read-only and Terraform invokes the provider's Read method on every plan/apply without writing back to the provider.

  • State Representation: Resource instances persist state entries with non-empty id values that Terraform carries from one run to the next. Data source results are rebuilt on every read rather than updated in place; the implementation in internal/legacy/helper/schema/resource.go injects a placeholder ID ("-") if the provider does not set one, preserving the invariant that every object has an ID【resource.go†L97-L112】.

Dependency Graph and Side Effects

When the Terraform planner builds the dependency graph, it treats data source nodes as read-only vertices that have no external side effects. The dataDependsOn helper in internal/terraform/transform_reference.go only adds edges to managed resources, explicitly skipping data sources because they cannot cause side-effects【transform_reference.go†L26-L31】.

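In practice, a data source is ordered by the references in its arguments, so depends_on is rarely needed. It is still honored when present: per Terraform's documentation, depends_on in a data block defers the read until the listed dependencies have had their changes applied, which matters when a dependency is not visible through references. Resources, by contrast, generate graph edges for both create ordering and destroy ordering, so other objects can depend on their side effects.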
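
As a minimal sketch of the hidden-dependency case, assume a data source that finds instances by tag instead of referencing the resource that creates them. The resource names and the Role tag are hypothetical, and the AMI ID is a placeholder.

```hcl
# Hypothetical managed instance; nothing below references its attributes.
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0" # placeholder AMI ID
  instance_type = "t3.micro"

  tags = {
    Role = "web"
  }
}

# The lookup matches by tag, so Terraform cannot infer the ordering
# from references. depends_on makes the dependency explicit and defers
# the read until aws_instance.web has been applied.
data "aws_instances" "web" {
  instance_tags = {
    Role = "web"
  }

  depends_on = [aws_instance.web]
}
```

The trade-off is that the data source's values are unknown during planning whenever the dependency has pending changes, so a direct reference is preferable where one exists.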

Removal and Destruction Semantics

Resources can be destroyed via terraform destroy or by removing the block from configuration, triggering a full teardown of the underlying infrastructure. Data sources are never destroyed—removing a data source block only clears it from the state without affecting the underlying object【remove_target.go†L87-L88】.

This distinction is enforced in internal/addrs/remove_target.go, which validates that data sources cannot be targeted for removal operations because they lack destroy semantics.
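
For comparison, Terraform 1.7 and later accept a removed block for managed resources that should leave state without being torn down. The sketch below assumes a hypothetical aws_instance.db resource whose block has already been deleted from configuration; a data source address would not be accepted in from.

```hcl
# Forget aws_instance.db: drop it from state but leave the
# real instance running in the cloud account.
removed {
  from = aws_instance.db

  lifecycle {
    destroy = false
  }
}
```

Setting destroy = true instead would make Terraform destroy the instance on the next apply, which is the same outcome as simply deleting the resource block.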

When to Use a Data Source

Choose a data source when:

  • You need information about an existing entity that you do not own (e.g., an AWS AMI, a DNS zone, a remote state output).
  • The value is computed or derived from other resources (e.g., the subnet IDs in a VPC, via aws_subnets).
  • You only require read-only access and never intend to create, modify, or destroy the object via Terraform.
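
A short sketch of the first case, assuming a Route 53 hosted zone that another team or configuration manages. The zone name, record name, and IP address are placeholders.

```hcl
# Look up a hosted zone that is managed outside this configuration.
data "aws_route53_zone" "primary" {
  name = "example.com."
}

# Terraform owns only this record, not the zone it lives in.
resource "aws_route53_record" "www" {
  zone_id = data.aws_route53_zone.primary.zone_id
  name    = "www.${data.aws_route53_zone.primary.name}"
  type    = "A"
  ttl     = 300
  records = ["203.0.113.10"] # documentation-range address
}
```

Destroying this configuration removes the record but leaves the hosted zone untouched, because the zone was only ever read.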

When to Use a Resource

Choose a resource when:

  • Terraform should create the object, manage its attributes, and optionally destroy it later.
  • You need full lifecycle control (create → update → delete).
  • The object must be tracked in state so that drift detection and taint/import operations work.
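
When the object already exists but Terraform should take ownership of it, an import block (Terraform 1.5 and later) brings it into state as a managed resource. This is a sketch with a hypothetical resource name and placeholder IDs; the resource arguments would need to match the real instance.

```hcl
# Adopt an existing instance into state on the next apply.
import {
  to = aws_instance.legacy
  id = "i-0123456789abcdef0" # placeholder instance ID
}

# Once imported, Terraform tracks drift on this instance and
# will destroy it if this block is later removed.
resource "aws_instance" "legacy" {
  ami           = "ami-0c55b159cbfafe1f0" # placeholder AMI ID
  instance_type = "t3.medium"
}
```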

Practical Configuration Examples

Reading Existing Infrastructure with a Data Source

The following configuration looks up an existing Ubuntu AMI without managing it:

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

Terraform executes the aws_ami data source through the provider's ReadDataSource RPC, populating the id attribute without creating the AMI.

Creating Managed Infrastructure with a Resource

This resource block creates an EC2 instance that Terraform owns:

resource "aws_instance" "db" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = "t3.medium"

  tags = {
    Name = "Database"
  }
}

Terraform stores the instance ID in state and can later destroy it when the block is removed.

Combining Data Sources and Resources

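Data sources often feed computed values into resources:

# aws_subnets replaces the deprecated aws_subnet_ids data source.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }
}

# Look up each subnet so its CIDR block is available.
data "aws_subnet" "private" {
  for_each = toset(data.aws_subnets.private.ids)
  id       = each.value
}

resource "aws_security_group" "allow_private" {
  name        = "allow_private"
  description = "Allow traffic from private subnets"
  vpc_id      = var.vpc_id

  ingress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    # cidr_blocks expects CIDR ranges, not subnet IDs.
    cidr_blocks = [for s in data.aws_subnet.private : s.cidr_block]
  }
}

Here aws_subnets returns the subnet IDs in the VPC, aws_subnet resolves each ID to its CIDR block, and the security group resource consumes the resulting read-only list.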

Key Implementation Files

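The architectural split between these constructs is visible in several core files:

  • internal/tfplugin6/tfplugin6.pb.go: the provider protocol, with separate RPCs and schema maps for resources and data sources.
  • internal/legacy/helper/schema/resource.go: the legacy SDK read path for data sources, including the placeholder ID.
  • internal/terraform/transform_reference.go: the graph transformer that contains the dataDependsOn helper.
  • internal/addrs/remove_target.go: validation of removal targets, which excludes data sources.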

Summary

  • Data sources are read-only queries that fetch existing infrastructure data without managing lifecycle; they use placeholder IDs and have no side effects.
  • Resources manage the full CRUD lifecycle of infrastructure, persist state with real IDs, and generate dependency graph edges for ordering.
  • Use data sources for lookups (AMIs, subnet IDs, remote states) and resources for objects Terraform must create, modify, or destroy.
  • The planner in transform_reference.go treats data sources as pure reads, while resource.go handles their state construction differently from managed resources.

Frequently Asked Questions

Can a Terraform data source create or modify infrastructure?

No. Data sources are strictly read-only constructs that invoke the provider's ReadDataSource RPC. They cannot trigger create, update, or delete operations. According to the implementation in internal/legacy/helper/schema/resource.go, each data source result is built from scratch on every read rather than derived from a prior state【resource.go†L97-L105】.

What happens if I remove a data source block from my configuration?

Removing a data source block simply clears it from the Terraform state file. Unlike resources, data sources are never destroyed because they do not manage underlying infrastructure【remove_target.go†L87-L88】. The external object the data source referenced remains unaffected.

Should I use depends_on with data sources?

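Usually not. A data source that references another object's attributes is already ordered after that object. depends_on is still honored, though: the dataDependsOn helper in internal/terraform/transform_reference.go collects depends_on targets that are managed resources and skips other data sources, since those cannot cause side effects【transform_reference.go†L26-L31】. Per Terraform's documentation, depends_on in a data block defers the read until the listed dependencies have been applied, so reserve it for dependencies that Terraform cannot infer from references.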

How does Terraform handle drift detection for data sources versus resources?

Terraform tracks drift for resources by comparing the current state against the configuration and remote API responses, allowing it to detect manual changes. Data sources have no drift detection—they are recomputed from scratch during every plan/apply cycle, ensuring the values are always current but never "corrected" since they are read-only.
