Terraform Data vs Resource: When to Use Data Sources and Managed Resources
Use a Terraform data source to read existing infrastructure you do not manage, and use a resource when Terraform must create, update, or destroy infrastructure that it owns.
The distinction between data sources and resources is fundamental to how HashiCorp Terraform models infrastructure. Both appear as blocks in your configuration, but they play opposite roles in the Terraform state lifecycle and dependency graph. Knowing which one to reach for lets you query existing cloud objects read-only without accidentally putting them under Terraform's create, update, and destroy control.
Core Architectural Differences
Terraform's provider protocol defines separate RPCs for these constructs. Resources are changed through ApplyResourceChange, while data sources are read through ReadDataSource. Their schemas are exposed separately, under resource_schemas and data_source_schemas respectively, in the provider metadata defined in internal/tfplugin6/tfplugin6.pb.go【tfplugin6.pb.go†L25-L30】.
The fundamental differences break down as follows:
- Management Goal: Resources create, update, or delete infrastructure that Terraform owns and manages. Data sources read information about objects outside Terraform's control, such as existing cloud objects or computed values.
- Lifecycle: Resources maintain a full lifecycle (create → read → update → delete). Terraform stores an ID, tracks drift, and performs destroy actions. Data sources have no lifecycle: they are read-only, and Terraform invokes the provider's Read method on every plan/apply without writing back to the provider.
- State Representation: Resource instances persist state entries with non-empty id values. Data sources derive state on the fly; the implementation in internal/legacy/helper/schema/resource.go injects a placeholder ID ("-") if the provider does not set one, preserving the invariant that every object has an ID【resource.go†L97-L112】.
Dependency Graph and Side Effects
When the Terraform planner builds the dependency graph, it treats data source nodes as read-only vertices that have no external side effects. The dataDependsOn helper in internal/terraform/transform_reference.go only adds edges to managed resources, explicitly skipping data sources because they cannot cause side-effects【transform_reference.go†L26-L31】.
In practice, this means a data source is normally sequenced by the references in its own arguments and rarely needs a depends_on declaration. Conversely, resources generate explicit graph edges for ordering and destroy-ordering, allowing other resources to depend on their side effects.
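Reference-based ordering can be sketched as follows. This is a minimal illustration, not taken from the Terraform source: the resource names, the CIDR range, and the choice of the aws_route_tables data source are assumptions made for the example.

```hcl
# Managed resource: Terraform creates and owns this VPC.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# Data source: because vpc_id references aws_vpc.main.id, Terraform reads
# the route tables only after the VPC exists. No depends_on is declared.
data "aws_route_tables" "main" {
  vpc_id = aws_vpc.main.id
}

# The read-only result can then flow into outputs or other resources.
output "route_table_ids" {
  value = data.aws_route_tables.main.ids
}
```

Because the VPC ID is unknown until the VPC is created, the read in this sketch would be deferred from plan to apply on the first run.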
Removal and Destruction Semantics
Resources can be destroyed via terraform destroy, or by removing the block from configuration and applying, which tears down the underlying infrastructure. Data sources are never destroyed: removing a data source block only clears it from the state without affecting the underlying object【remove_target.go†L87-L88】.
This distinction is enforced in internal/addrs/remove_target.go, which validates that data sources cannot be targeted for removal operations because they lack destroy semantics.
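For a managed resource, deleting the block normally plans a destroy. The sketch below assumes Terraform 1.7 or later, where removed blocks are available, and a hypothetical aws_instance.legacy whose resource block has already been deleted from the configuration. It shows how to drop the object from state while leaving the real instance running:

```hcl
# Stop managing the instance without destroying it.
# aws_instance.legacy is a hypothetical address used for illustration.
removed {
  from = aws_instance.legacy

  lifecycle {
    destroy = false # forget the object in state; leave the infrastructure in place
  }
}
```

No equivalent is needed for a data source: deleting its block is already non-destructive.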
When to Use a Data Source
Choose a data source when:
- You need information about an existing entity that you do not own (e.g., an AWS AMI, a DNS zone, a remote state output).
- The value is computed or derived from other resources (e.g., the subnet IDs of a VPC, read through aws_subnets, which replaced the deprecated aws_subnet_ids).
- You only require read-only access and never intend to create, modify, or destroy the object via Terraform.
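As an illustration of the first case, the following sketch reads a DNS zone that another team or stack owns and creates a managed record inside it. The zone name, record name, and IP address are placeholder assumptions.

```hcl
# Read-only lookup of a hosted zone that this configuration does not own.
data "aws_route53_zone" "shared" {
  name         = "example.com."
  private_zone = false
}

# Managed record created inside the existing zone.
resource "aws_route53_record" "app" {
  zone_id = data.aws_route53_zone.shared.zone_id
  name    = "app.${data.aws_route53_zone.shared.name}"
  type    = "A"
  ttl     = 300
  records = ["203.0.113.10"] # documentation-range address, not a real host
}
```

Destroying this configuration would delete the record but leave the hosted zone untouched, since only the record is a resource.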
When to Use a Resource
Choose a resource when:
- Terraform should create the object, manage its attributes, and optionally destroy it later.
- You need full lifecycle control (create → update → delete).
- The object must be tracked in state so that drift detection and taint/import operations work.
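If an object already exists but Terraform should own it from now on, a resource plus an import is the usual route rather than a data source. A minimal sketch, assuming Terraform 1.5 or later (which supports import blocks) and a hypothetical bucket name:

```hcl
# Bring an existing bucket under Terraform management.
# "example-logs-bucket" is a placeholder; use the real bucket name.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket"
}

# After import, Terraform tracks this bucket in state, detects drift,
# and will destroy it if the block is later removed and applied.
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}
```

Once imported, the bucket follows the full resource lifecycle described above, so treat the switch from data source to resource as a change of ownership.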
Practical Configuration Examples
Reading Existing Infrastructure with a Data Source
The following configuration looks up an existing Ubuntu AMI without managing it:
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}
Terraform executes the aws_ami data source through the provider's ReadDataSource RPC, populating the id attribute without creating the AMI.
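One practical consequence of re-reading the data source on every plan is that, with most_recent = true, a newly published image changes data.aws_ami.ubuntu.id, and a changed ami argument normally forces the instance to be replaced. The variant of the instance block below is a sketch under the assumption that running instances should stay on the AMI they launched with; it would replace the aws_instance.web block above, not sit alongside it.

```hcl
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  lifecycle {
    # Ignore later changes to the looked-up AMI so that a new image
    # release does not plan a replacement of this instance.
    ignore_changes = [ami]
  }
}
```

Whether to pin like this or accept rolling replacement is a design choice; the data source itself behaves the same either way.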
Creating Managed Infrastructure with a Resource
This resource block creates an EC2 instance that Terraform owns:
resource "aws_instance" "db" {
  ami           = "ami-0c55b159cbfafe1f0" # hardcoded example; AMI IDs are region-specific
  instance_type = "t3.medium"

  tags = {
    Name = "Database"
  }
}
Terraform stores the instance ID in state and can later destroy it when the block is removed.
Combining Data Sources and Resources
Data sources often feed computed values into resources:
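data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [var.vpc_id]
  }
}
# Look up each subnet to get its CIDR block; subnet IDs are not valid cidr_blocks values.
data "aws_subnet" "private" {
  for_each = toset(data.aws_subnets.private.ids)
  id       = each.value
}
resource "aws_security_group" "allow_private" {
  name        = "allow_private"
  description = "Allow traffic from private subnets"
  vpc_id      = var.vpc_id # create the group in the same VPC as the subnets
  ingress {
    from_port   = 0
    to_port     = 65535
    protocol    = "tcp"
    cidr_blocks = [for s in data.aws_subnet.private : s.cidr_block]
  }
}
Here the aws_subnets and aws_subnet data sources supply read-only values, first the subnet IDs and then each subnet's CIDR block, which the managed security group consumes. aws_subnets takes the place of aws_subnet_ids, which newer releases of the AWS provider no longer ship.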
Key Implementation Files
The architectural split between these constructs is visible in several core files:
- internal/legacy/helper/schema/resource.go: Implements ReadDataApply and shows how data sources are built from scratch with placeholder ID handling【resource.go†L97-L112】.
- internal/terraform/transform_reference.go: Contains the dataDependsOn helper that explains why data sources are treated as side-effect-free in the dependency graph【transform_reference.go†L26-L31】.
- internal/addrs/remove_target.go: Validates that data sources cannot be targeted for removal because they are never destroyed【remove_target.go†L87-L88】.
- internal/tfplugin6/tfplugin6.pb.go: Defines the provider metadata structure exposing DataSourceSchemas alongside ResourceSchemas【tfplugin6.pb.go†L25-L30】.
Summary
- Data sources are read-only queries that fetch existing infrastructure data without managing lifecycle; they use placeholder IDs and have no side effects.
- Resources manage the full CRUD lifecycle of infrastructure, persist state with real IDs, and generate dependency graph edges for ordering.
- Use data sources for lookups (AMIs, subnet IDs, remote states) and resources for objects Terraform must create, modify, or destroy.
- The planner in transform_reference.go treats data sources as pure reads, while resource.go handles their state construction differently from managed resources.
Frequently Asked Questions
Can a Terraform data source create or modify infrastructure?
No. Data sources are strictly read-only constructs that invoke the provider's ReadDataSource RPC. They cannot trigger create, update, or delete operations. According to the implementation in internal/legacy/helper/schema/resource.go, data sources are built from scratch on each read with no persisted state between operations【resource.go†L97-L105】.
What happens if I remove a data source block from my configuration?
Removing a data source block simply clears it from the Terraform state file. Unlike resources, data sources are never destroyed because they do not manage underlying infrastructure【remove_target.go†L87-L88】. The external object the data source referenced remains unaffected.
Should I use depends_on with data sources?
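Usually not. Data sources are ordered by the references in their arguments, and the dataDependsOn logic in internal/terraform/transform_reference.go skips data sources as dependency targets because they have no side effects【transform_reference.go†L26-L31】. An explicit depends_on in a data block is not ignored, though: Terraform's documentation describes it as deferring the read until all changes to the listed dependencies have been applied. Reserve it for hidden dependencies that ordinary references cannot express, since it can push the read, and every value derived from it, from plan time to apply time.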
How does Terraform handle drift detection for data sources versus resources?
Terraform tracks drift for resources by comparing the current state against the configuration and remote API responses, allowing it to detect manual changes. Data sources have no drift detection—they are recomputed from scratch during every plan/apply cycle, ensuring the values are always current but never "corrected" since they are read-only.