What causes configuration drift, and how do you detect and reconcile it?

Quick Answer

Drift happens when real infrastructure diverges from what's recorded in state: someone changes a setting in the cloud console, an auto-scaling process modifies a resource, or another tool touches the same object. Detect it by running `terraform plan` (or `terraform plan -refresh-only`) regularly or in CI; the command queries real infrastructure and reports any unexpected diffs. Reconcile it either by re-applying (Terraform reverts the manual change to the configured state) or, if the manual change should be kept, by updating configuration to match reality and running `terraform apply -refresh-only` to accept the new values into state.

Detailed Answer

Configuration drift is the gap between what Terraform's state file believes is true and what's actually running in the real infrastructure.

Common causes

  • A teammate manually edits a resource in the cloud console ("just this once, to fix production quickly").
  • An external automated process modifies a resource Terraform also manages (an autoscaler adjusting instance count, a security tool auto-remediating a misconfigured setting).
  • Another Terraform configuration or tool touches the same underlying resource.
  • A resource is deleted outside of Terraform (e.g., by a cleanup script or another engineer), so state still references something that no longer exists.

Detecting drift

terraform plan

By default, plan first refreshes its in-memory view of each managed resource by querying the provider's API, then diffs that refreshed data against configuration. If the console change altered something your configuration also specifies, plan reports an unexpected diff — e.g., "tags will be updated" when you didn't touch tags in code, which is a signal someone changed it manually.

For a refresh-only check that doesn't also propose config-driven changes:

terraform plan -refresh-only

This isolates just the drift (state vs. reality) from any intentional changes you've made in configuration, which is useful for scheduled drift-detection jobs in CI.
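A scheduled check along these lines could be sketched as the shell functions below. This is a minimal sketch, not a definitive implementation: the function names and the `TF_BIN` override are hypothetical. It relies on the documented `-detailed-exitcode` meanings for `terraform plan` (0 = empty diff, 1 = error, 2 = non-empty diff), and it assumes exit code 2 is also returned when a refresh-only plan detects drift, which is worth confirming on your Terraform version.

```shell
#!/bin/sh
# Sketch of a drift check for a scheduled CI job.
# Function names and TF_BIN are hypothetical, chosen for illustration.

# Map a `terraform plan -detailed-exitcode` exit status to a label.
# Documented meanings: 0 = empty diff, 1 = error, 2 = non-empty diff.
classify_plan_exit() {
  case "$1" in
    0) echo "clean" ;;
    2) echo "drift" ;;
    *) echo "error" ;;
  esac
}

# Run a refresh-only plan and print one label on stdout.
# Plan output is sent to stderr so it still shows up in the job log.
# TF_BIN (assumed override) lets the function be exercised without Terraform.
check_drift() {
  status=0
  "${TF_BIN:-terraform}" plan -refresh-only -detailed-exitcode \
    -input=false -no-color >&2 || status=$?
  classify_plan_exit "$status"
}
```

In a nightly job, something like `[ "$(check_drift)" = "clean" ] || exit 1` would fail the run, and so trigger an alert, on either drift or a plan error.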

Reconciling drift

Two directions, depending on which side should "win":

  1. Terraform should win: re-run a normal apply. Terraform reverts the manual change to what configuration specifies, restoring the intended state.
  2. The manual change should be kept: update your .tf configuration to match the new reality, then run terraform apply -refresh-only to accept the drifted values into state without triggering an actual infrastructure change. A follow-up terraform plan should then report no changes.
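The two directions can be sketched as a pair of shell functions. This is an illustration under stated assumptions rather than a prescribed workflow: the function names, the `tfplan` file name, and the `DRY_RUN` switch are all hypothetical. `DRY_RUN` defaults to 1, so the commands are printed instead of executed until you opt in.

```shell
#!/bin/sh
# Sketch only: function names, the tfplan file name, and DRY_RUN are hypothetical.

# Print the command when DRY_RUN=1 (the default); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Direction 1: configuration wins.
# Save the plan for review, then apply exactly that plan to revert the manual change.
config_wins() {
  run terraform plan -out=tfplan &&
  run terraform apply tfplan
}

# Direction 2: keep the manual change.
# Edit the .tf files to match reality first, then accept the drifted values into
# state and confirm that a normal plan has nothing left to change (exit status 0).
keep_manual_change() {
  run terraform apply -refresh-only &&
  run terraform plan -detailed-exitcode
}
```

Calling `config_wins` with the default settings only prints the two commands it would run; setting `DRY_RUN=0` executes them, and `terraform apply` still asks for interactive approval unless told otherwise.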

Why this matters operationally

Unmanaged drift erodes the entire premise of IaC — if the console can silently diverge from configuration, plan output stops being trustworthy. Mature teams run scheduled drift-detection (plan -refresh-only in a nightly CI job, alerting on any diff) and restrict console access precisely so Terraform-managed resources stay Terraform-managed.
