Question 1

How do you manage secrets in Terraform configurations safely?

Accepted Answer

Never hardcode secrets in `.tf`/`.tfvars` files or commit them to version control. Instead, pull secrets at apply time from a dedicated secrets manager (Vault, AWS Secrets Manager, Azure Key Vault) via a data source, or inject them as environment variables (`TF_VAR_db_password`) from your CI system's secret store. Mark the corresponding variable `sensitive = true` so Terraform redacts it from CLI/plan output (note this does **not** encrypt it in the state file, which still needs a secured, encrypted backend). Keep tfvars files containing real secrets out of git entirely, using `.gitignore` and per-environment secret injection instead.
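As a minimal sketch of the two patterns above, assuming the AWS provider and a secret that already exists in AWS Secrets Manager (the secret name, resource names, and sizes here are hypothetical):

```hcl
# Pattern 1: a sensitive input variable with no default.
# The value is expected to arrive as TF_VAR_db_password from the CI secret store.
variable "db_password" {
  type      = string
  sensitive = true # redacted in plan/apply output, but still stored in state
}

# Pattern 2: read the secret at plan/apply time from a secrets manager.
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/db/password" # hypothetical secret name
}

resource "aws_db_instance" "main" {
  identifier          = "example-db" # hypothetical
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  username            = "app"
  password            = data.aws_secretsmanager_secret_version.db.secret_string
  skip_final_snapshot = true
}
```

A real configuration would normally use one pattern or the other for a given secret. Either way the password ends up in the state file, which is why the encrypted, access-controlled backend still matters.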

Question 2

How do you integrate Terraform into a CI/CD pipeline safely?

Accepted Answer

A typical pipeline runs `terraform fmt -check` and `terraform validate` on every PR, then `terraform plan` and posts the plan as a PR comment for human review — never auto-applying straight from a PR. `apply` runs only on merge to the main branch (or via a manual approval gate), using a service principal/role with least-privilege access and a remote backend with locking so concurrent pipeline runs can't race. Pin the Terraform CLI and provider versions in CI to match local dev, and store the plan file as a build artifact so the exact reviewed plan is what gets applied ("plan then apply the same plan"), not a re-computed one.
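The pinning and locking practices might look like the following sketch, assuming an S3 backend with DynamoDB locking (bucket, table, and key names are hypothetical, and the version constraints are only examples):

```hcl
terraform {
  # Pin the CLI so CI and local development run the same release series.
  required_version = "~> 1.7.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40" # example constraint, adjust to what you have tested
    }
  }

  # Remote backend with locking so concurrent pipeline runs cannot race.
  backend "s3" {
    bucket         = "example-tf-state" # hypothetical
    key            = "prod/network/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-tf-locks" # hypothetical lock table
    encrypt        = true
  }
}
```

For the "apply the same plan" step, the pipeline can run `terraform plan -out=tfplan`, store `tfplan` as the artifact, and later run `terraform apply tfplan`, which applies that saved plan rather than computing a new one.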

Question 3

What is Terraform Cloud/Enterprise, and what problems does it solve over open-source Terraform?

Accepted Answer

Terraform Cloud (HCP Terraform) and Terraform Enterprise are HashiCorp's managed/self-hosted platforms built around the same core engine, adding: remote state storage with locking out of the box, remote/consistent plan-apply execution (so runs aren't tied to one engineer's laptop or a single CI runner), a private module/provider registry, policy-as-code enforcement (Sentinel/OPA) before apply, run history/audit logs, and role-based access control across teams and workspaces. Teams adopt it to replace a hand-rolled combination of an S3 backend, a CI pipeline, and ad-hoc access control with a single opinionated, governed platform.
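Pointing an existing configuration at the platform is a small change. A minimal sketch, assuming Terraform 1.1 or later and hypothetical organization and workspace names:

```hcl
terraform {
  cloud {
    organization = "example-org" # hypothetical organization

    workspaces {
      name = "networking-prod" # hypothetical workspace
    }
  }
}
```

With this block in place, state storage and locking are handled by the workspace instead of a self-managed backend. Whether runs execute remotely or locally depends on the workspace's execution mode setting.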

Question 4

How do you test Terraform code?

Accepted Answer

Layered testing: `terraform validate` and `fmt -check` catch syntax/style issues fast in CI. `terraform plan` review (manual or automated diff-checking) catches unexpected resource changes before apply. For real correctness testing, `terraform test` (built-in, HCL-based) or Terratest (Go) actually `apply` configuration against real (or LocalStack-mocked) infrastructure, assert on outputs/resource properties, then tear it down — verifying the module truly provisions what it claims to. Static analysis/security scanners (`tflint`, `checkov`, `tfsec`) round this out by catching misconfigurations (open security groups, unencrypted storage) before they ever reach `plan`.
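For the built-in framework, a test file could look roughly like this. It is a sketch that assumes Terraform 1.6 or later and a module under test that declares a `bucket_name` variable and an `aws_s3_bucket.this` resource (both hypothetical):

```hcl
# tests/bucket.tftest.hcl (hypothetical path)

# Input values passed to the module under test.
variables {
  bucket_name = "example-test-bucket"
}

run "bucket_name_is_applied" {
  # Creates real infrastructure, checks the assertions,
  # and destroys what it created when the test run finishes.
  command = apply

  assert {
    condition     = aws_s3_bucket.this.bucket == var.bucket_name
    error_message = "Bucket name did not match the input variable."
  }
}
```

Running `terraform test` from the module directory picks up files with the `.tftest.hcl` extension. Setting `command = plan` instead gives a faster check that never creates resources, at the cost of only seeing values known at plan time.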

Question 5

What are common Terraform anti-patterns and best practices for large teams?

Accepted Answer

Anti-patterns: one giant monolithic state file for the entire org (a mistake in one team's resource can block or corrupt everyone's plan, and applies get slow); hardcoded values instead of variables/data sources; unpinned provider/module versions causing surprise breakage; secrets committed to `.tfvars`; manual console changes alongside Terraform-managed resources, causing drift. Best practices: split state per environment/service to limit blast radius, pin all versions, enforce `fmt`/`validate`/`plan` in CI before merge, use remote state with locking, keep modules small and composable, and require PR review for every `apply`-triggering change — treat infrastructure changes with the same rigor as application code.
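One way split states can still share values is the `terraform_remote_state` data source. The sketch below assumes a separate "network" configuration that stores its state in S3 and exports a `vpc_id` output (bucket, key, and names are hypothetical):

```hcl
# In the "app" configuration, which has its own, separate state file.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-tf-state"               # hypothetical
    key    = "prod/network/terraform.tfstate" # hypothetical
    region = "us-east-1"
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  # Only root-level outputs of the other configuration are readable here.
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

The app team can plan and apply without touching the network state, so a mistake on either side stays contained. The trade-off is that readers need access to the whole remote state object, so some teams prefer to publish shared values through another channel.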

Question 6

Explain the difference between Terraform and Terragrunt at a high level.

Accepted Answer

Terraform is the core IaC engine and language. Terragrunt is a thin wrapper around Terraform that adds features the core language doesn't have: DRY backend/provider configuration shared across many environments, automatic dependency ordering between separate Terraform modules/state files (`dependency` blocks), and convenient `run-all` commands to plan/apply many modules at once. Teams reach for Terragrunt when a multi-environment, many-module setup starts accumulating painful duplication in backend blocks and inter-module wiring that native Terraform (even with `for_each`/modules) doesn't cleanly solve — it's an additive tool on top of Terraform, not a replacement for it.
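A per-environment Terragrunt file illustrating those features might look like this sketch. The directory layout, the `root.hcl` file name, and the `vpc_id` output are assumptions for illustration:

```hcl
# live/prod/app/terragrunt.hcl (hypothetical layout)

# Inherit shared backend/provider settings from a parent-folder file.
include "root" {
  path = find_in_parent_folders("root.hcl")
}

# The plain Terraform module this unit deploys.
terraform {
  source = "../../../modules/app"
}

# Declare that this unit depends on the separately-applied VPC unit.
dependency "vpc" {
  config_path = "../vpc"
}

# Wire the other unit's outputs in as input variables.
inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

The module under `modules/app` stays ordinary Terraform. Terragrunt only supplies the backend configuration, the inputs, and the ordering between units.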

Question 7

How do you debug a failing or unexpected Terraform run — what do `terraform console`, `terraform graph`, `terraform show`, and `TF_LOG` give you?

Accepted Answer

`terraform console` opens a read-eval-print loop for evaluating arbitrary HCL expressions against the current configuration and state, which is useful for testing a tricky `for` expression or function call without a full `apply`. `terraform graph` emits the dependency graph in DOT format (renderable with Graphviz), helpful for visualizing why Terraform ordered operations the way it did. `terraform show` prints the current state (or a saved plan file) in human-readable form, or as JSON with `-json`, useful for scripting or inspecting exactly what's recorded for a resource. For deeper issues such as a provider crash or a mysterious hang, setting `TF_LOG=DEBUG` (or `TRACE` for maximum verbosity, optionally with `TF_LOG_PATH` to write to a file) surfaces the underlying provider RPC calls and HTTP requests Terraform Core is making, which is usually where the real root cause is found.
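As an illustration of the kind of expression worth checking in the console, here is a small sketch with hypothetical local values:

```hcl
locals {
  # Hypothetical input map, for illustration only.
  subnets = {
    a = "10.0.1.0/24"
    b = "10.0.2.0/24"
  }

  # A `for` expression combining string templates and a function call.
  # cidrhost(prefix, 1) returns the first host address in each range.
  subnet_labels = [
    for key, cidr in local.subnets : "subnet-${key}-${cidrhost(cidr, 1)}"
  ]
}
```

With this in the configuration, entering `local.subnet_labels` at the `terraform console` prompt evaluates the expression immediately, so a mistake shows up without running a plan or an apply.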

Question 8

What does the `-target` flag do, and why is it discouraged for routine use?

Accepted Answer

Passing `-target=aws_instance.web` to `terraform plan` or `terraform apply` restricts Terraform to that resource and the resources it depends on, skipping everything else in the configuration. That is useful in genuine emergencies, like fixing one broken resource without waiting for an unrelated, slow, or currently-broken part of the plan to also process. It's discouraged as routine practice because it produces a plan that's *not* a full reconciliation of configuration against state: resources outside the target can silently drift further out of sync, and repeated reliance on `-target` is often a sign that the state's blast radius is too large and the configuration should be split into smaller, independently-applied units.

Question 9

How does Terraform compare to Pulumi and native IaC tools like CloudFormation/ARM templates?

Accepted Answer

CloudFormation (AWS) and ARM/Bicep (Azure) are **native, single-cloud** IaC tools maintained by the cloud provider itself — deep, immediate support for that cloud's newest features, but no multi-cloud story and typically YAML/JSON-based (Bicep and CDK aside). **Pulumi** takes a different approach from Terraform's: instead of HCL, you write actual general-purpose code (TypeScript, Python, Go, C#), giving you real loops, functions, and your language's own tooling/testing ecosystem, while still being multi-cloud and provider-plugin-based similarly to Terraform. Terraform sits in between: multi-cloud like Pulumi, but using a purpose-built declarative language (HCL) rather than a general-purpose one, with the largest, most mature ecosystem of community providers and modules. The practical choice often comes down to team preference (a dedicated IaC language vs. code in a language you already know) and whether multi-cloud portability matters versus staying deeply integrated with a single provider's native tooling.

Terraform in Production

How do you manage secrets in Terraform configurations safely?

What not to do

Better patterns

Remaining caveats

Interview-ready summary

Related Resources

How do you integrate Terraform into a CI/CD pipeline safely?

A typical pipeline shape

Key practices

Why this matters

Related Resources

What is Terraform Cloud/Enterprise, and what problems does it solve over open-source Terraform?

What open-source Terraform alone requires you to build

Pointing a configuration at Terraform Cloud

What Terraform Cloud/Enterprise adds out of the box

When teams reach for it

Related Resources

How do you test Terraform code?

Layer 1 — static checks (fast, run on every commit)

Layer 2 — security/best-practice linting

Layer 3 — plan review

Layer 4 — real integration testing

Why the layered approach matters

Related Resources

What are common Terraform anti-patterns and best practices for large teams?

Common anti-patterns

Best practices for large teams

Interview-ready summary

Related Resources

Explain the difference between Terraform and Terragrunt at a high level.

Terraform — the core engine and language

Terragrunt — a thin wrapper that reduces multi-environment boilerplate

When to reach for it

Interview-ready summary

Related Resources

How do you debug a failing or unexpected Terraform run — what do `terraform console`, `terraform graph`, `terraform show`, and `TF_LOG` give you?

terraform console

terraform graph

terraform show

TF_LOG environment variable

Putting it together

Related Resources

What does the `-target` flag do, and why is it discouraged for routine use?

What it does

Legitimate emergency use case

Why it's discouraged as routine practice

The right mental model

Related Resources

How does Terraform compare to Pulumi and native IaC tools like CloudFormation/ARM templates?

CloudFormation (AWS) and ARM/Bicep (Azure) — native, single-cloud

Pulumi — multi-cloud, general-purpose code

Terraform — multi-cloud, purpose-built language

How teams actually decide

Related Resources
