terraform-skill

Terraform infrastructure-as-code best practices

name: terraform-skill
description: "Terraform infrastructure as code best practices"
license: Apache-2.0
metadata:
author: "Anton Babenko"
version: 1.5.0
source: "https://github.com/antonbabenko/terraform-skill"
risk: safe

Terraform Skill for Claude

Comprehensive Terraform and OpenTofu guidance covering testing, modules, CI/CD, and production patterns. Based on terraform-best-practices.com and enterprise experience.

When to Use This Skill

Activate this skill when:

  • Creating new Terraform or OpenTofu configurations or modules

  • Setting up testing infrastructure for IaC

  • Deciding between testing approaches (validate, plan, frameworks)

  • Structuring multi-environment deployments

  • Implementing CI/CD for infrastructure-as-code

  • Reviewing or refactoring existing Terraform/OpenTofu projects

  • Choosing between module patterns or state management approaches

Don't use this skill for:

  • Basic Terraform/OpenTofu syntax questions (Claude knows this)

  • Provider-specific API reference (link to docs instead)

  • Cloud platform questions unrelated to Terraform/OpenTofu

    Core Principles

    1. Code Structure Philosophy

    Module Hierarchy:

    | Type | When to Use | Scope |
    | Resource Module | Single logical group of connected resources | VPC + subnets, Security group + rules |
    | Infrastructure Module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
    | Composition | Complete infrastructure | Spans multiple regions/accounts |

    Hierarchy: Resource → Resource Module → Infrastructure Module → Composition

    Directory Structure:

    environments/        # Environment-specific configurations
    ├── prod/
    ├── staging/
    └── dev/

    modules/             # Reusable modules
    ├── networking/
    ├── compute/
    └── data/

    examples/            # Module usage examples (also serve as tests)
    ├── complete/
    └── minimal/

    Key principle from terraform-best-practices.com:

  • Separate environments (prod, staging) from modules (reusable components)

  • Use examples/ as both documentation and integration test fixtures

  • Keep modules small and focused (single responsibility)

    For detailed module architecture, see: Code Patterns: Module Types & Hierarchy
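
    As a rough illustration of the hierarchy above, a composition under environments/prod/ might wire reusable modules together as sketched below. The module paths follow the directory layout shown earlier. The input and output names (name, cidr_block, vpc_id, private_subnet_ids) are hypothetical and depend on how each module is actually written.

```hcl
# environments/prod/main.tf (illustrative sketch, not part of the skill itself)

module "networking" {
  source = "../../modules/networking"

  # Hypothetical inputs; real names depend on the module's variables.tf
  name       = "prod"
  cidr_block = "10.0.0.0/16"
}

module "compute" {
  source = "../../modules/compute"

  # Hypothetical outputs exposed by the networking module
  vpc_id     = module.networking.vpc_id
  subnet_ids = module.networking.private_subnet_ids
}
```

    Keeping the environment directory limited to module calls like these leaves the reusable logic in modules/ and the environment-specific values in one place.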

    2. Naming Conventions

    Resources:

    # Good: Descriptive, contextual
    resource "aws_instance" "web_server" { }
    resource "aws_s3_bucket" "application_logs" { }

    # Good: "this" for singleton resources (only one of that type)
    resource "aws_vpc" "this" { }
    resource "aws_security_group" "this" { }

    # Avoid: Generic names for non-singletons
    resource "aws_instance" "main" { }
    resource "aws_s3_bucket" "bucket" { }

    Singleton Resources:

    Use "this" when your module creates only one resource of that type:

    ✅ DO:

    resource "aws_vpc" "this" {}           # Module creates one VPC
    resource "aws_security_group" "this" {} # Module creates one SG

    ❌ DON'T use "this" for multiple resources:

    resource "aws_subnet" "this" {}  # If creating multiple subnets

    Use descriptive names when creating multiple resources of the same type.

    Variables:

    # Prefix with context when needed
    var.vpc_cidr_block # Not just "cidr"
    var.database_instance_class # Not just "instance_class"

    Files:

  • main.tf - Primary resources

  • variables.tf - Input variables

  • outputs.tf - Output values

  • versions.tf - Provider versions

  • data.tf - Data sources (optional)

    Testing Strategy Framework

    Decision Matrix: Which Testing Approach?

    | Your Situation | Recommended Approach | Tools | Cost |
    | Quick syntax check | Static analysis | terraform validate, fmt | Free |
    | Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
    | Terraform 1.6+, simple logic | Native test framework | Built-in terraform test | Free-Low |
    | Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
    | Security/compliance focus | Policy as code | OPA, Sentinel | Free |
    | Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocking | Free |
    | Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |

    Testing Pyramid for Infrastructure

            /\
           /  \          End-to-End Tests (Expensive)
          /____\         - Full environment deployment
         /      \        - Production-like setup
        /________\
       /          \      Integration Tests (Moderate)
      /____________\     - Module testing in isolation
     /              \    - Real resources in test account
    /________________\   Static Analysis (Cheap)
                         - validate, fmt, lint
                         - Security scanning

    Native Test Best Practices (1.6+)

    Before generating test code:

  • Validate schemas with Terraform MCP:
    - Search provider docs → Get resource schema → Identify block types

  • Choose the correct command mode:
    - command = plan - Fast, for input validation
    - command = apply - Required for computed values and set-type blocks

  • Handle set-type blocks correctly:
    - Cannot index with [0]
    - Use for expressions to iterate
    - Or use command = apply to materialize

    Common patterns:

  • S3 encryption rules: set (use for expressions)

  • Lifecycle transitions: set (use for expressions)

  • IAM policy statements: set (use for expressions)
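
    The set-handling advice above can be sketched as a native test. This is a minimal example, assuming the module under test declares a resource named aws_s3_bucket_server_side_encryption_configuration.this with the algorithm set directly in configuration. The file name and the "aws:kms" expectation are illustrative assumptions.

```hcl
# tests/encryption.tftest.hcl (hypothetical file name)

run "encryption_uses_kms" {
  # plan is enough here because sse_algorithm is set in configuration,
  # not computed by the provider
  command = plan

  assert {
    # "rule" is a set-type block, so rule[0] is not valid.
    # Iterate with for expressions instead.
    condition = alltrue([
      for rule in aws_s3_bucket_server_side_encryption_configuration.this.rule :
      alltrue([
        for sse in rule.apply_server_side_encryption_by_default :
        sse.sse_algorithm == "aws:kms"
      ])
    ])
    error_message = "Every encryption rule should use aws:kms."
  }
}
```

    If the asserted value were only known after creation, switching to command = apply would be the fallback described above.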

    For detailed testing guides, see:

  • Testing Frameworks Guide - Deep dive into static analysis, native tests, and Terratest

  • Quick Reference - Decision flowchart and command cheat sheet

    Code Structure Standards

    Resource Block Ordering

    Strict ordering for consistency:

  • count or for_each FIRST (blank line after)

  • Other arguments

  • tags as last real argument

  • depends_on after tags (if needed)

  • lifecycle at the very end (if needed)

    # ✅ GOOD - Correct ordering
    resource "aws_nat_gateway" "this" {
      count = var.create_nat_gateway ? 1 : 0

      allocation_id = aws_eip.this[0].id
      subnet_id     = aws_subnet.public[0].id

      tags = {
        Name = "${var.name}-nat"
      }

      depends_on = [aws_internet_gateway.this]

      lifecycle {
        create_before_destroy = true
      }
    }

    Variable Block Ordering

  • description (ALWAYS required)

  • type

  • default

  • validation

  • nullable (when setting to false)

    variable "environment" {
      description = "Environment name for resource tagging"
      type        = string
      default     = "dev"

      validation {
        condition     = contains(["dev", "staging", "prod"], var.environment)
        error_message = "Environment must be one of: dev, staging, prod."
      }

      nullable = false
    }

    For complete structure guidelines, see: Code Patterns: Block Ordering & Structure

    Count vs For_Each: When to Use Each

    Quick Decision Guide

    | Scenario | Use | Why |
    | Boolean condition (create or don't) | count = condition ? 1 : 0 | Simple on/off toggle |
    | Simple numeric replication | count = 3 | Fixed number of identical resources |
    | Items may be reordered/removed | for_each = toset(list) | Stable resource addresses |
    | Reference by key | for_each = map | Named access to resources |
    | Multiple named resources | for_each | Better maintainability |

    Common Patterns

    Boolean conditions:

    # ✅ GOOD - Boolean condition
    resource "aws_nat_gateway" "this" {
      count = var.create_nat_gateway ? 1 : 0
      # ...
    }

    Stable addressing with for_each:

    # ✅ GOOD - Removing "us-east-1b" only affects that subnet
    resource "aws_subnet" "private" {
      for_each = toset(var.availability_zones)

      availability_zone = each.key
      # ...
    }

    # ❌ BAD - Removing middle AZ recreates all subsequent subnets
    resource "aws_subnet" "private" {
      count = length(var.availability_zones)

      availability_zone = var.availability_zones[count.index]
      # ...
    }

    For migration guides and detailed examples, see: Code Patterns: Count vs For_Each
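
    When an existing count-based resource is migrated to for_each, moved blocks (Terraform 1.1+) can map each old index to its new key so that Terraform updates the address in state instead of destroying and recreating the resource. A minimal sketch, assuming var.availability_zones was ["us-east-1a", "us-east-1b"] when the subnets were created:

```hcl
# Map the old count indexes to the new for_each keys.
# The zone names are assumptions for illustration.
moved {
  from = aws_subnet.private[0]
  to   = aws_subnet.private["us-east-1a"]
}

moved {
  from = aws_subnet.private[1]
  to   = aws_subnet.private["us-east-1b"]
}
```

    Running terraform plan afterwards should report the moves rather than a destroy/create pair. If it does not, the index-to-key mapping is worth rechecking before applying.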

    Locals for Dependency Management

    Use locals to ensure correct resource deletion order:

    # Problem: Subnets might be deleted after CIDR blocks, causing errors
    # Solution: Use try() in locals to hint deletion order

    locals {
      # References secondary CIDR first, falling back to VPC
      # Forces Terraform to delete subnets before CIDR association
      vpc_id = try(
        aws_vpc_ipv4_cidr_block_association.this[0].vpc_id,
        aws_vpc.this.id,
        ""
      )
    }

    resource "aws_vpc" "this" {
      cidr_block = "10.0.0.0/16"
    }

    resource "aws_vpc_ipv4_cidr_block_association" "this" {
      count = var.add_secondary_cidr ? 1 : 0

      vpc_id     = aws_vpc.this.id
      cidr_block = "10.1.0.0/16"
    }

    resource "aws_subnet" "public" {
      vpc_id     = local.vpc_id # Uses local, not direct reference
      cidr_block = "10.1.0.0/24"
    }

    Why this matters:

  • Prevents deletion errors when destroying infrastructure

  • Ensures correct dependency order without explicit depends_on

  • Particularly useful for VPC configurations with secondary CIDR blocks

    For detailed examples, see: Code Patterns: Locals for Dependency Management

    Module Development

    Standard Module Structure

    my-module/
    ├── README.md           # Usage documentation
    ├── main.tf             # Primary resources
    ├── variables.tf        # Input variables with descriptions
    ├── outputs.tf          # Output values
    ├── versions.tf         # Provider version constraints
    ├── examples/
    │   ├── minimal/        # Minimal working example
    │   └── complete/       # Full-featured example
    └── tests/              # Test files
        └── module_test.tftest.hcl  # Or .go

    Best Practices Summary

    Variables:

  • ✅ Always include description

  • ✅ Use explicit type constraints

  • ✅ Provide sensible default values where appropriate

  • ✅ Add validation blocks for complex constraints

  • ✅ Use sensitive = true for secrets

    Outputs:

  • ✅ Always include description

  • ✅ Mark sensitive outputs with sensitive = true

  • ✅ Consider returning objects for related values

  • ✅ Document what consumers should do with each output
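
    The variable and output practices above might look like the following in a module. This is a sketch only: it assumes the module declares a resource named aws_db_instance.this, and the variable names, default value, and validation rule are illustrative rather than prescribed by the skill.

```hcl
variable "database_password" {
  description = "Master password for the database. Supply it from a secrets store, not a committed file."
  type        = string
  sensitive   = true # Redacted in plan/apply output

  nullable = false
}

variable "database_instance_class" {
  description = "Instance class for the database"
  type        = string
  default     = "db.t3.micro"

  validation {
    condition     = startswith(var.database_instance_class, "db.")
    error_message = "Instance class must start with \"db.\"."
  }
}

# Related values returned as one object, with guidance for consumers
output "database" {
  description = "Database connection details. Pass address and port to the application configuration."
  value = {
    address = aws_db_instance.this.address
    port    = aws_db_instance.this.port
  }
}
```

    Note that sensitive = true hides the value in CLI output but does not keep it out of the state file, so state storage still needs to be protected.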

    For detailed module patterns, see:

  • Module Patterns Guide - Variable best practices, output design, ✅ DO vs ❌ DON'T patterns

  • Quick Reference - Resource naming, variable naming, file organization

    CI/CD Integration

    Recommended Workflow Stages

  • Validate - Format check + syntax validation + linting

  • Test - Run automated tests (native or Terratest)

  • Plan - Generate and review execution plan

  • Apply - Execute changes (with approvals for production)

    Cost Optimization Strategy

  • Use mocking for PR validation (free)

  • Run integration tests only on main branch (controlled cost)

  • Implement auto-cleanup (prevent orphaned resources)

  • Tag all test resources (track spending)
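
    One way to tag every test resource without touching each resource block is the AWS provider's default_tags setting. The sketch below assumes the tests run against AWS. The region, tag keys, and tag values are placeholders to adapt.

```hcl
# Provider configuration used only by test or example configurations
provider "aws" {
  region = "us-east-1" # placeholder region

  default_tags {
    tags = {
      Environment = "test"           # hypothetical tag values
      ManagedBy   = "terraform-test"
      Purpose     = "ci-integration"
    }
  }
}
```

    Tags applied this way give cleanup jobs and cost reports a consistent key to filter on.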

    For complete CI/CD templates, see:

  • CI/CD Workflows Guide - GitHub Actions, GitLab CI, Atlantis integration, cost optimization

  • Quick Reference - Common CI/CD issues and solutions

    Security & Compliance

    Essential Security Checks

    # Static security scanning
    trivy config .
    checkov -d .

    Common Issues to Avoid

    Don't:

  • Store secrets in plaintext variables or committed .tfvars files

  • Use default VPC

  • Skip encryption

  • Open security groups to 0.0.0.0/0

    Do:

  • Use AWS Secrets Manager / Parameter Store

  • Create dedicated VPCs

  • Enable encryption at rest

  • Use least-privilege security groups
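
    As an illustration of the least-privilege point, the sketch below restricts HTTPS ingress to a caller-supplied CIDR and rejects 0.0.0.0/0 at validation time. The variable names and the port are assumptions made for the example.

```hcl
variable "vpc_id" {
  description = "ID of the dedicated VPC that hosts the service"
  type        = string
}

variable "allowed_ingress_cidr" {
  description = "CIDR block allowed to reach the service over HTTPS"
  type        = string

  validation {
    condition     = var.allowed_ingress_cidr != "0.0.0.0/0"
    error_message = "Ingress must not be open to 0.0.0.0/0."
  }
}

resource "aws_security_group" "this" {
  name_prefix = "app-"
  vpc_id      = var.vpc_id
}

# One narrowly scoped rule instead of a wide-open group
resource "aws_vpc_security_group_ingress_rule" "https" {
  security_group_id = aws_security_group.this.id

  cidr_ipv4   = var.allowed_ingress_cidr
  from_port   = 443
  to_port     = 443
  ip_protocol = "tcp"
}
```

    A validation block catches the mistake before plan, while trivy or checkov would catch it in hard-coded rules that bypass the variable.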

    For detailed security guidance, see:

  • Security & Compliance Guide - Trivy/Checkov integration, secrets management, state file security, compliance testing

    Version Management

    Version Constraint Syntax

    version = "5.0.0"   # Exact (avoid - inflexible)
    version = "~> 5.0"  # Recommended: any 5.x release (>= 5.0, < 6.0)
    version = ">= 5.0"  # Minimum only (risky - allows breaking major releases)

    Strategy by Component

    | Component | Strategy | Example |
    | Terraform | Pin major version, set minimum minor | required_version = "~> 1.9" |
    | Providers | Pin major version | version = "~> 5.0" |
    | Modules (prod) | Pin exact version | version = "5.1.2" |
    | Modules (dev) | Allow minor and patch updates | version = "~> 5.1" |
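
    Put together, the strategies in the table could be expressed in a versions.tf and a module call roughly as follows. The provider and the registry module shown are examples only. The module inputs are placeholders, and the pinned version number simply mirrors the table.

```hcl
# versions.tf
terraform {
  required_version = "~> 1.9" # >= 1.9, < 2.0

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # >= 5.0, < 6.0
    }
  }
}

# main.tf (production): exact module pin, placeholder inputs
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.1.2"

  name = "prod"
  cidr = "10.0.0.0/16"
}
```

    A constraint with three segments, such as "~> 5.1.0", would restrict updates to patch releases only, which is the tighter option when minor releases are not wanted.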

    Update Workflow

    # Lock versions initially
    terraform init              # Creates .terraform.lock.hcl

    # Update to latest within constraints
    terraform init -upgrade     # Updates providers

    # Review and test
    terraform plan

    For detailed version management, see: Code Patterns: Version Management

    Modern Terraform Features (1.0+)

    Feature Availability by Version

    | Feature | Version | Use Case |
    | try() function | 0.13+ | Safe fallbacks, replaces element(concat()) |
    | nullable = false | 1.1+ | Prevent null values in variables |
    | moved blocks | 1.1+ | Refactor without destroy/recreate |
    | optional() with defaults | 1.3+ | Optional object attributes |
    | Native testing | 1.6+ | Built-in test framework |
    | Mock providers | 1.7+ | Cost-free unit testing |
    | Provider functions | 1.8+ | Provider-specific data transformation |
    | Cross-variable validation | 1.9+ | Validate relationships between variables |
    | Write-only arguments | 1.11+ | Secrets never stored in state |

    Quick Examples

    # try() - Safe fallbacks (0.13+)
    output "sg_id" {
      value = try(aws_security_group.this[0].id, "")
    }

    # optional() - Optional attributes with defaults (1.3+)
    variable "config" {
      type = object({
        name    = string
        timeout = optional(number, 300) # Default: 300
      })
    }

    # Cross-variable validation (1.9+)
    variable "environment" { type = string }
    variable "backup_days" {
      type = number
      validation {
        condition     = var.environment == "prod" ? var.backup_days >= 7 : true
        error_message = "Production requires backup_days >= 7"
      }
    }

    For complete patterns and examples, see: Code Patterns: Modern Terraform Features

    Version-Specific Guidance

    Terraform 1.0-1.5


  • Use Terratest for testing

  • No native testing framework available

  • Focus on static analysis and plan validation

    Terraform 1.6+ / OpenTofu 1.6+


  • New: Native terraform test / tofu test command

  • Consider migrating from external frameworks for simple tests

  • Keep Terratest only for complex integration tests

    Terraform 1.7+ / OpenTofu 1.7+


  • New: Mock providers for unit testing

  • Reduce cost by mocking external dependencies

  • Use real integration tests for final validation
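
    A mock-provider unit test for 1.7+ might look like the sketch below. It assumes the module under test has a bucket_name variable and a resource named aws_s3_bucket.this. Both names, the file name, and the mocked ARN are hypothetical.

```hcl
# tests/unit.tftest.hcl (hypothetical file name)

# Replace the real AWS provider with a mock: no credentials, no cost
mock_provider "aws" {
  mock_resource "aws_s3_bucket" {
    defaults = {
      arn = "arn:aws:s3:::mock-bucket" # stand-in for a computed value
    }
  }
}

run "bucket_name_is_passed_through" {
  # apply is safe here because nothing real is created
  command = apply

  variables {
    bucket_name = "example-logs"
  }

  assert {
    condition     = aws_s3_bucket.this.bucket == "example-logs"
    error_message = "Bucket name should match the bucket_name input."
  }
}
```

    Mocked runs check the module's own logic only. Provider-side behavior such as naming rules or API limits still needs a real integration test.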

    Terraform vs OpenTofu

    Both are fully supported by this skill. For licensing, governance, and feature comparison, see Quick Reference: Terraform vs OpenTofu.

    Detailed Guides

    This skill uses progressive disclosure - essential information is in this main file, detailed guides are available when needed:

    📚 Reference Files:

  • Testing Frameworks - In-depth guide to static analysis, native tests, and Terratest

  • Module Patterns - Module structure, variable/output best practices, ✅ DO vs ❌ DON'T patterns

  • CI/CD Workflows - GitHub Actions, GitLab CI templates, cost optimization, automated cleanup

  • Security & Compliance - Trivy/Checkov integration, secrets management, compliance testing

  • Quick Reference - Command cheat sheets, decision flowcharts, troubleshooting guide

    How to use: When you need detailed information on a topic, reference the appropriate guide. Claude will load it on demand to provide comprehensive guidance.

    License

    This skill is licensed under the Apache License 2.0. See the LICENSE file for full terms.

    Copyright © 2026 Anton Babenko
