Written 26 July 2023 · Last edited 28 March 2026 ~ 10 min read
Setting up Terraform for AWS: what I got wrong, and how I fixed it
Benjamin Clark
Part 1 of 3 in Infrastructure as Code (IaC)
This post was originally written in July 2023. The setup it documented worked, but it was naive. I’ve substantially rewritten it to cover what I got wrong and what I built to replace it. The original code is preserved in the first section for context.
I’ve spent years setting up Terraform for other organisations’ AWS accounts. Proper multi-account setups with development, staging, and production separated at the account level. OIDC authentication so no long-lived credential ever touches a GitHub secret. Service control policies that enforce region restrictions and make certain mistakes structurally impossible. The kind of infrastructure that doesn’t fall apart when something goes wrong.
When I set this up for Sudoblark’s own account in 2023, I did none of that.
It worked. The infrastructure ran. But having since built the proper version, looking back at it is embarrassing — not because it was broken, but because I already knew better and didn’t apply it to my own account.
What I built in 2023
The goal was reasonable: a single repository managing Sudoblark’s AWS via Terraform, with GitHub Actions running plan on pull requests and apply on merge to main. Remote state in S3, basic CI/CD, done.
The structure was flat:
terraform.aws/
├── sudoblark/
│   ├── main.tf
│   ├── s3.tf
│   └── .terraform-version
└── .github/
    └── workflows/
        ├── commit-to-pr.yaml
        └── merge-to-main.yaml
main.tf pointed at an S3 bucket for state and used an AWS CLI profile for local development:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }

  backend "s3" {
    bucket  = "terraform-sudoblark"
    key     = "environments/production/tfstate"
    encrypt = true
    region  = "eu-west-2"
  }
}

provider "aws" {
  region  = "eu-west-2"
  profile = "sudoblark"
}
The CI/CD workflows passed long-lived IAM credentials as environment variables:
env:
  AWS_ACCESS_KEY_ID: ${{ secrets.SUDOBLARK_AWS_ACCESS_KEY_ID }}
  AWS_SECRET_ACCESS_KEY: ${{ secrets.SUDOBLARK_AWS_ACCESS_KEY_VALUE }}
  AWS_DEFAULT_REGION: eu-west-2
The PR workflow ran terraform validate, tflint, and a plan. The main branch workflow applied. An S3 bucket got created. It worked.
What was wrong with it
Long-lived credentials
Storing IAM access keys as GitHub secrets is a common starting point. It shouldn’t be an endpoint. OIDC (OpenID Connect) authentication has been available in GitHub Actions since 2021 and takes an afternoon to set up — less time than you’d spend rotating a compromised key.
With OIDC, GitHub generates a short-lived token per workflow run. AWS verifies it against the registered OIDC provider and issues temporary STS credentials. There’s nothing to rotate, nothing to leak, and the credentials scope down to a specific role that can be constrained to specific repositories and branches via the OIDC sub claim.
The setup I had meant a static IAM access key existed somewhere — in a GitHub secret, on a developer machine, in a ~/.aws/credentials file. If it was ever exposed, the blast radius was the entire account. For my own account, used only by me, that felt like an acceptable trade-off. It wasn’t, really. OIDC is not hard to set up, and I’d already done it for clients.
Everything in one account
This is the part that bothers me most. One AWS account, one environment, one blast radius.
Separating environments into distinct AWS accounts isn’t something that only makes sense at scale — it’s the baseline. When development, staging, and production run in separate accounts:
- A mistake in development can’t touch production data or resources
- SCPs (service control policies) can apply different governance controls to different environments
- IAM boundaries between accounts are enforced by AWS, not by policy conventions you hope everyone follows
- Access can be granted at the account level rather than via increasingly complex IAM conditions
The excuse I gave myself was that Sudoblark is a one-person operation. But I’d set this up for solo consultants with smaller workloads. The barrier is never scale — it’s just the upfront effort of doing it properly.
No real tests
I used terraform validate and tflint. terraform validate confirms the configuration is syntactically valid and internally consistent. tflint catches common linting issues. Neither of them tests whether the configuration actually behaves the way you think it does.
To be fair, Terraform’s native test framework (terraform test) didn’t ship until v1.6 in October 2023 — a few months after this post was originally written. So that particular gap had an excuse at the time. What doesn’t have an excuse is that I still hadn’t added tests by the time I rebuilt the rest of this setup. terraform test lets you write .tftest.hcl files that exercise your modules with real or mocked providers. It’s been available for over two years. That’s the sore spot.
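As a rough illustration of what such a test can look like, here is a minimal sketch. It assumes a module that declares a bucket_name variable and an aws_s3_bucket.this resource; neither comes from the repositories described in this post. It also relies on mock_provider, which needs Terraform 1.7 or later.

```hcl
# tests/bucket.tftest.hcl (hypothetical path)
#
# Assumes the module under test contains something like:
#   variable "bucket_name" { type = string }
#   resource "aws_s3_bucket" "this" { bucket = var.bucket_name }

# Replace the real AWS provider with a mock so no credentials are needed.
mock_provider "aws" {}

variables {
  bucket_name = "example-test-bucket"
}

run "bucket_name_is_passed_through" {
  # Plan only: nothing is created, even against a real provider.
  command = plan

  assert {
    condition     = aws_s3_bucket.this.bucket == var.bucket_name
    error_message = "The bucket name should match the bucket_name variable."
  }
}
```

Running terraform test from the module directory picks up files ending in .tftest.hcl, either next to the module or in a tests directory.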
What I built instead
The current setup is split across three repositories:
- sudoblark.terraform.aws.identity-management: the AWS Organisation, account structure, OUs, SCPs, IAM Identity Centre (AWS SSO), and OIDC bootstrap
- sudoblark.terrform.github: GitHub repository management via Terraform
- sudoblark.terraform.github.organisation: organisation-wide GitHub settings and enforcement rulesets
Each has a single bounded concern. A mistake in the GitHub repository manager can’t affect the AWS account structure, and vice versa.
OIDC instead of long-lived credentials
Every workflow that needs AWS access now uses the configure-aws-credentials action with OIDC. The key difference is the id-token: write permission, which lets the workflow request an OIDC token from GitHub:
permissions:
  contents: read
  pull-requests: write
  id-token: write

steps:
  - name: Configure AWS credentials
    uses: aws-actions/configure-aws-credentials@v4
    with:
      role-to-assume: arn:aws:iam::<account-id>:role/service-accounts/aws-sudoblark-management-github-cicd-role
      aws-region: eu-west-2
That token is sent to AWS, verified against the OIDC provider, and exchanged for temporary credentials scoped to the specified role. The role’s trust policy constrains which GitHub repositories and branches can assume it. No secrets in the repository. No rotation. Each run starts fresh.
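The trust policy side of that exchange can be sketched roughly as follows. This is an illustration, not the policy from the repository: the role name and the example-org/example-repo repository are placeholders, and it assumes the GitHub OIDC provider already exists in the account.

```hcl
# Look up the GitHub OIDC provider that was registered during bootstrap.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_oidc_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # The token must have been issued for AWS STS.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only runs from this repository and branch may assume the role.
    # "example-org/example-repo" is a placeholder.
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-repo:ref:refs/heads/main"]
    }
  }
}

resource "aws_iam_role" "github_cicd" {
  name               = "example-github-cicd-role" # placeholder name
  assume_role_policy = data.aws_iam_policy_document.github_oidc_trust.json
}
```

Workflows triggered by pull_request events present a sub of the form repo:OWNER/REPO:pull_request, so a role used for plans on pull requests would need that value in the list as well.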
The Terraform provider configuration reflects the same pattern — both the backend and the provider assume a role rather than using ambient credentials:
terraform {
  required_version = "~> 1.14"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }

  backend "s3" {
    bucket  = "aws-sudoblark-management-terraform-state"
    key     = "aws/aws-sudoblark-management/identity-management/terraform.tfstate"
    encrypt = true
    region  = "eu-west-2"

    assume_role = {
      role_arn     = "arn:aws:iam::<account-id>:role/aws-sudoblark-management-terraform-github-cicd-role"
      session_name = "sudoblark.terraform.aws.identity-management"
      external_id  = "CI_CD_PLATFORM"
    }
  }
}

provider "aws" {
  region = "eu-west-2"

  assume_role {
    role_arn     = "arn:aws:iam::<account-id>:role/aws-sudoblark-management-terraform-github-cicd-role"
    session_name = "sudoblark.terraform.aws.identity-management"
    external_id  = "CI_CD_PLATFORM"
  }
}
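The external_id (CI_CD_PLATFORM) is an extra verification step. The role’s trust policy requires it through an sts:ExternalId condition, so an AssumeRole call that omits the value or supplies a different one is rejected, even from a caller that knows the ARN. The value is not a secret, so it complements the principal restrictions in the trust policy rather than replacing them. It’s a small addition that narrows who can assume the role.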
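A trust policy that enforces the external ID might look something like the sketch below. The cicd_principal_arn variable is hypothetical and stands in for whichever principal the CI/CD session runs as; the real policy in the repository may be shaped differently.

```hcl
variable "cicd_principal_arn" {
  description = "Hypothetical: ARN of the principal allowed to assume the Terraform role."
  type        = string
}

data "aws_iam_policy_document" "terraform_role_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "AWS"
      identifiers = [var.cicd_principal_arn]
    }

    # Reject AssumeRole calls that omit the external ID or send another value.
    condition {
      test     = "StringEquals"
      variable = "sts:ExternalId"
      values   = ["CI_CD_PLATFORM"]
    }
  }
}
```

The document produced by this data source would be passed to the role’s assume_role_policy argument.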
A proper account structure
The identity management repository provisions a four-account AWS Organisation:
locals {
  accounts = [
    {
      name  = "aws-<org>-management"
      email = "<root-email>"
    },
    {
      name  = "aws-<org>-development"
      email = "<development-email>"
      ou    = "development"
    },
    {
      name  = "aws-<org>-staging"
      email = "<staging-email>"
      ou    = "staging"
    },
    {
      name  = "aws-<org>-production"
      email = "<production-email>"
      ou    = "production"
    }
  ]
}
The management account is the Organisation root. It holds the Terraform state, the OIDC provider, and the CI/CD role. It doesn’t run workloads. Development, staging, and production each live in their own OU, which means SCPs can be targeted per-environment.
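One way a list like this could be turned into OUs and accounts is sketched below. It is an assumption about the wiring, not the repository’s actual code: the shortened accounts list, the example.com addresses, and the resource names are all placeholders.

```hcl
data "aws_organizations_organization" "this" {}

locals {
  # Stand-in for the accounts list above, shortened to two placeholder entries.
  accounts = [
    { name = "aws-example-management", email = "root@example.com" },
    { name = "aws-example-development", email = "dev@example.com", ou = "development" },
  ]

  # Only member accounts declare an OU; the management account does not.
  member_accounts = {
    for account in local.accounts : account.name => account
    if try(account.ou, null) != null
  }

  ou_names = toset([
    for account in local.accounts : account.ou
    if try(account.ou, null) != null
  ])
}

# One OU per environment, directly under the organisation root.
resource "aws_organizations_organizational_unit" "environment" {
  for_each = local.ou_names

  name      = each.value
  parent_id = data.aws_organizations_organization.this.roots[0].id
}

# One member account per entry, placed in its environment's OU.
resource "aws_organizations_account" "member" {
  for_each = local.member_accounts

  name      = each.value.name
  email     = each.value.email
  parent_id = aws_organizations_organizational_unit.environment[each.value.ou].id
}
```

Keying both resources by name means adding an environment is a new list entry rather than a new resource block.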
Setting up a Terraform-managed AWS Organisation has a chicken-and-egg problem: you need the management account and OIDC provider to exist before Terraform can manage them. The first time around, you create the S3 state bucket, the OIDC provider, and the initial CI/CD role manually. After that, Terraform takes over. It’s a one-time manual step, not a recurring one.
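The post does not say how the hand-made resources are brought under management afterwards. One option, assuming Terraform 1.5 or later, is import blocks; the sketch below uses 123456789012 as a placeholder account ID and the state bucket name from the backend configuration shown earlier.

```hcl
# Adopt the manually created OIDC provider into state on the next apply.
import {
  to = aws_iam_openid_connect_provider.github
  id = "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" # placeholder account ID
}

resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
}

# Adopt the manually created state bucket as well.
import {
  to = aws_s3_bucket.state
  id = "aws-sudoblark-management-terraform-state"
}

resource "aws_s3_bucket" "state" {
  bucket = "aws-sudoblark-management-terraform-state"
}
```

The resource blocks need to match what was created by hand, otherwise the first plan after import will propose changes.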
Service control policies
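SCPs are guardrails that cap what every principal in a member account can do, root user included, regardless of what IAM policies are attached. They grant nothing themselves, and they do not apply to the management account. They’re what makes governance actually enforceable rather than advisory.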
The current setup has three:
locals {
  security_control_policies = [
    {
      name        = "DenyLeaveOrganization"
      description = "Prevent accounts from leaving the org"
      statements = [
        {
          Sid      = "DenyLeaveOrg"
          Effect   = "Deny"
          Action   = ["organizations:LeaveOrganization"]
          Resource = "*"
        }
      ]
    },
    {
      name        = "AllowOnlyApprovedRegions"
      description = "Deny actions outside approved regions"
      statements = [
        {
          Sid    = "DenyUnapprovedRegions"
          Effect = "Deny"
          NotAction = [
            "iam:*", "organizations:*", "route53:*", "support:*",
            "cloudfront:*", "acm:*", "logs:*", "kms:*",
            "route53domains:*", "s3:GetBucketLocation",
            "s3:ListAllMyBuckets", "aws-marketplace:*", "bedrock:*"
          ]
          Resource = "*"
          Condition = {
            StringNotEquals = {
              "aws:RequestedRegion" = ["eu-west-2"]
            }
          }
        }
      ]
    },
    {
      name        = "ProtectCloudTrailAndConfig"
      description = "Deny stopping or deleting CloudTrail"
      statements = [
        {
          Sid    = "DenyCloudTrailTamper"
          Effect = "Deny"
          Action = [
            "cloudtrail:DeleteTrail",
            "cloudtrail:StopLogging",
            "cloudtrail:UpdateTrail"
          ]
          Resource = "*"
        }
      ]
      targets = ["production"]
    }
  ]
}
The region restriction is the most immediately useful. Any call to a non-exempt service outside eu-west-2 is denied outright, whatever the IAM policies in the account allow. The exemption list is the fiddly part: IAM, Route53, CloudFront, and a handful of others have global control planes that need explicit exemptions, otherwise things start failing in confusing ways that take time to diagnose.
ProtectCloudTrailAndConfig only applies to the production OU. In development, you sometimes need to clean up stale CloudTrail trails. In production, you don’t want that to be possible at all.
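A list of policies like this still has to be rendered into Organizations resources. A minimal sketch of that step follows, assuming a single hypothetical target_id variable for the attachment. How the targets key is resolved from OU names to OU IDs is not shown in this post, so it is left out here.

```hcl
locals {
  # Stand-in for the list above, shortened to one policy.
  security_control_policies = [
    {
      name        = "DenyLeaveOrganization"
      description = "Prevent accounts from leaving the org"
      statements = [
        {
          Sid      = "DenyLeaveOrg"
          Effect   = "Deny"
          Action   = ["organizations:LeaveOrganization"]
          Resource = "*"
        }
      ]
    }
  ]
}

variable "target_id" {
  description = "Hypothetical: ID of the root, OU, or account to attach the policies to."
  type        = string
}

# Render each entry into an SCP; the statements become the policy document.
resource "aws_organizations_policy" "scp" {
  for_each = { for policy in local.security_control_policies : policy.name => policy }

  name        = each.value.name
  description = each.value.description
  type        = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version   = "2012-10-17"
    Statement = each.value.statements
  })
}

# Attach every rendered policy to the chosen target.
resource "aws_organizations_policy_attachment" "scp" {
  for_each = aws_organizations_policy.scp

  policy_id = each.value.id
  target_id = var.target_id
}
```

Because the statements already use IAM’s own key names (Sid, Effect, Action), jsonencode can emit them unchanged.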
IAM Identity Centre
The management account runs IAM Identity Centre (AWS SSO). Instead of IAM users with passwords in each account, there’s a single portal. I authenticate once, pick an account and permission set, and get short-lived temporary credentials. Groups own the permission assignments: users are members of groups, and groups are assigned to accounts with specific permission sets. A new account added to the Organisation picks up the existing group assignments without any manual IAM user creation.
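The group-to-account wiring can be sketched as below. The group name, permission set name, session duration, and the account_ids variable are illustrative assumptions, not values from the repository.

```hcl
data "aws_ssoadmin_instances" "this" {}

variable "account_ids" {
  description = "Hypothetical: IDs of the accounts the group should have access to."
  type        = set(string)
}

locals {
  instance_arn      = tolist(data.aws_ssoadmin_instances.this.arns)[0]
  identity_store_id = tolist(data.aws_ssoadmin_instances.this.identity_store_ids)[0]
}

resource "aws_identitystore_group" "admins" {
  identity_store_id = local.identity_store_id
  display_name      = "example-administrators" # placeholder
}

resource "aws_ssoadmin_permission_set" "admin" {
  name             = "ExampleAdministratorAccess" # placeholder
  instance_arn     = local.instance_arn
  session_duration = "PT1H"
}

resource "aws_ssoadmin_managed_policy_attachment" "admin" {
  instance_arn       = local.instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.admin.arn
  managed_policy_arn = "arn:aws:iam::aws:policy/AdministratorAccess"
}

# Assign the group, not individual users, to every listed account.
resource "aws_ssoadmin_account_assignment" "admins" {
  for_each = var.account_ids

  instance_arn       = local.instance_arn
  permission_set_arn = aws_ssoadmin_permission_set.admin.arn
  principal_id       = aws_identitystore_group.admins.group_id
  principal_type     = "GROUP"
  target_id          = each.value
  target_type        = "AWS_ACCOUNT"
}
```

With this shape, giving the group access to a new account means adding one ID to account_ids.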
The GitHub side
The other two repositories manage GitHub itself via Terraform. sudoblark.terrform.github creates and configures every Sudoblark repository — branch protection, CODEOWNERS, visibility — with each domain area defined in its own configuration file. sudoblark.terraform.github.organisation handles the organisation settings and six enforcement rulesets that apply across all repositories: signed commits on main and staging, conventional commit messages, semantic versioning tags, PR approval and code owner review required.
These are organisation-level rulesets, not per-repository branch protections. They cannot be disabled at the repository level by anyone, including me.
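For a sense of what one of these looks like in Terraform, here is a hedged sketch of a signed-commits ruleset using the github_organization_ruleset resource from the integrations/github provider. The ruleset name is a placeholder, and the real rulesets in the repository are likely to carry more rules than this.

```hcl
terraform {
  required_providers {
    github = {
      source = "integrations/github"
    }
  }
}

resource "github_organization_ruleset" "signed_commits" {
  name        = "example-signed-commits" # placeholder
  target      = "branch"
  enforcement = "active"

  conditions {
    # Apply to main and staging only.
    ref_name {
      include = ["refs/heads/main", "refs/heads/staging"]
      exclude = []
    }

    # Apply to every repository in the organisation.
    repository_name {
      include = ["~ALL"]
      exclude = []
    }
  }

  rules {
    required_signatures = true
  }
}
```

Leaving out bypass_actors means no role or team is exempted from the rule.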
The CI/CD pipeline now
The PR workflow for each of these repositories follows the same pattern: format check, validate (with OIDC credentials), Checkov security scan, terraform test, then plan. The plan result is posted back to the pull request. On merge to main, the apply runs.
What I’d tell someone starting today
Start with identity management, not with whatever you actually want to build.
Get the AWS Organisation structure right first — management account, and at minimum separate development and production OUs. Bootstrap OIDC before you write a single line of workload infrastructure. The bootstrap step (creating the OIDC provider and initial CI/CD role manually, then handing off to Terraform) takes an afternoon. It is not difficult. It is much less difficult than unpicking long-lived credentials from a running system later.
Don’t put everything in one repository. Separate account organisation from workload infrastructure. Separate GitHub management from AWS management. Each repository should have a clear, bounded scope, and a mistake in one should not be able to reach the others.
Write Terraform tests from the start. I still have gaps here — sudoblark.terrform.github doesn’t have full test coverage — but terraform test is native to Terraform now. There’s no excuse for leaving it out of a new project.
The 2023 version worked. But there’s a significant gap between working and well-designed, and that gap compounds over time.
Further reading
- Terraform OIDC with GitHub Actions
- AWS Organizations SCPs
- Terraform Test framework
- IAM Identity Centre
- dflook/terraform-github-actions