- root
-- /envs
--- dev.tfvars
--- prod.tfvars
- main.tf
When it gets deployed by the CI/CD pipeline, the right tfvars file is passed in via the `-var-file` parameter. A standard `env` variable is also passed in and used as the basis for a naming convention. The backend is also set by the pipeline. The rationale here is that our environments should be almost identical, and any variations should be accomplished through parameterization.
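As a sketch, the pipeline invocation described above might look like this; the flags are standard Terraform CLI, but the env-var plumbing and state key are illustrative:

```sh
#!/bin/sh
# ENV is injected by the pipeline, e.g. "dev" or "prod".
ENV="${ENV:-dev}"

# The backend is set by the pipeline too, via -backend-config.
terraform init -backend-config="key=myapp/${ENV}/terraform.tfstate"

# Pass the right tfvars file, plus the env name used for naming conventions.
terraform apply -var-file="envs/${ENV}.tfvars" -var="env=${ENV}"
```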
Modules are kept either in separate repos, if they need to be shared between many workspaces, or under the `modules` subfolder.
- root
-- /envs
--- .dev.env
--- .test.env
--- .prod.env
--- dev.tfvars
--- test.tfvars
--- prod.tfvars
- 1_create_network.tf
- 2_create_storage.tf
- 3_create_service.tf
https://www.youtube.com/watch?v=WgPQ-nm_ers (Compliance At Scale: Hardened Terraform Modules at Morgan Stanley)

Then sure, sprinkle a few per-env toggles `count: 1 if env == dev else 0` (or whatever the latest nicest way to do that is today), but it feels weird to have to set up an artificial TF module around our entire codebase to be able to share the bulk of the code?
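For reference, the per-env toggle described above is usually written with Terraform's ternary on `count`; the `env` variable and resource here are illustrative:

```hcl
variable "env" {
  type = string
}

# Create this resource only in dev; elsewhere count is 0 and it is skipped.
resource "null_resource" "dev_only_helper" {
  count = var.env == "dev" ? 1 : 0
}
```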
All of them either don't address multiple environments or use multiple directories to handle them (and assume a static list). What?! No, use Terraform workspaces, or a single module that you instantiate for each environment. Or Terragrunt if you really want. TFA is just a blogspam mess from either AI or someone who's spent about 20 minutes with a YouTube video to learn Terraform and push out their latest deep dive content, guys.
If you read my first post related to this, I was giving myself a refresher to understand different dynamics that people think about.
I did not watch one YouTube video, spend 20 minutes on this, or create it with GPT.
The original source of inspiration came from me wanting to understand the examples our Eng team put together on how our config file correlates to what customers are actually using to find any gaps.
https://docs.resourcely.io/concepts/other-features-and-setti...
This is also a part 1 of the article and I clearly asked what was missing.
Storing state in S3 or TFC or Spacelift or somewhere else is out of scope. S3 is where 90% of the world stores their state and writing those configuration lines is not in scope. You can find other resources on that.
I struggled to find an exhaustive list of how people manage their directory structures and hence the focus of this piece.
If you’d like to provide constructive feedback and avoid comments regarding scope creep, please share.
It just struck me as a collection of ways you could split your configuration where none of them is something I would suggest. You could split your frontend/backend or services I guess sure, but then why are you assuming it's in the same repo, and why does that matter anyway, it has nothing to do with Terraform.
Why have dev & prod as different projects, assuming they're supposed to look the same? Use workspaces, or different state, or Terragrunt. Ditto regions; use provider aliases.
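The provider-alias approach mentioned above looks roughly like this; the module path and region choices are made up:

```hcl
# Default provider configuration.
provider "aws" {
  region = "us-east-1"
}

# A second configuration of the same provider, addressed by alias.
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# The same module, deployed into the second region.
module "replica" {
  source = "./modules/bucket" # hypothetical module path
  providers = {
    aws = aws.west
  }
}
```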
If you say Well I'm not assuming they are the same, this is about structuring directories on the basis that they are already isolated Terraform configurations... Ok sure, but why does how you do that matter and what does it have to do with Terraform? It's the same as talking about structuring the directories for their application code isn't it?
If the state isn't local, then if you want that split in order to apply them separately, there would be no reason to physically separate & duplicate them into different directories.
Composable modules such as `terraform-aws-lambda/modules/standard-function`, `terraform-aws-iam/modules/role-for-aws-lambda`, etc., which get composed for a specific use case in a root module (which we call stacks). The stack has directories under it such as `dev/main/primary/`, `dev/sandbox-a/primary/`, `dev/sandbox-a/test-a/`, etc., where `dev` is the environment, `main`/`sandbox-a` is the tenant, and `primary`/`test-a` is the namespace. The namespaces contain a `tfvars` file and potentially some namespace-specific assets, READMEs, documentation, etc. The CD system then deploys the root module for each namespace present.
Stacks are then optionally (sometimes deeply) nested under parent directories, which are used for change control purposes, variable inference and consistency testing.
OpenTofu >1.8.0 is required for all of this to keep it nice and tidy.
Outside of being able to use variables in very niche places that you can't in Terraform (which you can easily work around, and which last I heard is on the roadmap for OpenTofu), what does Terragrunt do that regular module imports in Terraform don't?
This may be anecdotal but every terragrunt repository I've ever seen was a mess of spaghetti trying too hard to stay DRY.
We get this question a lot because Terragrunt did in fact start as a feature shim for Terraform. But that was a long time ago and Terragrunt has evolved into a first-class "orchestration" tool for Terraform or OpenTofu.
We wrote a blog post addressing exactly your concern that you might find helpful: https://blog.gruntwork.io/terragrunt-opentofu-better-togethe....
2) I use terragrunt to provide inherited values, such as region, environment, etc. I have a directory tree of `dev/us-west-2/` I can set a variable in my `dev/environment.hcl` that is inherited across everything under that environment. This is useful if you have more than one dev, prod, etc environment.
3) I use terragrunt to allow shared, versioned root modules. I don't include any terraform in my terragrunt repo. The terragrunt repo is just configuration. In my terragrunt.hcl I can reference something like `source = "github.com/example.com/root.git//ipv6-vpc?ref=v0.1.3"` to pull in the version of the root module I want. Again this is useful if you have multiple dev/stage/prod environments.
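A minimal sketch of the layout in points 2 and 3, using Terragrunt's `read_terragrunt_config` and `find_in_parent_folders` helpers; the file names mirror the parent comment but the values are illustrative:

```hcl
# dev/environment.hcl -- values shared by everything under dev/
locals {
  environment = "dev"
}

# dev/us-west-2/vpc/terragrunt.hcl -- pure configuration, no Terraform code
locals {
  # Walk up the tree and load the nearest environment.hcl.
  env = read_terragrunt_config(find_in_parent_folders("environment.hcl"))
}

terraform {
  # Versioned, shared root module pulled from its own repo.
  source = "github.com/example.com/root.git//ipv6-vpc?ref=v0.1.3"
}

inputs = {
  environment = local.env.locals.environment
}
```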
None of this is actually possible with plain terraform.
With OpenTofu as of the latest release, you can already use constant variables in places like module sources and versions and backend configurations, and even use for_each on providers!
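As a sketch of what that static evaluation looks like in OpenTofu 1.8+ (the repo URL and version value are made up):

```hcl
variable "module_version" {
  type    = string
  default = "v0.1.3"
}

# OpenTofu (1.8+) can statically evaluate variables/locals in module
# sources, something plain Terraform does not allow.
module "network" {
  source = "github.com/example/terraform-modules//network?ref=${var.module_version}"
}
```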
Disclaimer: involved in OpenTofu
This gave me a refresher on how they are organizing their cloud infrastructure within their source control systems. I took a lens from the world of Terraform since that's mostly the world I live in today and have for the last few years.
I explored 10 different ways to structure your Terraform config roots, each promising scalability but delivering varying degrees of chaos. From single-environment simplicity to multi-cloud madness, customers are stuck navigating spaghetti directories and state file hell.
I probably missed things. Might have gotten things wrong. Take a look and let me know what you think.
What patterns are you using that I missed?
This is split over hundreds of microservice repositories, each of which maintains its own Terraform.
We don't read state from other Terraform deployments, and use published reusable modules when convenient and a tfvars file for every deployment.
At this point I can't imagine doing Terraform any other way.
It'd be nice to show the other dimension: the Git branching strategies to apply. GitHub flow/feature branches vs. per-env branches off main vs. Git flow. How and when to apply changes in different environments (before vs. after PRs), etc.
TFC uses workspaces, which annoyingly aren't the same thing as terraform workspaces. I've divided up our workspaces into dev, qa, staging, and prod, and each group of workspaces has the OIDC setup to allow management of a specific cloud account. So dev workspaces can only access the dev account, etc etc. Each grouping of workspaces also has a specific role that can access them. Each role then has its own API key.
The issues I've run into are mostly management of workspace variables. So now I have a manager repo and matching workspace that controls all the vars for the couple hundred TFC workspaces. I use a TFC group API key for the terraform enterprise provider, one provider per group. This prevents potential mistakes where dev vars could get written to qa, etc etc.
Manager repo
- dev TFE provider
- qa TFE provider
- staging TFE provider
- prod TFE provider
Workspace variables are set by a single directory of terraform, so there's good sharing of the data and locals blocks.

I use lists of workspaces categorized by "pipeline deployers" and "application resource deployers", along with lists of dev, qa, staging, and prod workspaces. I then use terraform's "setintersection" function to give me "dev pipeline" workspaces, "prod app" workspaces, etc. I also do the same with groups of variables, as there's some that are specific to pipeline workspaces, and so on. It works well, and it's nice to have an almost 100% terraform control of vars and workspaces.
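The `setintersection` trick described above might look like this; the workspace names are made up:

```hcl
locals {
  # One axis: what the workspace deploys.
  pipeline_workspaces = ["pipeline-dev", "pipeline-prod"]
  app_workspaces      = ["app-dev", "app-prod"]

  # Another axis: which environment it belongs to.
  dev_workspaces  = ["pipeline-dev", "app-dev"]
  prod_workspaces = ["pipeline-prod", "app-prod"]

  # Cross-cut the two categorizations with setintersection:
  dev_pipeline_workspaces = setintersection(local.pipeline_workspaces, local.dev_workspaces) # yields pipeline-dev
  prod_app_workspaces     = setintersection(local.app_workspaces, local.prod_workspaces)     # yields app-prod
}
```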
I split app and pipeline workspaces based on historical decisions, I'm not sure if I'd replicate that on a new project. The workflow there is that an app workspace creates the resources for a given deployment, then saves pertinent details to a couple of parameters. The pipeline workspace then pulls those parameters and uses them to create a pipeline that builds and deploys the code.
Unfortunately I can't share code from this particular setup, but I do intend to write about it "someday".
> Multi-Environment Setup with Shared Modules
But the con, "versioning is tricky across modules", understates it: it's damn near impossible to reliably manage, especially because if I'm introducing a new variable to a shared module, I also need to add that variable to the inputs of each environment.
I haven't found a way to manage multiple versions of the modules across environments if all using the same shared modules. Is it even possible?
Define a default which is backwards compatible.
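Concretely, a new input with a backwards-compatible default means existing environments don't have to change their inputs at all (variable name illustrative):

```hcl
variable "enable_access_logs" {
  type        = bool
  description = "Turn on access logging for the service."
  default     = false # callers that don't set this keep today's behavior
}
```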
I am a big fan of modularisation, it is possible to extend this approach to divide logically your infrastructure and mirror that by separating out the terraform state files too.
The number of TF deployments increases, but they each have a smaller blast radius, and you now need to manage making the outputs of builds available to those that depend on them.
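One common way to wire those outputs up is the `terraform_remote_state` data source; the bucket and key names here are made up:

```hcl
# Consumer deployment reads the outputs of the "network" deployment's state.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "example-tf-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  # Any value the network deployment exposes as an output is available here.
  subnet_id = data.terraform_remote_state.network.outputs.subnet_id
}
```

Others avoid cross-state reads entirely and publish values through a parameter store instead, as described elsewhere in this thread.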
Python deterministically generating terraform HCL files based on yaml.
Execution wrappers that encapsulate terraform in CI/CD to parse the json output and prevent database deletion, but apply everything else.
Scripts that pull every git repo and execute every terraform file they can find while walking the directory tree.
Terraform is about 80% of the way to a good tool, that last 20% is a ball-ache and solved totally differently every time; the best setups I’ve seen is where terraform just “hands off” to something else after making a minimum infrastructure.
But, otherwise, it can get incredibly messy.
That sounds terrible. I'm sorry you had to deal with that.
If they could go back in time, would modules have been good enough?
Yet as a language, it's quirky as heck. For example, how modules are basically wrappers on providers, and how different modules can almost "see inside" other modules to iron out dependency ordering, but also can't. And speaking of which, circular dependencies suck to work around in a modular way without tearing half your structure apart.
Like I said, I am not anywhere close to an expert on terraform and can only describe my limited experience building a fairly simple stack on top of it. The whole thing is just… both amazing and also weird and a bit frustrating. And I have yet to “grow” into multiple environments… lots of my complaints are probably down to my limited experience with it and, honestly, not much out there in terms of best practices for maintaining scalable configuration (or maybe my ADD brain refuses to dive into that, who knows?)
My last adventure into infrastructure as code was with Puppet and Salt. All of that was provisioning on top of bare metal. It was all file operations and the “provider specific modules” were really just wrappers to nicely encapsulate things like nginx or apt. Perhaps it is because of Puppet or Salt’s much more limited scope that didn’t have me feeling the same way.
I mean terraform can be used to configure just about anything that has an API if you wanted. Maintaining a declarative language around that is bound to have its quirks.
For environment-specific things use conditionals:

`nodes = terraform.workspace == "prod" ? 2 : 1`
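When the per-environment values multiply, a lookup map keeps the conditionals from sprawling; this is a sketch, not the parent's code:

```hcl
locals {
  nodes_by_workspace = {
    prod    = 2
    staging = 2
    default = 1
  }

  # Fall back to "default" for any workspace not listed above.
  nodes = lookup(local.nodes_by_workspace, terraform.workspace, local.nodes_by_workspace["default"])
}
```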