The biggest difference I can see between CDK and Pulumi (other than CDK being AWS-only) is that CDK is more opinionated. When you spin up a new database, it'll automatically create a secret in Secrets Manager, set up rotation, etc. And since it understands IAM, it generates granular policies for you with calls like `dbInstance.grantRead(lambdaInstance)`, instead of you having to manually construct a JSON policy.
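As a sketch of what that looks like (aws-cdk-lib v2; the construct IDs and the inline handler are placeholders, and in the real API the grant call lives on the generated secret rather than the instance itself):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as rds from 'aws-cdk-lib/aws-rds';
import * as lambda from 'aws-cdk-lib/aws-lambda';

class DbStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    const vpc = new ec2.Vpc(this, 'Vpc');
    // Declaring the instance also creates a Secrets Manager secret
    // holding its credentials, with rotation available out of the box.
    const db = new rds.DatabaseInstance(this, 'Db', {
      engine: rds.DatabaseInstanceEngine.POSTGRES,
      vpc,
    });
    const fn = new lambda.Function(this, 'Fn', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromInline('exports.handler = async () => {};'),
    });
    // One call generates the granular IAM policy to read that secret.
    db.secret?.grantRead(fn);
  }
}
```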
I really think the pulumi / CDK method of "Use a real programming language to generate a declarative spec" is the right way to go.
For those keeping score:
- chef/puppet: imperative language, imperative effects
- ansible: declarative language, imperative effects
- terraform: declarative language, declarative effects
- CDK/Pulumi: imperative language, declarative effects
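That last row is the whole pitch: an ordinary imperative program whose only output is declarative data. A toy sketch of the idea (nothing below is real CDK/Pulumi API, just the shape of it):

```typescript
// Toy illustration: imperative code producing a declarative spec.
type Resource = { type: string; name: string; props: Record<string, unknown> };

function buildSpec(env: 'dev' | 'prod'): Resource[] {
  const resources: Resource[] = [];
  const replicas = env === 'prod' ? 3 : 1;      // ordinary control flow...
  for (let i = 0; i < replicas; i++) {
    resources.push({ type: 'vm', name: `web-${i}`, props: { size: 'small' } });
  }
  return resources;                             // ...but the output is pure data
}

// The "declarative effects" half: a separate engine diffs this spec
// against reality and applies only the delta. That's CloudFormation's job.
console.log(JSON.stringify(buildSpec('prod'), null, 2));
```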
Not to mention, CloudFormation actually allows ~transactions, which is something you can't really get without cooperation from the cloud provider.
Edit: I incorrectly mentioned that terraform uses cloudformation to get transactions, but it does not
Terraform doesn't use CloudFormation on AWS (and I thought Pulumi used Terraform under the covers in some capacity?). I've also seen a lot of CloudFormation stacks get into completely unrecoverable states because AWS was trying to roll back a transaction, but the rollback failed. If you have a premium support contract, someone can un-stick it for you, but for the rest of us we just had to create a new stack. I've been off AWS for a year and change, so maybe this has improved?
In any case, I've only dabbled with CDK, but I was disappointed. What I really want is a better Troposphere[0] -- sort of an AST library for CloudFormation, ideally type-safe. I don't care that the backend is CloudFormation in particular; the idea is that we should have a clean separation between the backend diff engine and the abstraction layer that humans use to DRY up the input to that diff engine.
Yes, you can recover a CloudFormation stack from any status nowadays. Ex: https://aws.amazon.com/premiumsupport/knowledge-center/cloud...
Level 1 is basically full parity with CloudFormation.
Level 2 has reasonable defaults, typically.
Level 3 has opinionated implementations.
At level 3, you can have a reasonable multi-AZ, multi-subnet ECS cluster in 5 lines, which is a good starting point to work from in many cases.
For those doing really nitty-gritty stuff, you can still work at level 1, and write IaC in a familiar language better suited to programming logic than CF (or even Ansible or Terraform) is.
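For a sense of what that 5-line level-3 experience looks like, here's a sketch using aws-cdk-lib v2 (the sample container image is a placeholder; the single construct stands up a VPC across AZs, a cluster, a load balancer, and a Fargate service with defaults):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';

class ClusterStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    // One opinionated L3 construct: VPC, subnets, cluster, ALB, service.
    new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'Svc', {
      taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry('amazon/amazon-ecs-sample'),
      },
    });
  }
}
```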
We've used raw CF for hundreds of thousands of deploys (VM stacks), AWS SAM for thousands of our serverless infra deploys, and we've also used Ansible and Terraform for some of our overhead/management layers (managing IAM, mostly). We're really excited to move our raw CF templates to the CDK next year; our "conditions" and parameter blocks are getting annoyingly dense and difficult to follow.
Hopefully this will get better soonish on the Pulumi side. [awsx](https://github.com/pulumi/pulumi-awsx) has existed for a while, which sort of takes the CFN higher-level construct approach, but it's currently TypeScript-only.
They just finished some foundational work to enable multi-language components, and I expect we'll see some opinionated/higher level components from them for all languages in the next 6 months or so.
Terraform and Pulumi use CloudFormation?
Again, this is incredibly powerful.
For those that don't know, CloudFormation is now often thought of as the assembly language of cloud development, with CDK the higher-level language.
I work at Amazon, and here we have a growing library of internal CDK constructs that make creating internally facing infrastructure, that works with other infra, incredibly easy. Even the databases that other teams have, their queues, etc, can be vended as common infrastructure packages, and then consumed, attaching your own AWS resources to theirs via library imports.
I'm also concerned about it blurring the lines between the infrastructure lifecycle and the software lifecycle. It runs heavily afoul of the common monorepo criticism that, just because you can commit the infra update with the software, does not mean they magically deploy together.
Yeah, funny you mention this, as the idea of vending infra resources as a common package runs into the issue of how to know whether every consumer will be able to successfully ingest updates. From there you have the issue of "blast radius", how to handle rollbacks, etc. The complexity goes up, and now you have a problem. I think the answer is very sophisticated build systems and CI/CD pipelines, which Amazon has. Up front, the team pushing changes can know if the deployment will work before it gets to production. But this requires a huge amount of tooling, which I don't think most companies have. It also requires a lot of cloud expertise.
"I guess the main question I have is what makes this different from any other abstraction?"
For one, it's all unit testable, with type completion (CDK is generally done in TypeScript). The code has to compile, you can diff the new infra against the current infra and see what's changing, and you have version control. You can reuse design patterns from other teams, much like you would for OOP code, except now it's infrastructure. A queue is just a variable in a CDK codebase. It demotes infrastructure from a complex thing to manage into something like a code variable.
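For example, with CDK's assertions module you can unit test the synthesized CloudFormation without touching an AWS account. A sketch (the queue and its timeout are made-up examples):

```typescript
import { App, Duration, Stack } from 'aws-cdk-lib';
import { Template } from 'aws-cdk-lib/assertions';
import * as sqs from 'aws-cdk-lib/aws-sqs';

// Hypothetical stack under test: a single queue.
const app = new App();
const stack = new Stack(app, 'Test');
new sqs.Queue(stack, 'Jobs', { visibilityTimeout: Duration.seconds(120) });

// Assert on the synthesized CloudFormation template, entirely offline.
const template = Template.fromStack(stack);
template.hasResourceProperties('AWS::SQS::Queue', {
  VisibilityTimeout: 120,
});
```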
Terraform was great, once, but as general cloud complexity grew beyond a handful of EC2 machines and some networking rules, it became a real burden to manage. Now all our devs are struggling to manage Kinesis or ECS/Fargate stuff with Terraform. HCL is so close yet so far from an actual language that it's infuriating to use as a developer.
I think I'll bite the bullet and go all in on this, since the rest of our codebase is all TS anyway. Pulumi would be the other option, but at this point we're so sucked into AWS that its one advantage over this (not being locked to AWS) is moot for us.
But CDK reads so beautifully and gets rid of so much noise in these templates that I don't care about. Unfortunately, the devops people and those who hold the keys to cloud resources at my current company (and many other companies) are so all in on Terraform that most won't even consider CDK/Pulumi as an option, despite the CDK/Pulumi paradigm being objectively better than the CloudFormation/Terraform paradigm.
The different libraries for the different services act extremely differently, and there are frequently breaking changes (we did a minor point release, and now everything is broken).
The tooling doesn't support SSO, even though amazon has been pushing people in that direction for years.
I WANT to like it because you can see it is the way things should be, but until Amazon gets their documentation and tooling in a working state, it isn't nearly the beautiful thing it looks like it should be.
It is a great idea, but really let down by a shitty implementation.
Seriously: when you define a thing, is it going to create it for you?
Is it going to fall over because you ran the same script yesterday, so the thing already exists?
Or is it just going to reference a thing which is already there?
It could be any of these, and how it acts is WILDLY different for every service, and also undocumented.
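For what it's worth, CDK's nominal convention is that constructors declare resources for CloudFormation to create, while static `from*` methods only reference existing ones (a sketch with aws-cdk-lib; bucket names are placeholders), though what happens on a name collision does still vary by service underneath:

```typescript
import * as cdk from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';

class Example extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
    // `new` DECLARES a resource: CloudFormation will create it, and the
    // deploy fails if a bucket with that explicit name already exists.
    new s3.Bucket(this, 'Owned', { bucketName: 'my-owned-bucket' });
    // `from*` only REFERENCES an existing resource; nothing is created
    // or managed, so deleting this line deletes nothing in AWS.
    s3.Bucket.fromBucketName(this, 'Shared', 'some-existing-bucket');
  }
}
```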
CDK could be good. But it REALLY isn't yet. There is a reason your Devops people are not falling over themselves to use it.
That's...unlikely.
> It could be any of these, and how it acts is WILDLY different for every service, and also undocumented.
CDK is an abstraction over CloudFormation, which is an inconsistent (in how it deals with similar things at the next level down) abstraction over the individual APIs of AWS services, which themselves are not particularly consistent to start with.
> There is a reason your Devops people are not falling over themselves to use it.
Well, lots are, because it's less tedious than raw CloudFormation or thin improvements over it like the Serverless transform, and AFAICT most of the inconsistency is from the underlying CF behavior, so isn't avoided by cutting out the additional layer.
My one gripe with it: even though you can synthesize raw CloudFormation templates from a CDK project using `cdk synth`, you can't upload the artifacts without running `cdk deploy`, so you can't actually use the synthesized templates to deploy, because the artifacts aren't there.
This is in contrast to SAM which does exactly this with the `sam package` command. Generates raw CloudFormation and uploads all assets to S3 in the right place.
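The contrast on the command line (bucket and file names here are placeholders):

```shell
# CDK: synth produces the template, but asset upload is tied to `cdk deploy`.
cdk synth > template.yaml

# SAM splits the steps: `package` uploads assets to S3 and rewrites the
# template to point at them, so the packaged template deploys on its own.
sam package \
  --template-file template.yaml \
  --s3-bucket my-artifact-bucket \
  --output-template-file packaged.yaml
```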
This is one of the main problems with most of the CDK-style SDKs for clouds in general: you're essentially re-implementing Terraform or SaltStack or Ansible with your own code, which doesn't have the same portability in technical or human terms.
That knowledge about the in-house system is useless elsewhere, and anyone coming in from the outside can't use any pre-existing knowledge. This is of course only a problem in larger scopes, say a larger company with an internal team that does the Ops-leaning side of DevOps.
A company that is larger might simply delegate an entire set of accounts and infrastructure to individual teams where they have to sort everything out themselves, and a company that is smaller is essentially the same as a small division in a large company.
And then you still have the problem of the glue between your AWS cloud, Google Cloud, Cloudflare, and whatever Git provider you use. No CDK covers that the way something like Terraform does, with delegation to providers and a standard data format for transporting information between them. If you want to create a repo in GitHub, preset some configuration and contents, add that repo to a CD solution that you run on Kubernetes on EKS in AWS with delegated accounts per EKS workload, and then connect Cloudflare to ingress ALBs, that's at least 4 different APIs you're talking to with incompatible interfaces. Most of them have CDKs, so your interface becomes your own implementation that you now have to maintain. Delegating that to a specialised tool works much better.
When I explored both, I found no way to "translate" an AWS VM into an Azure VM; you have to use completely different modules and inputs.
Same for just about every module I could find. I see almost no benefit to using a "platform agnostic" tool when you're working in one cloud, if that platform-agnostic tool uses platform-specific modules.
We tried terraform for making VPCs and Subnets for a simple 4-VM setup, and every module was AWS specific.
I tried it in Ansible and had the same issue.
Ultimately we went with CloudFormation because, while it's not perfect either, it didn't break from minor-level module revisions on the community-supported packages.
At the same time, CloudFormation does nothing for Azure, so I don't understand where your comparison is coming from.
CloudFormation can only do AWS. Period. And unless you somehow manage to pin yourself to AWS in a greenfield scenario that means it will not be sufficient.
At the same time, CloudFormation is (in my opinion) a nasty way to describe desired state, and sharing anything (i.e. results) with anything else (i.e. non-AWS systems) is not possible, meaning you have to write something for that yourself.
If you are creating a single AWS account with a VPC, some subnets and a few EC2 instances, you're essentially just doing fake-cloud and you might as well do that in the console with no IaC whatsoever. That is not meant as some weird derogatory statement, but I'm not talking about managing a handful of resources in a single account here. I'm also not talking about a magic multicloud translator, but about IaC orchestration across multiple providers (terraform terminology applied).
If we take AWS out of the equation and look at something else, i.e. Cisco ACI, VMware NSX, Palo Alto, F5, those all have Terraform providers as well. No CloudFormation tho. So what if you're setting up a BGP peering over your DirectConnect, you gonna do the CloudFormation thing on the AWS side and manual configuration on the Palo Alto side? That would require at least 3 different CloudFormation stacks that you manually have to run in the right order.
The value of being multi-platform is that it lets you manage resources in multiple clouds with the same tool and coordinate the changes. For example, it would let you have application instances and databases in GCP while managing your DNS in AWS. When you make changes to the instances in GCP, the tool knows to update the DNS records in AWS.
It's also nice if you want to be able to manage resources in multiple AWS regions or accounts from a single stack.
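The GCP-instances-plus-AWS-DNS case can be sketched in Pulumi (using @pulumi/gcp and @pulumi/aws; the zone ID, domain, and image are all placeholders):

```typescript
import * as gcp from '@pulumi/gcp';
import * as aws from '@pulumi/aws';

// Application instance in GCP.
const vm = new gcp.compute.Instance('app', {
  machineType: 'e2-small',
  zone: 'us-central1-a',
  bootDisk: { initializeParams: { image: 'debian-cloud/debian-11' } },
  networkInterfaces: [{ network: 'default', accessConfigs: [{}] }],
});

// The VM's public IP flows straight into an AWS Route 53 record; when
// the instance is replaced, the same tool updates the record in AWS.
new aws.route53.Record('app-dns', {
  zoneId: 'Z123EXAMPLE',          // placeholder hosted zone ID
  name: 'app.example.com',
  type: 'A',
  ttl: 300,
  records: [vm.networkInterfaces.apply(nis => nis[0].accessConfigs![0].natIp)],
});
```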
CDK is bad at the core of its job: managing the state of resources in AWS. The two big ways this manifests are that you can't completely import existing resources into CDK, and CDK is bad at detecting changes made out-of-band.
CDK has a way to "import" resources, but unlike terraform, it's not something you do once then forget about. When you "import" a resource in CDK you are doing that as an alternative way to define the resource, so the code will always need to 'remember' which resources to create and which ones to import. Imported resources can be modified, but there are restrictions on what modifications can be made so they aren't really interchangeable with native cdk built resources. Our CDK stacks are littered with "if STAGE" blocks that will never go away until we recreate those resources with CDK, and there is no clean way to do that.
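Concretely, the kind of branch that never goes away looks something like this (a sketch; the stage flag and bucket names are made up):

```typescript
import * as cdk from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';

class DataStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string, stage: string) {
    super(scope, id);
    // The pre-CDK prod bucket can only be referenced, never adopted as a
    // CDK-managed resource, so this conditional can't be deleted without
    // destroying and recreating the bucket under CDK's control.
    const bucket = stage === 'prod'
      ? s3.Bucket.fromBucketName(this, 'Data', 'legacy-prod-data')
      : new s3.Bucket(this, 'Data');
  }
}
```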
In general I would say making changes to CDK/Terraform-created resources out-of-band (i.e., in the AWS console) is a bad idea, but it happens sometimes. Maybe you aren't exactly sure how a parameter maps to a feature in the AWS console, or there is an active outage and you need to make the change now, or the guy in IT that owns the root account decided to "fix" something for you. Whatever the cause, it's important to be able to run your IaC tool and 1) see the changes, 2) remediate them. CDK can detect some changes but not everything, and that's left me with very little confidence that things are actually set up correctly. No one's complaining, so I guess it's fine?
There are a few other quality-of-life things I don't like about CDK, but I don't think they are as important. It's slow. I don't like the diff output format. We always see diffs and different devs see different diffs, and we have no way to debug/resolve it. I'm boggled that a bunch of resource types (it's not clear which ones) default to not being deleted when you delete them from cdk. Overall, I get the feeling that "it works" and "no one's complaining" is the quality bar CDK is trying to meet, and that just isn't good enough for me.
The only reason I don't say Terraform is strictly better than CDK is that Terraform doesn't let you define your infrastructure with a 'real' programming language. I don't think that's an important feature, and it might tempt you into doing horrible things (like littering your codebase with "if STAGE" blocks). In general I think keeping infrastructure stuff simple and separate from application stuff is important, and HCL really encourages that. I've been bothered by the lack of programmable logic in Terraform, but honestly, I think that constraint has ultimately resulted in cleaner logic. If it's important to you, then Pulumi is always there.
Ugh, I took way too long writing this. To answer your question: IMHO, don't use CDK.