> But wouldn’t the point be that you don’t care about hardware level problems anymore?
No. Read my post again: I did not write about hardware issues.
Most work on optimization, reliability, or security requires digging through the whole stack, sometimes down to the kernel.
> When I find a node with issues, I can just delete it from the pool and get a fresh one back.
However, a lot of k8s deployments are on-premises, where you have to debug your own hardware.
> The bad network card? That’s for Google/Amazon/DigitalOcean to deal with.
First you have to pinpoint the root cause of that glitch, which affects every container running on VMs that use the same bad driver. Often the same driver could be running on 50% of your fleet.
> is gonna take you a week to set up properly.
Most certainly not. I've been deploying large production fleets in minutes since 2005.