
0 points · doteka · 5y ago

But wouldn’t the point be that you don’t care about hardware level problems anymore? When I find a node with issues, I can just delete it from the pool and get a fresh one back. The bad network card? That’s for Google/Amazon/DigitalOcean to deal with.

I find the bog-standard Prometheus chart gives me a pretty incredible level of monitoring out of the box. Usually it’s pretty easy to pick the bad node out of a graph.

Running your own VMs without something like k8s? Yeah, this setup that I can deploy and have working in an hour is gonna take you a week to set up properly. Standardization is valuable. Abstraction is valuable.


ex_amazon_sde · 5y ago

> But wouldn’t the point be that you don’t care about hardware level problems anymore?

No. Read my post again: I did not write about hardware issues.

Most work around optimization, reliability or security requires digging through the whole stack, sometimes down to the kernel.

> When I find a node with issues, I can just delete it from the pool and get a fresh one back.

However, a lot of k8s deployments are on-premise, where you have to debug your own hardware.

> The bad network card? That’s for Google/Amazon/DigitalOcean to deal with.

First you have to pinpoint the root cause of that glitch, which affects all the containers running on VMs that use the same bad drivers. Often those same drivers are on 50% of your fleet.

> is gonna take you a week to set up properly.

Most certainly not. I've been deploying large production fleets in minutes since 2005.
