Painless NGINX Ingress

(danielfm.me)

177 points · danielmartins · 8y ago · 49 comments

trjordan8y ago

I think this blog post is one turn of the crank away from a truth we're all about to learn: don't hand roll your own Kubernetes ingress.

Dealing with the traffic handling between your users and your code is not a trivial problem. Like all good ops problems, you can fix it with good tools, deep knowledge of those tools, fine-grained observability, and smart people running all that.

This has been the recipe for a couple of really successful SaaS offerings. Individual servers? Datadog. CDN? Akamai / Fastly.

Disclaimer: I work at one of those companies, Turbine Labs, and we're trying to make ingress better. Here's a presentation from our CEO on Kubernetes ingress, and why the specification creates the problems that this blog post is trying to fix. https://www.slideshare.net/mobile/MarkMcBride11/beyond-ingre...

odammit8y ago

This is a great read. I know the single cluster for all env is something that is sort of popular but it's always made me uncomfortable for the reasons stated in the article but also for handling kube upgrades. I'd like to give upgrades a swing on a staging server ahead of time rather than go straight to prod or building out a cluster to test an upgrade on.

I tend to keep my staging and prod clusters identical, even names of services (no prod-web and stage-web, just web).

I'll set them up in different AWS accounts to clearly separate them and the only difference they have is the DNS name of the cluster and who can access them.

Edit: I suck at italicizing and grammar.

web0078y ago

+100 to this. Why would any sane Op/Inf/SRE choose not to have at least account-level isolation - is it only a matter of cost due to under-utilization?

I prefer to have everything 100% isolated for dev / qa / stage / prod, and have process and tooling in place to explicitly cross the streams. This comes from a history of pain with random dev-to-prod (or worse, prod-to-dev) access and dealing with "real companies" with things like audit requirements.

Having them separate lets you do things like @odammit suggests, upgrade your cluster in staging without affecting your developers or customers.

If you don't want to go that far, you can set up separate AWS accounts that are all tied together via an organization, and you can set up IAM roles and whatnot to share your API keys between accounts. That gives you at least some isolation, but still lets you GSD the same way as if you have a single account.

toomuchtodo8y ago

> and you can set up IAM roles and whatnot to share your API keys between accounts. That gives you at least some isolation, but still lets you GSD the same way as if you have a single account.

Do not do this. You are defeating the purpose of account level separation if you're sharing API keys between accounts. Each AWS environment should be totally segregated from the others (cross-account IAM permissions only if you must), limiting the blast radius in the event of human error or a malicious actor.

Source: Previously did devops/infra for 6 years, currently doing security

danielmartinsOP8y ago

> Why would any sane Op/Inf/SRE choose not to have at least account-level isolation - is it only a matter of cost due to under-utilization?

In our particular case, yes, pretty much. We are a small company with a small development team, so even if I would want to split accounts to different teams, we would end up having one account for 2-3 users, which doesn't make a lot of sense now.

danielmartinsOP8y ago

> This is a great read. I know the single cluster for all env is something that is sort of popular but it's always made me uncomfortable for the reasons stated in the article but also for handling kube upgrades. I'd like to give upgrades a swing on a staging server ahead of time rather than go straight to prod or building out a cluster to test an upgrade on.

I've been doing patch-level upgrades in-place since the beginning, and never had a problem. For more sensitive upgrades, this is what I do: create a new cluster based on the current state in order to test the upgrade in a safe environment before applying it to production.

And for even more risky upgrades, I go blue/green-like by creating a new cluster with the same stuff running in it, and gradually shifting traffic to the new cluster.
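
A rough sketch of the gradual shift described above, in Python. How the traffic is actually moved isn't stated in this thread, so everything here is an assumption: the step count, the percent weights, and the names `shift_schedule` and `pick_cluster` are hypothetical. In practice this would more likely live in weighted DNS or load balancer config than in application code.

```python
import random


def shift_schedule(steps):
    """Return (old_percent, new_percent) pairs moving traffic in equal steps."""
    schedule = []
    for i in range(steps + 1):
        new_percent = round(100 * i / steps)
        schedule.append((100 - new_percent, new_percent))
    return schedule


def pick_cluster(new_percent, rng=random):
    """Send a request to the new cluster with probability new_percent / 100."""
    return "new" if rng.random() * 100 < new_percent else "old"


if __name__ == "__main__":
    for old_percent, new_percent in shift_schedule(4):
        sample = [pick_cluster(new_percent) for _ in range(1000)]
        print(old_percent, new_percent, sample.count("new"))
```

The point of stepping is that each stage can be held until the dashboards look healthy, and rolled back by returning to the previous weight.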

hltbra8y ago

Cool read. I don't use Kubernetes but I learned a few things from this blog post that are applicable to my ECS environment.

The NGINX config part is tricky and it didn't come to mind that many programs will try to be smart about machine resources and it won't work in the container world as expected. This was a good reminder. OP didn't mention what Linux distro he's using and what are all of the OS-level configs he changed in the end of the day; I'd like to see that (was there any config not mentioned in the post?).

It's awesome that OP had lots of monitoring to guide him through the problem discovery and experimentation. I need more of this in my ECS setup. I didn't hop on the Prometheus train yet, by the way.
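
To make the "programs being smart about machine resources" point concrete, here is a minimal sketch in Python of sizing a worker count from the container's CPU limit instead of the host's CPU count. It assumes a cgroup v2 host, where the limit is exposed in `/sys/fs/cgroup/cpu.max`; older cgroup v1 hosts expose it differently. The function names are hypothetical and this is not what the blog post does.

```python
import math
import os


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>".

    Returns the CPU limit as a float, or None when no limit is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_workers(host_cpus, cpu_limit):
    """Pick a worker count that respects the container limit (at least 1)."""
    if cpu_limit is None:
        return host_cpus
    return max(1, min(host_cpus, math.ceil(cpu_limit)))


def read_cpu_limit(path="/sys/fs/cgroup/cpu.max"):
    """Read the limit from the cgroup v2 file, or None if it is unavailable."""
    try:
        with open(path) as f:
            return parse_cpu_max(f.read())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(effective_workers(os.cpu_count() or 1, read_cpu_limit()))
```

On a 64-core node with a 2-CPU limit, this yields 2 workers rather than 64, which is the mismatch the comment above is describing.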

danielmartinsOP8y ago

> OP didn't mention what Linux distro he's using and what are all of the OS-level configs he changed in the end of the day.

I'm using Container Linux, and yes, I did a few modifications, but I intentionally left them out of the blog post as someone would be tempted to use them as-is.

I'll share more details in that regard if more people seem interested.

robszumski8y ago

I'd be interested to hear more.

hardwaresofton8y ago

Shameless plug! The insights in this article are pretty deep but if you're looking for just a clumsy step 1 to setting up the NGINX ingress controller on Kubernetes, check out what I wrote:

https://vadosware.io/post/serving-http-applications-on-kuber...

The most important thing that I found out while working on the NGINX controller was that you can just jump into it and do some debugging by poking around at the NGINX configuration that's inside it. There's no insight in there as deep as what's in this article, but for those that are maybe new to Kubernetes, hope it's helpful!

Thaxll8y ago

"Most Linux distributions do not provide an optimal configuration for running high load web servers out-of-the-box; double-check the values for each kernel param via sysctl -a."

This is not true, if you run Debian / CentOS7 / Ubuntu, out of the box the settings are good. The thing you don't want to do is start to modify the network stack by reading random blogs.

danielmartinsOP8y ago

> This is not true, if you run Debian / CentOS7 / Ubuntu, out of the box the settings are good. The thing you don't want to do is start to modify the network stack by reading random blogs.

I agree these are good defaults, but they are not meant to work well for all kinds of workloads. And yes, if things are working for you the way they are, that's okay; there's no need to change anything.

On the other hand, I personally don't know anyone who runs production servers of any kind on top of unmodified Linux distros.
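
For anyone who wants to do the "double-check the values" step without eyeballing `sysctl -a`, here is a minimal Python sketch that reads a few keys from `/proc/sys` and reports the ones below a chosen floor. The two keys are real kernel parameters, but the floor values are placeholders picked for illustration, not recommendations, and the function names are hypothetical.

```python
from pathlib import Path


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return Path(root) / key.replace(".", "/")


def read_sysctl(key, root="/proc/sys"):
    """Return the first integer value of a sysctl key, or None if unreadable."""
    try:
        return int(sysctl_path(key, root).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def below_floor(current, floors):
    """Return {key: (current, floor)} for keys whose value is below the floor."""
    return {
        key: (current[key], floor)
        for key, floor in floors.items()
        if key in current and current[key] < floor
    }


if __name__ == "__main__":
    # Placeholder floors for illustration only; tune for your own workload.
    floors = {"net.core.somaxconn": 1024, "net.ipv4.tcp_max_syn_backlog": 1024}
    current = {}
    for key in floors:
        value = read_sysctl(key)
        if value is not None:
            current[key] = value
    for key, (have, want) in below_floor(current, floors).items():
        print(f"{key}: {have} < {want}")
```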

tinix8y ago

> On the other hand, I personally don't know anyone who runs production servers of any kind on top of unmodified Linux distros.

You are so, so so lucky... lol. I say that as someone who has come across a desktop CentOS install on a server on multiple occasions, complete with running x-org and like 3-4 desktop environments to choose from, along with ALL of the extras. KDE office apps, Gnome's office apps, etc... HORRIBLE.

zrth8y ago

Sounds interesting! Do you have urls for more information about this? Would love to read good posts about that! My production servers have been running with standard parameters at every company so far. I feel I might be missing out!

manigandham8y ago

> high load web servers

Really? The distributions might work for the average site but high-load always requires tuning from the defaults on even the latest distros.

manigandham8y ago

NGINX also has their own ingress controller (in addition to the kubernetes community version): https://github.com/nginxinc/kubernetes-ingress

ultimoo8y ago

Great read!

>> "Let me start by saying that if you are not alerting on accept queue overflows, well, you should."

Does anyone know how to effectively keep tabs on this in a Docker container running NGINX open source? I have an external log/metrics monitoring server that could alert on this, but I'm asking more along the lines of how to get this information to the monitoring server.

foxylion8y ago

In this case (ingress controller) this is done with a Prometheus metric exporter. So all the metrics are available in Prometheus.

https://github.com/hnlq715/nginx-vts-exporter
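
If an exporter isn't an option, one alternative is to read the kernel counters directly. The sketch below, in Python, parses `/proc/net/netstat` and pulls out `ListenOverflows` and `ListenDrops` from the `TcpExt` section. It assumes the script runs inside the same network namespace as NGINX (for example, in the same container); shipping the numbers to a monitoring server is left out, and the function names are hypothetical.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {section: {counter: int}}.

    The file comes in line pairs: a header line of counter names followed by
    a line of values, both prefixed with the section name and a colon.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    stats = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        section, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        stats[section] = dict(zip(names.split(), map(int, numbers.split())))
    return stats


def overflow_delta(previous, current):
    """Return how much ListenOverflows grew between two TcpExt snapshots."""
    return current.get("ListenOverflows", 0) - previous.get("ListenOverflows", 0)


if __name__ == "__main__":
    try:
        with open("/proc/net/netstat") as f:
            tcp_ext = parse_netstat(f.read()).get("TcpExt", {})
    except OSError:
        tcp_ext = {}
    print(tcp_ext.get("ListenOverflows"), tcp_ext.get("ListenDrops"))
```

These are cumulative counters, so an alert should fire on the delta between two samples rather than on the absolute value.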

zaroth8y ago

It sounded like there's a config directive to have Ingress Controller push all its metrics into Prometheus?

guslees8y ago

If it's helpful at all, here's a concrete example of a k8s nginx setup that exports-to/is-monitored-by prometheus: https://github.com/bitnami/kube-manifests/blob/master/common... (Start at https://engineering.bitnami.com/articles/an-example-of-real-... if you would prefer to approach that repo top-down)

zaroth8y ago

Am I correct in assuming that there is the Kube Service IP routing happening via iptables DNAT to get the request into the Kube running the Ingress Controller, and then the Ingress Controller is on top of that routing traffic to another Service IP which also has to go through the iptables DNAT?

danielmartinsOP8y ago

No. By default, the NGINX ingress controller routes traffic directly to pod IPs (the Service endpoints):

https://github.com/kubernetes/ingress/tree/master/controller...

zaroth8y ago

Thank you. So there is a DNAT to get to the Ingress Controller but from there at least it's direct routing to the service endpoint(s)? Does that mean the Virtual IP given to the Service is basically bypassed when using Ingress Controller?

TLS termination at the Ingress Controller and by default unencrypted from there to the service endpoint?

I found this useful: http://blog.wercker.com/troubleshooting-ingress-kubernetes

Interesting discussion here: https://github.com/kubernetes/ingress/issues/257

It seems like a lot of overhead before even starting to process a request!

danielmartinsOP8y ago

> TLS termination at the Ingress Controller and by default unencrypted from there to the service endpoint?

We are doing TLS termination at the ELB (we're running on AWS).

> Interesting discussion here: https://github.com/kubernetes/ingress/issues/257

Great, thanks!

Regarding ways of updating the NGINX upstreams without requiring a reload, I was just made aware of modules like ngx_dynamic_upstream[1]. I'm sure there are other ways to address this in a less disruptive way than reloading everything, so this is probably something that could be improved in the future.

[1] https://github.com/cubicdaiya/ngx_dynamic_upstream

rjcaricio8y ago

Thanks for sharing your experience. I've got great insights to double check in my current environment.

Could you share which version of NGINX you found the issue with the reloads? Which version the fix was released?

PS.: I find it interesting/brave that you use a single cluster for several environments.

danielmartinsOP8y ago

> Could you share which version of NGINX you found the issue with the reloads? Which version the fix was released?

I'm using 0.9.0-beta.13. I first reported this issue in an NGINX ingress PR[1], so the last couple of releases are not suffering from the bug I reported in the blog post.

> I find it interesting/brave that you use a single cluster for several environments.

I'm not working for a big corporation, so dev/staging/prod "environments" are just three deployment pipelines to the same infrastructure.

As of now, things are running smoothly as they are, but I might as well use different clusters for each environment in the future.

[1] https://github.com/kubernetes/ingress/pull/1088

tostaki8y ago

Great read! Especially the part on ingress class which I didn't know about. Would you mind sharing some of your grafana dashboards?

mindfulmonkey8y ago

I still don't really understand the benefit of an Ingress controller versus just a Service > Nginx Deployment.

zimbatm8y ago

It's the most confusing part of Kubernetes IMO. It's a load-balancer with a very restricted feature set so what is it good for?

The main issue it tries to solve is how to get traffic from outside of the cluster to inside. The ingress resource is also supposed to be orthogonal to the ingress controller, so that the same resource works whether your app is deployed on AWS or GCP (in practice it's not true though).

With the nginx ingress controller the main advantage I see is that you can share the port 80 on the nodes between multiple Ingress resources.

sandGorgon8y ago

ingress+overlay network confusion was the reason why we moved from k8s to Docker Swarmkit.

I still keep hoping for kubernetes kompose (https://github.com/kubernetes/kompose) to bring the simplicity of Docker Swarmkit to k8s.

Or will Docker Infrakit bring creeping sophistication first and eat Kubernetes' lunch? (https://github.com/docker/infrakit/pull/601)

fulafel8y ago

Why does everyone use reverse proxies? It seems complex and inefficient. Why not serve XHRs and other dynamic content from the app server(s) and static content from a static webserver?

odammit8y ago

Off the top of my head: load balancing, hiding details of app servers, compressing responses and multivariate testing.

All of which could be done at the app server level sure, but then that would shift that complexity to your app and your developers.

Oh and job security, obviously.

fulafel8y ago

You could do all of those, except hiding app servers, with the client-based technique I outlined in the other nearby comment. It would just be a tweak to the rule that the frontend uses to choose the app server.

manigandham8y ago

That's a simplistic scenario and does not apply at all here. Kubernetes is a container orchestration platform that can run thousands of containers over thousands of compute nodes and directing traffic to them will require some sort of routing/proxy system.

fulafel8y ago

We already have routing systems for large numbers of nodes in the internet technology stack, it's not obvious to me why we need another one on the HTTP layer.

manigandham8y ago

Many of those routing systems are proxies, and they can apply at any layer.

philipcristiano8y ago

What would you use to provide a single endpoint to multiple instances of an app server?

endorphone8y ago

There are scenarios where your app servers might be varied as well -- I've leveraged reverse proxies in front of a PHP application that had parts in .NET and parts in Go, for instance.

Technologies/competencies change as projects evolve, and being able to effortlessly reorganize and reroute is so profoundly powerful.

fulafel8y ago

Sure, I'm sympathetic to this kind of "in the trenches" application of reverse proxies - just not doing it by default.

fulafel8y ago

What about just exposing the multiple instances of app server, and have the frontend code select one for load balancing or failover purposes? There could be a load balancing config read by the client, or you can have static rules in the frontend js, like choosing shard number based on a hash from the client ip address.

Round-robin DNS might also work or complement this.
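
As a sketch of the static-rule idea above: hash a client key to pick a shard, then walk forward to the next server if the chosen one is down. The comment is about frontend JS; this is written in Python purely for illustration, and the server names, the `healthy` set, and the function names are all hypothetical.

```python
import hashlib


def pick_server(client_key, servers):
    """Deterministically map a client key (e.g. an IP string) to one server."""
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def pick_with_failover(client_key, servers, healthy):
    """Start at the hashed server and walk forward to the first healthy one."""
    start = servers.index(pick_server(client_key, servers))
    for offset in range(len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if candidate in healthy:
            return candidate
    return None


if __name__ == "__main__":
    servers = ["app-0.example.com", "app-1.example.com", "app-2.example.com"]
    print(pick_with_failover("203.0.113.7", servers, healthy=set(servers)))
```

Note that the client needs some way to learn which servers are healthy, and changing the server list remaps most clients, which is part of what a proxy in front would otherwise handle.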

manigandham8y ago

So your answer to not using a reverse proxy is to fake your own via client-side logic? It's far better to have a tested, reliable, dynamic, and scalable solution right next to the actual app servers instead.

Almost everything on the internet is behind layers of proxies, it's not a bad thing and isn't much cause for concern.
