My first thought is that the spikes are most likely the result of requests being sent to pods that no longer exist, or that are still starting up and not yet ready to process requests. If so, this might just reflect how each of the three underlying proxies was configured and say nothing about how well they actually perform as load balancers.
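For instance (a hypothetical sketch, not from the article), Kubernetes only routes Service traffic to a pod once its readiness probe passes, so a missing or too-lenient probe would produce exactly this kind of spike during rollouts:

```yaml
# Hypothetical Deployment snippet: without a readinessProbe, Kubernetes
# marks the pod ready as soon as the container starts, so the Service may
# route requests to it before the app can actually serve them.
containers:
  - name: app
    image: example/app:latest    # placeholder image
    readinessProbe:
      httpGet:
        path: /healthz           # assumed health endpoint
        port: 8080
      initialDelaySeconds: 2
      periodSeconds: 5
```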
If someone came to me with this at work, I would say it is the starting point for a series of troubleshooting steps to answer why there are such outlying requests with our load balancer of choice, not an analysis of which software to pick.
Edit: Even worse, this appears to be from a company that sells... an API gateway built on top of Envoy.
Thanks for the feedback.
Regarding your hypothesis that the spikes come from requests being sent to pods that no longer exist or are still starting: 1) on Kubernetes, it is the ingress controller's responsibility to handle that situation properly, 2) it would be highly unlikely for people to implement their own custom ingress controller around a given proxy (it's actually fairly complicated), and 3) the pod theory wouldn't explain the latency spikes seen on reconfiguration.
But you're right that there should probably be some explanation of why we think this is happening (I just didn't want to speculate too much; I suspect the issue is with the hitless-reload implementation in the proxies, which is tricky to do well).
> We measure latency for 10% of the requests, and plot each of these latencies individually on the graphs.
So, for what it's worth, these spikes may well be single requests that aren't representative and were only triggered by the way the Kubernetes cluster was being manipulated for the test.
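A toy sketch (hypothetical numbers, not from the benchmark) of why plotting individually sampled latencies makes a single slow request show up as a dramatic spike, while aggregate percentiles barely move:

```python
import random

random.seed(0)
# Simulate 10,000 request latencies in ms: mostly fast, one slow outlier
# (e.g. a request that hit a proxy mid-reload).
latencies = [random.uniform(1.0, 5.0) for _ in range(10_000)]
latencies[1230] = 950.0

# "We measure latency for 10% of the requests" -- sample every 10th request.
sample = latencies[::10]

def p99(xs):
    """99th-percentile latency of a list of samples."""
    xs = sorted(xs)
    return xs[int(0.99 * len(xs)) - 1]

# On a scatter plot of individual samples, the 950 ms point is a lone spike,
# yet the aggregate p99 stays in the single-digit-millisecond range.
print(f"max sampled latency: {max(sample):.1f} ms, p99: {p99(sample):.2f} ms")
```

The design point: a scatter of raw sampled latencies surfaces every captured outlier, whereas percentile summaries hide a single slow request entirely, so the visible "spikes" can each be just one request.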