Did you ever consider envoy xDS?
There are a lot of really cool things in envoy like outlier detection, circuit breakers, load shedding, etc…
What we (think we) know won't work is a topologically centralized database that uses distributed consensus algorithms to synchronize. Running consensus transcontinentally is very painful, and keeping the servers central (so that update proposals are local and the protocol can run quickly) subjects large portions of the network to partition risk. The natural response (what I think a lot of people do, in fact) is just to run multiple consensus clusters, but our UX includes a global namespace for customer workloads.
> Running consensus transcontinentally is very painful
You don’t necessarily have to do that: you can keep your quorum nodes (let’s assume we are talking about etcd) far enough apart to be in separate failure domains (fires, power loss, natural disasters), but close enough that network latency between the replicas isn’t unbearably high.
I have seen the following scheme work for millions of workloads:
1. Etcd quorum across 3 close, but independent regions
2. On startup, the app registers itself under a prefix shared by all other replicas of that app
3. All clients of that app issue etcd watches on that prefix and are notified almost instantly when there is a change. This is baked into the gRPC clients as a plugin.
4. A custom gRPC resolver is used to do lookups by service name
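A toy sketch of steps 2–4, with an in-memory class standing in for etcd (the names here are illustrative, not etcd's actual API; a real deployment would use etcd v3 leases, prefix watches, and a gRPC resolver plugin):

```python
from collections import defaultdict

class Registry:
    """In-memory stand-in for etcd: a key/value store with prefix watches."""

    def __init__(self):
        self.store = {}                    # key -> endpoint
        self.watchers = defaultdict(list)  # prefix -> list of callbacks

    def watch_prefix(self, prefix, callback):
        # Clients watch a shared prefix and get notified on any change (step 3).
        self.watchers[prefix].append(callback)

    def register(self, key, endpoint):
        # On startup, a replica registers under the shared prefix (step 2).
        self.store[key] = endpoint
        self._notify(key, "PUT", endpoint)

    def deregister(self, key):
        endpoint = self.store.pop(key, None)
        self._notify(key, "DELETE", endpoint)

    def lookup(self, prefix):
        # What a custom resolver does: list endpoints by service prefix (step 4).
        return sorted(v for k, v in self.store.items() if k.startswith(prefix))

    def _notify(self, key, event, endpoint):
        for prefix, callbacks in self.watchers.items():
            if key.startswith(prefix):
                for cb in callbacks:
                    cb(event, key, endpoint)

registry = Registry()
events = []
registry.watch_prefix("/services/web/", lambda *e: events.append(e))

registry.register("/services/web/replica-1", "10.0.0.1:8080")
registry.register("/services/web/replica-2", "10.0.0.2:8080")
print(registry.lookup("/services/web/"))  # both replicas visible to the resolver
registry.deregister("/services/web/replica-1")
print(registry.lookup("/services/web/"))  # only replica-2 remains
```

The point of the watch is that clients don't poll: the store pushes changes to them, which is why updates propagate "almost instantly."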
Two other details that are super important here:
This is a public cloud. There is no real correlation between apps/regions and clients. Clients are public Internet users. When you bring an app up, it just needs to work, for completely random browsers on completely random continents. Users can and do move their instances (or, more likely, reallocate instances) between regions with no notice.
The second detail is that no matter what DX compromise you make to scale global consensus up, you still need reliable, realtime updates about instances going down. Not knowing about a new instance that just came up isn't that big a deal: you just get less optimal routing for the request. Not knowing that an instance went down is a very big deal: you end up routing requests to dead instances.
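The usual mechanism for this is lease-based liveness, which etcd supports natively: a registration expires unless the instance keeps heartbeating, so a dead instance falls out of routing instead of silently receiving traffic. A minimal sketch of the idea (the class and TTL values are made up for illustration):

```python
class LeasedRegistry:
    """Registrations carry a TTL lease; entries vanish if heartbeats stop."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (endpoint, time of last heartbeat)

    def register(self, key, endpoint, now):
        self.entries[key] = (endpoint, now)

    def heartbeat(self, key, now):
        # A live replica periodically renews its lease.
        endpoint, _ = self.entries[key]
        self.entries[key] = (endpoint, now)

    def live_endpoints(self, now):
        # Routing only ever sees entries whose lease hasn't expired.
        return sorted(ep for ep, beat in self.entries.values()
                      if now - beat < self.ttl)

reg = LeasedRegistry(ttl_seconds=5)
reg.register("/svc/a", "10.0.0.1:80", now=0)
reg.register("/svc/b", "10.0.0.2:80", now=0)
reg.heartbeat("/svc/a", now=4)    # a keeps heartbeating; b went silent
print(reg.live_endpoints(now=6))  # b's lease expired: only 10.0.0.1:80 remains
```

The tradeoff is the TTL itself: a short lease detects death quickly but tolerates less network jitter before evicting a healthy instance.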
The deployment strategy you're describing is in fact what we used to do! We had a Consul cluster in North America and ran the global network off it.
Many people will read a comment like this and cargo-cult an implementation (“millions of workloads”, you say?!) without knowing how they are going to handle the many different failure modes that can result, or even at what scale the solution will break down. Then, when the inevitable happens, panic and potentially data loss will ensue. Or, the system will eventually reach scaling limits that will require a significant architectural overhaul to solve.
TL;DR: There isn’t a one-size-fits-all solution for most distributed consensus problems, especially ones that require global consistency and fault tolerance, and on top of that have established upper bounds on information propagation latency.