FWIW, I personally think it's a good thing that other teams within Google don't have too much of an "advantage" in getting features into Chrome compared to other web developers. However, I also think it's very unfortunate that a single Chrome engineer gets to decide not only that a feature shouldn't be implemented in Chromium, but also, in effect, that it gets removed from the specification. (The linked issue [1] was also opened by a Google employee.)
Of course, you might reasonably argue that, without consensus among the browsers to implement a feature, having it in the spec is useless. But nevertheless, with Chromium being an open source project, I think it would be better if it had a more democratic process of deciding which features should be supported (without, of course, requiring Google specifically to implement them, but also without, ideally, giving Google the power to veto them).
Apparently, there is a new issue for it, so that might yet happen: [2].
Anyone who wants to pay can implement whatever they want in the codebase. That’s in a way as democratic as it gets: equality of opportunity [to invest money and time].
If Google is paying for the implementors’ time, Google should have 100% say in what code they write. You and everyone else are free (thanks to Google’s generosity) to fork it at any point in the commit history and individually veto any specific change.
Leaving aside whether that's how it should work, I'm not sure if that's in fact how it works for Chromium today. If I write a high-quality patch adding support for trailers, will it get accepted? As I understand it, the answer is no. (But I would be happy to be wrong.)
So that's my main point: it would be good to have a democratic decision making process, not for what code Googlers should write, but for what patches would get accepted into Chromium. Not just because it's open source, but also because it's the basis not just of Google's browser, but a bunch of other browsers as well.
(And note that https://www.chromium.org/ seemingly aims to give the project an air of independence from Google. Thus, I'm merely questioning whether it is, in fact, independent, and arguing that it should be, if it isn't.)
If everyone else is not convinced, then it should not become a thing, no matter how much one party with money wants it.
> The Trailer response header allows the sender to include additional fields at the end of chunked messages in order to supply metadata that might be dynamically generated while the message body is sent, such as a message integrity check, digital signature, or post-processing status.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Tr...
https://datatracker.ietf.org/doc/html/rfc2616#section-3.6.1
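To make the quoted mechanism concrete, here's a sketch in Python (with a made-up `X-Checksum` field name) of what a chunked HTTP/1.1 response carrying a trailer looks like on the wire: the `Trailer` header announces the field up front, and the field itself arrives only after the terminating zero-length chunk.

```python
import hashlib

body_chunks = [b"hello, ", b"world"]
digest = hashlib.sha256(b"".join(body_chunks)).hexdigest()

wire = b"HTTP/1.1 200 OK\r\n"
wire += b"Transfer-Encoding: chunked\r\n"
wire += b"Trailer: X-Checksum\r\n"       # announce the trailer field up front
wire += b"\r\n"
for chunk in body_chunks:
    wire += b"%x\r\n%s\r\n" % (len(chunk), chunk)    # hex size line, then chunk data
wire += b"0\r\n"                         # zero-length chunk terminates the body
wire += b"X-Checksum: " + digest.encode() + b"\r\n"  # the trailer itself
wire += b"\r\n"
```

This is why the checksum can be computed while the body streams out: the sender doesn't have to know it before the first byte leaves.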
Ever gone to a site that generates compressed downloads or database exports on the fly, gotten no progress bar as a result, and been severely annoyed by that lack of feedback? I have, so I used chunk extensions in a draft I submitted to emit progress information dynamically:
https://datatracker.ietf.org/doc/html/draft-lnageleisen-http...
As noted at the end of the draft, this could be generalized and extended with additional capabilities such as in-flight integrity checks, or whatever else you can think of.
This simple design led us to find several bugs in clients of this API (e.g. messages dropped or processed twice), and gave us a way to avoid some of the issues mentioned in this article. Even if you don't use HTTP trailers, you can still use them one layer above and benefit from similar guarantees.
WTF is this? Those are protocols at different layers. WebSocket can run on top of HTTP/2.
It's like saying TLS is technically superior to TCP, or IP is superior to copper cables.
Reference: https://www.rfc-editor.org/rfc/rfc8441.html
WebSockets over HTTP/2 are a new thing, and weren't even available at the time gRPC was conceived.
As far as I know for HTTP/3 there is no way to use websockets yet.
But in general, http is & could be the de-facto really good resourceful protocol. It's already 90% there.
Alas, the browser has been a major point of obstruction & difficulty & tension in making the most obvious, most successful, most clearly winning resourceful protocol any better. The browser has sat on its haunches & pissed around & prevented obvious & straightforward incremental growth that has happened everywhere else except the browser, such as with HTTP trailers, such as with HTTP/2+ push. The browser has kept the de-facto resourceful protocol from developing.
You don't have to believe in HTTP as the way to see what an oppressive & stupid shitshow this is. Being able to make the de-facto protocol of the web better should be in everyone's interest, in a way that doesn't prevent alternatives/offshoots. But right now only alternatives & offshoots have any traction, because the browsers have all shot down & rejected doing anything to support modern HTTP/2+ in any real capacity. Their HTTP implementations are all frozen in time.
While the individual messages can't be multiplexed, different websocket streams over a single HTTP/2 connection can be multiplexed.
I think websockets also provides a feature that HTTP/2 doesn't: the ability to easily push data from the server to browser javascript.
WebSocket can't multiplex. But nothing prevents gRPC over WebSocket from implementing multiplexing itself.
Telephony (websockets) runs over the copper cables or fiber optics (HTTP/1 or HTTP/2).
But I only skimmed the introduction, did I miss something?
Isn't a "websocket" just a standard TCP socket, whose instantiation was negotiated in a comparatively ephemeral HTTP request (of whatever version), and which outlives that request, so that it isn't on top of anything other than TCP?
It's my belief that requiring TLS for HTTP/2 is what killed the protocol. It just causes too much friction during both development and deployment, for little to no (or negative) performance gain.
Thrift has a much more sane design where everything is pluggable including the transport layer.
Bit of a shame that Thrift never became more popular.
I don’t get that argument. gRPC uses length-prefixed protobuf messages. It is obvious to the peer whether a complete message (inside a stream or a single response) has been received, with or without trailers.
The only thing that trailer support adds is the ability to send an additional late response code. That could also have been added without trailers: just put another length-prefixed block inside the body stream, with a flag in front that differentiates trailers from a message. Essentially protobuf (application messages) in protobuf (definition of the response body stream).
I assume someone thought trailers would be a neat thing that was already part of the spec and could do the job. But the bet didn’t work out, since browsers and most other HTTP libraries didn’t find them worthwhile enough to fully support.
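For what it's worth, the scheme proposed here is essentially what gRPC-Web ended up doing: it reuses the 5-byte envelope (one flag byte plus a 4-byte big-endian length) and sets the most significant flag bit to mark a trailers frame. A rough sketch with made-up helper names:

```python
import struct

TRAILER_FLAG = 0x80  # gRPC-Web marks a trailers frame with the MSB of the flags byte

def frame(payload: bytes, is_trailer: bool = False) -> bytes:
    """Wrap payload in the 5-byte envelope: 1 flag byte + 4-byte big-endian length."""
    flags = TRAILER_FLAG if is_trailer else 0x00
    return struct.pack(">BI", flags, len(payload)) + payload

def deframe(stream: bytes):
    """Yield (is_trailer, payload) pairs from a byte stream of frames."""
    offset = 0
    while offset < len(stream):
        flags, length = struct.unpack_from(">BI", stream, offset)
        offset += 5
        yield bool(flags & TRAILER_FLAG), stream[offset:offset + length]
        offset += length

wire = frame(b"message-1") + frame(b"message-2") + frame(b"grpc-status: 0", is_trailer=True)
frames = list(deframe(wire))
```

The trailer payload here (`grpc-status: 0`) is illustrative; the point is that the flag byte, not HTTP trailers, distinguishes the late metadata from ordinary messages.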
> Additionally, they chose to keep Protobuf as the default wire format, but allow other encodings too.
And:
> Since streaming is a primary feature of gRPC, we often will not know the length of the response ahead of time.
These make sense; you'd enable servers to start streaming back the responses directly as they were generating them, before the length of the response could be known. Not requiring servers to hold the entire response can have drastic latency and memory/performance impact for large responses.
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2....
Disclaimer: I don't know much about gRPC.
If I'm reading [1] correctly, you can't distinguish between [repeated element X is empty] and [message truncated before repeated element X was received], because "A packed repeated field containing zero elements does not appear in the encoded message." You'd need X to be guaranteed to come last in the message, but even that can't be relied on, because "When a message is serialized, there is no guaranteed order [...] parsers must be able to parse fields in any order".
[1] https://developers.google.com/protocol-buffers/docs/encoding...
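A hand-rolled sketch of the wire format (toy schema, made-up helper names) shows the ambiguity: a message truncated before a packed repeated field arrives is byte-for-byte a valid message in which that field is empty.

```python
def encode_varint(n: int) -> bytes:
    out = bytearray()
    while True:
        b, n = n & 0x7F, n >> 7
        out.append(b | 0x80 if n else b)   # continuation bit on all but the last byte
        if not n:
            return bytes(out)

def encode_packed_field1(values) -> bytes:
    """Hand-encode `repeated int32 xs = 1 [packed = true]`."""
    payload = b"".join(encode_varint(v) for v in values)
    if not payload:
        return b""  # an empty packed field simply does not appear on the wire
    return b"\x0a" + encode_varint(len(payload)) + payload  # tag 0x0a = field 1, wire type 2

def decode(msg: bytes) -> dict:
    """Minimal decoder for a toy schema: packed varints in field 1, a plain varint in field 2."""
    def varint(i):
        n = shift = 0
        while True:
            byte = msg[i]; i += 1
            n |= (byte & 0x7F) << shift
            if not byte & 0x80:
                return n, i
            shift += 7
    fields, i = {1: [], 2: None}, 0
    while i < len(msg):
        tag, i = varint(i)
        if tag & 7 == 0:                    # wire type 0: plain varint (field 2)
            fields[2], i = varint(i)
        else:                               # wire type 2: length-delimited (packed field 1)
            length, i = varint(i)
            end = i + length
            while i < end:
                v, i = varint(i)
                fields[1].append(v)
    return fields

field2 = b"\x10" + encode_varint(42)             # field 2 = 42, serialized first
full = field2 + encode_packed_field1([1, 2, 3])
truncated = full[:len(field2)]                   # stream cut off before field 1 arrived
# Both parse cleanly; the truncated bytes are indistinguishable from xs == [].
```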
But it looks to me like the gRPC spec says that everything must be prefixed by a length at the gRPC layer. So then it doesn't matter that protobuf doesn't internally indicate the end, since the gRPC transport will indicate the end.
https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2....
Disclaimer: I don't know much about gRPC.
Should you do it? Probably not, as it's not just a dumb translation layer and it is extremely complex (e.g. it needs to support streams, which is non-trivial in this situation). For Google it's worth it, because this way you only have to handle protobufs beyond the GFE layer.
https://grpc.io/blog/wireshark/
Wireshark can load proto files and decode the data for you.
BTW, "The Internet is running in debug mode".
Remote Procedure Calls attempt to abstract away the networked nature of the function and make it "look like" a local function call. That's Just Wrong. When two networked services are communicating, the network must be considered.
REST relies on the media type, links and the limited verb set to define the resource and the state transfer operations to change the state of the resource.
HTTP explicitly incorporates the networked nature of the server/client relationship, independent of, and irrespective of, the underlying server or client implementation.
Media types, separated from the HTTP networking, define the format and serialization of the resource representation independent of the network.
HTTP/REST doesn't really support streaming.
It's not true of gRPC. It's not "RPC" in any traditional sense - it's just a particular HTTP convention, and the clients reflect that. They're asynchronous, make deadlines and transport errors first-class concepts, and make it easy to work with HTTP headers (and trailers, as the article explains). Calling a gRPC API with a generated client often doesn't feel too different from using a client library for a REST API.
It's definitely a verb-oriented style, as opposed to REST's noun orientation. That's sometimes a plus, and sometimes a minus; it's the same "Kingdom of Nouns" debate [0] that's been going on about Java-style OOP for years.
0: http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom...
The generated client from an IDL that wraps the network protocol with a function call is also part of the problem.
REST APIs that have function calls that aren't "Send Request and wait for Response" aren't REST. ("wait for response" doesn't imply synchronous implementation, but HTTP is a request/response oriented protocol).
Both of those are explicitly centered on objects and locality transparency. The idea is that you get back references to objects, not mere copies of data. Those references act as if they're local, but the method calls actually translate into RPC calls, as the local reference is just a "stub". Objects can also contain references, meaning that you are working on entire object graphs, which can of course be cyclical, too.
These technologies (as well as Microsoft's DCOM) failed for many reasons, but it was in part because pretending remote objects are local leads to awful performance. Pretty magical and neat, but not fast. I built a whole clustered system on DCOM back in the late 1990s, and it was rather amazing, but we were also careful to not fall into the traps. One of the annoying bugs you can create is to accidentally hold on to a reference for too long in the client (by mis-implementing ref counting, for example); as long as you have a connection open to the server, this creates a server memory leak, because the server has to keep the object alive for as long as clients have references to them.
Ultimately, gRPC and other "simple" RPC technologies like Thrift are much easier to reason about precisely because they don't do this. An RPC call is just passing data as input and getting data back as output. It maps exactly to an HTTP request and response.
As for REST, almost nobody actually implements REST as originally envisioned, and APIs today are merely "RESTful", which just means they try to use HTTP verbs as intended, and represent URLs paths that map as cleanly to the nouns as possible. But I would argue that this is just RPC. Without the resource-orientation and self-describability that comes with REST, you're just doing RPC without calling it RPC.
I don't believe in REST myself (and very few people appear to, otherwise we'd have actual APIs), so I lament the fact that we haven't been able to figure out a standard RPC mechanism for the web yet. gRPC is great between non-browser programs, mind you.
The difference is during the design phase, where you focus on those resources and their state, instead of the process for changing that state.
If trailers are used for things such as checksums, then the client must wait patiently for potentially gigabytes of data to stream to it before it can verify the data integrity and start processing it safely.
If the data is sent chunked, then this is not an issue. The client can start decoding chunks as they arrive, each one with a separate checksum.
1. When transferring large amounts of data, the checksum for the full transfer can't be verified until all the data is received. If you want to (for example) download an Ubuntu ISO and verify its checksum before installing it, you'll have to buffer the data somewhere until the download finishes.
2. When transferring small amounts of data, such as individual chunks, the data integrity is (/ should be) automatically verified by the encryption layer[0] of the underlying transport. There's no point in putting a shasum into each chunk because if a bit gets flipped in transit then that chunk will never even arrive in your message handler.
3. In gRPC, chunking large data transfers is mandatory because the library will reject Protobuf messages larger than a few megabytes[1]. As the chunks of data arrive, they can be passed into a hash function at the same time as you're buffering them to disk.
[0] gRPC supports running without encryption for local development, but obviously for real workloads you'd do end-to-end TLS.
[1] IIRC the default for C++ and Go implementations of gRPC is 4 MiB, which can be overridden when the client/server is being initialized. For bulk data transfer there's also the Protobuf hard limit of 2GB[2] for variable-length data.
[2] https://developers.google.com/protocol-buffers/docs/encoding
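Point 3 above can be sketched as a loop that feeds each arriving chunk into a running hash while spooling it to disk, so the full transfer never has to be buffered in memory just to be checksummed (file name and helper name are made up):

```python
import hashlib, os, tempfile

def download(chunks, out_path):
    """Stream chunks to disk while updating a running hash."""
    h = hashlib.sha256()
    with open(out_path, "wb") as f:
        for chunk in chunks:
            h.update(chunk)   # hash each chunk as it arrives...
            f.write(chunk)    # ...while spooling it to disk
    return h.hexdigest()

# Demo: two "chunks" of a larger transfer.
chunks = [b"a" * 1024, b"b" * 512]
path = os.path.join(tempfile.gettempdir(), "demo-download.bin")
digest = download(chunks, path)
expected = hashlib.sha256(b"".join(chunks)).hexdigest()
```

The final `digest` would then be compared against the checksum delivered at the end of the transfer (e.g. in a trailer).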
As an aside, HTTP/2 is technically superior to WebSockets. HTTP/2 keeps the semantics of the web, while WS does not. Additionally, WebSockets suffers from the same head-of-line blocking problem HTTP/1.1 does.
Not really a fair comparison. WebSockets is essentially a bidirectional stream of bytes without any verbs[0] or anything fancy; it's more like a fancy CONNECT. And speaking of bidirectional streams of bytes... HTTP/2 suffers from head-of-line blocking as well, since it uses TCP as its substrate, after all. QUIC, however, despite sharing some ideas with TCP, ameliorates this by delivering streams independently, and may go further with multipath[1]. It remains to be seen whether this will indeed be beneficial, however.
[0] - unless you count its opcode field as something similar to HTTP verbs but if so it'd resemble more TCP than HTTP, I think
[1] - https://datatracker.ietf.org/doc/html/draft-ietf-quic-multip...
It was a product choice not to offer a fallback path when HTTP/2 was unavailable. That choice made gRPC impossible to deploy in a lot of real-world environments.
What motivated that choice?
The author is convinced they're needed. But I wonder whether some sort of error signaling should have been baked into `Transfer-Encoding: chunked` instead. It wouldn't have made sense in HTTP/1.1, since you can just close the connection. But in later HTTP versions with multiplexed requests, I can see the use for bailing on one request while keeping the rest alive.
It did not to me.
I would rephrase the argument as:
- In HTTP/2 we thought we were being smart by multiplexing multiple HTTP transactions over a single TCP connection.
- Shit, we realized later that HTTP/1.1 did not necessitate trailers because you could just abort the connection, and we can't afford to do that anymore, we are multiplexed now... Shit, we do need trailers now.
->
- That is a good example of good intentions inducing complexity, and complexity inducing even more complexity for free.
- HTTP/2 is now rolled out over the world and everybody has to deal with that.
- Still, HTTP/2 suffers from several problems completely ignored by the blog post (like the head-of-line blocking problem: it is not solved in HTTP/2). The result is now QUIC + HTTP/3, and we all start over again.
This is an interesting point, but I don't think it's correct. The HTTP/2 spec allows you to send a RST_STREAM frame to indicate that an individual stream had a problem, to be contrasted with the END_STREAM flag that indicates an individual stream ended successfully.
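To make the distinction concrete, here's a pure-Python sketch of the two frames involved, following the RFC 7540 layout (the `h2_frame` helper name is made up):

```python
import struct

DATA, RST_STREAM = 0x0, 0x3   # frame types (RFC 7540 sections 6.1 and 6.4)
END_STREAM = 0x1              # a flag carried on DATA/HEADERS frames, not a frame type

def h2_frame(ftype, flags, stream_id, payload=b""):
    """9-byte HTTP/2 frame header: 24-bit length, type, flags, 31-bit stream id."""
    header = struct.pack(">I", len(payload))[1:]  # length as 24-bit big-endian
    header += struct.pack(">BBI", ftype, flags, stream_id & 0x7FFFFFFF)
    return header + payload

# Clean end of stream 1: an empty DATA frame with END_STREAM set.
clean_end = h2_frame(DATA, END_STREAM, 1)

# Abnormal end of stream 1: RST_STREAM carrying error code INTERNAL_ERROR (0x2).
aborted = h2_frame(RST_STREAM, 0, 1, struct.pack(">I", 0x2))
```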
How so? It is not solved on the network level (due to its use of TCP), but it is solved on application layer: slow response to one request is not blocking other requests.
> However, Google is not one single company, but a collection of independent and distrusting companies.
This is an important thing to keep in mind when considering the behavior of any large company.
I have no idea what those vp's are up to and not much faith in their decisions.
Connect-Web: TypeScript library for calling RPC servers from web browsers
https://news.ycombinator.com/item?id=32345670
I’m curious if anyone knows how Google internally works around the lack of support for gRPC in the browser? Perhaps gRPC is not used for public APIs?
The lack of browser support in the protobuf and gRPC ecosystem was quite surprising and one of the biggest drawbacks noted by my team while evaluating various solutions.
For the best browser-side performance, usually you want to use browser's native JSON.parse() API call and this doesn't really let you use unmodified protobufs. In particular, you can't use 64-bit ints since that's not a native JavaScript type. Meanwhile, server-side folks will use 64-bit ints routinely. So if the server-side folks decided on 64-bit integer ID's, you need workarounds like encoding them as strings on the wire.
JavaScript has BigInt now, but still doesn't natively support decoding 64-bit integers from JSON.
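The precision loss is easy to demonstrate: Python floats are the same IEEE 754 doubles that `JSON.parse` produces for every number, so both the failure mode and the string workaround can be sketched directly.

```python
import json

# Any int64 id above 2**53 can silently lose precision in a double.
big_id = 2**53 + 1                      # a perfectly ordinary 64-bit id
assert float(big_id) == float(2**53)    # the +1 vanishes when forced through a double

# The usual workaround: put the id on the wire as a string instead.
wire = json.dumps({"id": str(big_id)})
```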
It didn't seem like the gRPC folks understood the needs of web developers very well.
> It didn't seem like the gRPC folks understood the needs of web developers very well.
Agreed. Being fair to the team that designed the protocol, though, it seems like browsers weren't in scope at the time.
[0]: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-WEB.md
- sending down Server-Timing values for processing done after the headers are sent
- updating the response status or redirecting after the headers are sent
- deciding whether a response is cacheable after you've finished generating it
All of these except the first one obviously break assumptions about HTTP and I'm not surprised they're unsupported. Firefox [1] actually supports the first case. The rest have workarounds, you can do a meta-refresh or a JS redirect, and you could simply not stream cacheable pages (assuming they'd generally be served from cache anyway).
But it's still the case that frontend code generally likes to throw errors and trigger redirects in the course of rendering, rather than performing all that validation up front. That's sensible when you're rendering in a browser, but makes it hard to stream stuff with meaningful status codes.
That said, I didn't follow the central argument - that you need HTTP trailers to detect incomplete protobuf messages. What's not mentioned in the blog post is that gRPC wraps every protobuf message in a 5-byte envelope, and the bulk of the envelope is devoted to specifying the length of the enclosed message. It's easy to detect prematurely terminated messages, because they don't contain the promised number of bytes. The author says, "[i]t’s not hard to imagine that trailers would be less of an issue, if the default encoding was JSON," because JSON objects are explicitly terminated by a closing } - but it seems to me that envelopes solve that problem neatly.
With incomplete message detection handled, we're left looking for some mechanism to detect streams that prematurely terminate at a message boundary. (This is more likely than you might expect, since servers often crash at message boundaries.) In practice, gRPC implementations already buffer responses to unary RPCs. It's therefore easy to use the standard HTTP Content-Length header for unary responses. This covers the vast majority of RPCs with a simple, uncontroversial approach. Streaming responses do need some trailer-like mechanism, but not to detect premature termination - as long as we're restricting ourselves to HTTP/2, cleanly terminated streams always end with a frame with the end of stream bit set. Streaming does need some trailer-like mechanism to send the details of any errors that occur mid-stream, but there's no need to use HTTP trailers. As the author hints, there's some unused space in the message envelope - we can use one bit to flag the last message in the stream and use it for the end-of-stream metadata. This is, more or less, what the gRPC-Web protocol is. (Admittedly, it's probably a bad idea to rely on _every_ HTTP server and proxy on the internet handling premature termination correctly. We need some sort of trailer-like construct anyways, and the fact that it also improves robustness is a nice extra benefit.)
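A minimal sketch of that detection, assuming the 5-byte envelope described above (the helper name is hypothetical): a receiver only needs to walk the length prefixes to see whether the stream ended mid-message.

```python
import struct

def check_complete(stream: bytes) -> bool:
    """Return True iff `stream` contains only whole length-prefixed messages."""
    i = 0
    while i < len(stream):
        if i + 5 > len(stream):
            return False                  # cut off inside an envelope header
        _, length = struct.unpack_from(">BI", stream, i)
        i += 5 + length
    return i == len(stream)               # last payload fully present

msg = struct.pack(">BI", 0, 4) + b"abcd"
```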
So from the outside, it doesn't seem like trailers improve the robustness of most RPCs. Instead, it seems like the gRPC protocol prioritizes some abstract notion of cleanliness over simplicity in practice: by using the same wire protocol for unary and streaming RPCs, everyday request-response workloads take on all the complexity of streaming. Even for streaming responses, the practical difficulties of working with HTTP trailers have also been apparent for years; I'm shocked that more of the gRPC ecosystem hasn't followed .NET's lead and integrated gRPC-Web support into servers. (If I had to guess, it's difficult because many of Google's gRPC implementations include their own HTTP/2 transport - adding HTTP/1.1 support is a tremendous expansion in scope. Presumably the same applies to HTTP/3, once it's finalized.)
Again, though, I appreciated the inside look into the gRPC team's thinking. It takes courage to discuss the imperfections of your own work, especially when your former coworkers are still supporting the project. gRPC is far from perfect, but the engineers working on it are clearly skilled, experienced, and generally decent people. Hats off to the author - personally, I hope to someday write code influential enough that a retrospective makes the front page of HN :)
0: https://twitter.com/CarlMastrangelo/status/15322565762742435...
Java had an experimental implementation that was abandoned.
If Google were using gRPC web internally, typescript and Java support would be first class.
Can one use nginx in front of a grpc serving backend if the client is a JS client in the broadest sense?
This unanswered question is the main reason I'm still doing RESTful JSON.
A service app for example can open 1000 sockets with a server and simply multiplex that way.
My experience in the early days of gRPC is that they seemed fairly unwilling to consider any need for an easy upgrade path for existing people using HTTP/1.1 at all.
The author touches on this at the end:
> Focus on customers. Despite locking horns with other orgs, our team had a more critical problem: we didn’t listen to early customer feedback.
I'm glad they realise it now because lots of us warned them about this at the time.
> gRPC was reared by two parents trying to solve similar problems:
> 1. The Stubby team. They had just begun the next iteration of their RPC system...
> 2. The API team. ... serving (all) public APIs at Google ... [this is not said explicitly but presumably the vast majority of API clients are web based]
As you say, gRPC is very popular for server-to-server messaging, but I suppose it can never be an API solution. So even if gRPC is successful in general, it was not successful at its original goal (as far as this author is concerned).