The internet has become far more important than it was when these protocols first became standard, and every time a protocol or standard is up for debate, political and commercial forces try to influence it in their favor. Some of the concepts they tried to shove into IPv6 were downright evil and would have killed the internet as we know it. Personally, I'm relieved that all that is left is a small, un-sexy improvement which, albeit slowly, will eventually spread and solve the only really critical problem we have with IPv4.
I really dread subjecting HTTP to that process. Although I fully agree with the author's critique of cookies for instance, the idea of replacing them with something "better" frankly scares the crap out of me. Especially when the word "identity" is being used. You just know what kind of suggestions some powerful parties will come up with if you open this up for debate, and fighting that will take up all of the energy that should be put towards improving what we already have.
As techies we should learn to accept design flaws and slow adoption and look at the bigger picture of the social and political impact of technology: HTTP may be flawed, but things could be way, way worse.
After all, is any of his technical advice invalid due to political concerns that are not wild speculation on your part?
----
Not to say that politics doesn't enter into it, just that it should be brought to the table and discussed by other actors. And those actors should probably be all ears about the technical issues.
How so? You seem to be distinguishing between political actors (politicians?) and technical actors.
In a democracy it is not just important but essential that ALL have their say on policy, not just "political experts".
He throws out a lot of criticism about SPDY being haphazardly designed (with no explanation), then we find out that really he has an axe to grind over cookies and SSL.
I call bullshit on the whole post. I found nothing useful in it. I almost fell for the HTTP router bit, but again he offers no more than vague criticisms. If SPDY hasn't been a problem at Google and Facebook for load balancers, SPDY isn't badly designed for load-balancer implementation. It leads me to believe that his real issue is that Varnish must have been coded in such a way as to make it hard to support SPDY. Or perhaps the author's real beef with SPDY is that he didn't design it.
http://lists.w3.org/Archives/Public/ietf-http-wg/2012JulSep/...
Oh? Got an example? I've never heard of this (but don't really follow IPv6 stuff).
This is currently sometimes done by cookies, which makes life difficult for HTTP routers. He is proposing a mechanism to keep the identifying-part while getting rid of problems in the HTTP router layer. The way I read this, it seemed to be without introducing additional privacy concerns and in fact removing some. (Cookies can carry more than identity)
There is a detailed technical discussion to be had about implementing all of this, and in this discussion any privacy concerns would become visible and open for discussion. But I think it is a leap to say that the comments in TFA would necessarily make for a world bereft of privacy ;)
Ever heard of evercookie? Does that not scare you? Would creating a clean, well-understood solution that users can actually control not be better than what we have now?
There is just so much wrong with cookies, it's really surprising that no HTTP upgrades propose anything better. For one, cookies confuse session information and client-side storage, and thus work poorly in both roles.
> Especially when the word "identity" is being used. You just know what kind of suggestions some powerful parties will come up with if you open this up for debate, and fighting that will take up all of the energy that should be put towards improving what we already have.
Oh wow, I hadn't thought of that. Reading that critique I was just thinking "oooh doing away with cookies would be a great thing", slightly wondering what one could replace it with ... but you're right, they'd probably replace it with something extra plus plus scary.
The problem is that cookies live on your computer; they should be ephemeral (you can make them ephemeral, but that is not the standard behavior).
But then, yes, Facebook and Google and even Governments will try to know everything about you.
It comes with the caveat: "Please disregard any strangeness in the boilerplate, I may not have thrown all the right spells at xml2rfc, and also note that I have subsequently changed my mind on certain subjects, most notably Cookies which should simply be exterminated from HTTP/2.0, and replaced with a stable session/identity concept which does not make it possible or necessary for servers to store data on the clients."
What do you think is more likely going to be adopted? A protocol that's not backwards compatible at all (heck, it even throws out cookies) or something that works over the existing protocol, negotiating extended support and then switching to that while continuing to work the exact same way for both old clients and the applications running behind it?
See SPDY which is a candidate for becoming HTTP 2.0. People are ALREADY running that or at least eager to try it. I don't think for a second that SPDY is having the adoption problems of ipv6, SNI issues aside.
Even if native sessions would be a cool feature, how many years do you believe it will take before something like that can be reliably used? We're still wary of supporting stuff that an 11-year-old browser didn't support.
Google being able to modify both the client (Chrome) as well as few fairly significant server installations has kind of helped there a little bit...
What SNI issues? In practice any client that supports SPDY is going to support SNI.
So you either only provide SSL+SPDY for browsers you know support SNI, or you don't provide either SSL or SPDY to all of the browsers.
1. Working code. 2. Publicity. 3. Ubiquity.
That's it. Kamp is making a lot of the right noises here, but he's already lost ground to SPDY just because they've shipped code. No amount of sitting round tables bashing out the finer details of a better spec will help as much as getting code written - even if it's just a placeholder for an extensible spec, as long as that placeholder does something useful.
Is SPDY good enough now and fixable enough in the future to leverage this gained ground?
As more and more people deploy SPDY, they will come to understand its problems, but by then there will be almost no chance to change it, unlike at the beginning, when Google could change it freely (they went through one big change and many small changes in the protocol). When the authors of the main servers (Varnish, Apache, nginx) start feeling its limits, will they have to keep it around just for the sake of compatibility?
Please note that SPDY on the server is not a requirement, just an opportunity. Nothing will change for your users if you want to remove support for SPDY from your server after you have deployed and used it for some time. There are no "http+spdy://example.org" URLs around, so supporting HTTP only will always be sufficient. Maybe not as performant as SPDY, but 100% supported.
While I can completely agree with the technical merits of this proposal, there are some very two-faced statements.
The author begins by pointing out the painfulness of the IPv4-to-IPv6 transition and says that the next HTTP upgrade should be humble, but then proceeds to kill cookies and remove all the architectural problems in HTTP. Isn't that the same thing IPv6 did? Wouldn't such an approach produce the same amount of pain for the implementors (that is, us, the web developers)?
Any upgrade will certainly have some backward-incompatible changes. But if it is totally backward incompatible, I don't understand why it still needs to be called HTTP. Couldn't we just call it SPDY v2 instead, or some other fancy name?
Cookies are a problem. But the safest way to solve that problem is in isolation. Try to come up with some separate protocol extension, see if it works out, throw it away if it doesn't. But why marry the entire future of HTTP with such a do-or-die change?
I blindly agree with the author that SPDY is architecturally flawed. But why is it being adopted so widely? Even Facebook (deeply at war with Google) is embracing it. It's because SPDY doesn't break existing applications. Just install mod_spdy to get started. But removing cookies? What happens to the millions of web apps deployed today, which have $COOKIE and set_cookie statements everywhere in the code? How do I branch them out and serve separate versions of the same application, one for HTTP/1.1 and another for HTTP/2.0?
More doubts keep coming... Problem with SPDY compressing HTTP headers? Use SPDY only for communication over the internet. Within the server's data center, or within the client's organization, keep serving normal HTTP - there are no bandwidth problems there. Just make Varnish and the target server speak via SPDY; that is where the real gains are.
I could go on. I'm not trying to say that the author's suggestions are wrong. They are important and technically good. But the way they should be taken up and implemented, without pain to us developers, doesn't have to be HTTP/2.0. Good ideas don't need to be forced down others' throats.
One example was multihoming (having more than one ISP). Several smart proposals were floated (anycast, nearcast, etc.), but they were killed by ISPs protecting a lucrative business.
If IPv6 had made multi-ISP multihoming possible without all the trouble of BGP, businesses would have killed to get it back in the late 1990s.
Cookies only disappear from the wire, they are trivial to simulate on your server (see my other reply here).
Yeah, I used to think that, then I participated in some IPv6 conversions and watched some others. I don't think that any more. IPv6 may not be the Glorious Solution to All Network Problems Ever, but it's not just the obvious incremental improvement on IPv4 either. It's a new protocol.
(I do sometimes wonder if an IPv4.1 that simply set a flag and used 8 bytes instead of 4 was proposed right now if it could still beat IPv6 out to the field even with IPv6's head start. Note, I'm not saying this would necessarily be a good idea, I just find myself wondering if IPv4.1 could still hypothetically beat IPv6 to deployment.)
That's not good enough. The big problem is that you need the headers to properly route the request to the correct server. So for most operations, there will have to be one machine that is capable of reading all the headers of all the requests that arrive. gzipping the headers makes the job of this machine much, much harder.
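To make the asymmetry concrete, here is a toy sketch (hypothetical router logic, invented request bytes): a plaintext router can stop reading at the Host header, while a router facing deflate-compressed headers must inflate the whole block, and keep the connection's compression state around, before it can make any routing decision.

```python
# Toy sketch (hypothetical router logic, invented request bytes): with
# plaintext HTTP/1.1 a router can route after scanning only as far as the
# Host header; with deflate-compressed headers it must inflate the whole
# block first, and keep the connection's shared compression state around.
import zlib

def route_plaintext(raw: bytes) -> str:
    """Scan header lines and stop as soon as Host is found."""
    for line in raw.split(b"\r\n"):
        if line.lower().startswith(b"host:"):
            return line.split(b":", 1)[1].strip().decode()
        if line == b"":  # end of headers, no Host seen
            break
    raise ValueError("no Host header")

def route_compressed(block: bytes, decompressor) -> str:
    """No early exit here: the entire header block must be inflated
    (with state carried across requests on the connection) before the
    router can even look for Host."""
    return route_plaintext(decompressor.decompress(block))

req = b"GET / HTTP/1.1\r\nHost: example.org\r\nUser-Agent: demo\r\n\r\n"
comp = zlib.compressobj()
blob = comp.compress(req) + comp.flush(zlib.Z_SYNC_FLUSH)
print(route_plaintext(req))                          # example.org
print(route_compressed(blob, zlib.decompressobj()))  # example.org
```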
I disagree with the stab he takes at cookie-sessions here, though. He seems to ignore that sessions are not only about identity but also about state.
Servers should be stateless; therefore client-sessions (crypt+signed) are usually preferable to server-sessions.
Having a few more bytes of cookie-payload is normally an order of magnitude cheaper (in terms of latency) than performing the respective lookups server-side for every request. Very low bandwidth links might disagree, but that's a corner-case and with cookies we always have the choice.
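As a concrete illustration of the "crypt+signed" client-session pattern mentioned above, here is a minimal sketch. The secret, the token layout, and the field names are all invented for illustration; a real implementation would also encrypt the payload and include an expiry.

```python
# Minimal sketch of a signed client-side session. SECRET, the token
# layout and the field names are invented for illustration; a real
# implementation would also encrypt the payload and include an expiry.
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # hypothetical key, never sent to clients

def issue(session: dict) -> str:
    """Serialize the state and append an HMAC so the server can trust it."""
    body = base64.urlsafe_b64encode(json.dumps(session).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify(token: str) -> dict:
    """Reject any token whose signature does not match; no server lookup."""
    body, sig = token.rsplit(".", 1)
    expect = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expect):
        raise ValueError("tampered session")
    return json.loads(base64.urlsafe_b64decode(body))

token = issue({"user": 42, "cart": ["sku-1"]})
print(verify(token))  # the server trusts the state without any storage
```

The server performs one cheap HMAC check per request instead of a session-store lookup, which is exactly the latency trade-off described above.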
Removing cookies in favor of a "client-id" would effectively remove the session-pattern that has proven optimal for the vast majority of websites.
Servers storing stuff on the clients is just plain wrong, and it is wrong from every single angle you can view it from: it's wrong from a privacy point of view, it's wrong from a cost-allocation point of view, it's wrong from an architecture point of view, and it's wrong from a protocol point of view.
But it was a quick hack to add to HTTP in a hurry back in the dotcom days.
It must die now.
But this does not imply that the clients should store no state - it only implies that the state as perceived by the users needs to be the same. Different from how it is implemented.
While we are on the topic of state: why do I have to subscribe to a service just to be able to add a bookmark on one device and use it on another?
I view the two problems as congruent (except the bookmarks state is global, thus there is no "server" to offload the state onto) - but at the same time this difference highlights the assumption that there is "The Server" for the web app. What if there weren't? Can we push the model a bit further and make it p2p? I am pretty sure that as homomorphic crypto advances, we will be able to do so even for untrusted peers. Then there's no "server" anymore to store the state in.
Then there's the DoS angle. The author correctly notes that HTTP routers are the most loaded and hardest-to-scale element of the whole setup. If you offload the state onto the client, then you can "dumb down" the non-initial content-switching decision by basing it on trustable client state.
So I think that distributing the state is a good idea. What is limiting is distributing it naively - and this is where I agree with your assessment. That's probably one of the things that would need to get fixed for something big enough to be called "2.0". (As a by-product, solving the above would also solve the endpoint identity/address-change survivability problem.)
That's the opposite of the general consensus in the webdev-community.
Client-state is not only vastly more efficient in many cases but it also usually leads to cleaner designs and easier scaling.
Many of the modern desktop-like webapps would be outright infeasible without client-state. What's your response to that, should we just refrain from making such apps in the browser?
If I add something to my shopping basket from my mobile phone, I want to be able to add more from my browser
And at the same time you probably appreciate when on your slow mobile-link the "add-basket" operation happens asynchronously, yet doesn't get lost when you refresh the page at the wrong moment.
I'm a bit confused here. You know better than most how critical latency is to the user-experience. Saving on server-roundtrips or hiding them is a big deal.
Yet you promote this dogma without providing an alternative solution to this dilemma.
What's your opinion on IndexedDB and other local storage mechanisms? I believe that single-page-apps are overused, but I do think that they have their niche and standards for storing data locally are valuable and necessary. In my own work I'd use that space as a cache rather than permanent storage, just like I'd use something like memcached on the server side to reduce database queries.
Also, consider dabblet. The way it allows you to store your stuff using github is very smart IMHO.
Not that it is surprising given the source, but this "my opinion is objectively correct" nonsense isn't constructive. Client side sessions give you stateless servers, which allows real seamless fail-over. Having to run a HA session-storage service to get that is a big additional cost. "PHK said it is right" doesn't provide sufficient benefits to overcome that downside.
> Or I might add, how HTTP replaced GOPHER[3].
telnet and gopher were used by only a few thousand servers and were not consumer-facing technologies (for the most part); it doesn't make sense to compare that to IPv4 and HTTP, which are used by millions (billions?) of servers.
The card catalogs at most university libraries and most libraries of any national or international importance were reachable by telnet in 1992. And I think card catalogs count as a "consumer-facing" service.
The vast majority of internet client software in 1992 was text only. The first exception to this that I am aware of is the WWW, which most internet users had not started to use by the end of 1992 (email, newsgroups and ftp being the most widely used services). The way most connected to the internet or an intranet from home was by sending vt100 or similar protocol over a dial-up link -- with a Unix shell account or VMS account at the other end of the link. Repeating myself for emphasis: in 1992 most people accessing the internet from home or from a small office used a modem and IP packets did not pass over that modem link. The point of this long paragraph is that the vast majority of the machines on which these shell accounts ran were also reachable by telnet.
Finally, the telnet protocol in 1992 was a "general-purpose adapter" similar to how HTTP is one today. For example, the first web page I ever visited I visited through a telnet-HTTP gateway so that I could get a taste of the new WWW thing without having to install a WWW browser. Note that this telnet-HTTP gateway is another example of a "consumer-facing" telnet server.
In summary, there were probably more than a few thousand telnet servers in 1992 -- and many of them were "consumer-facing".
I am almost certain there were a few million users (certainly so if we include college students who used it for a semester or so, then stopped) of the internet in 1992, and most of those users used telnet.
I can't speak for Gopher, but I routinely see telnet and rsh all around the industry, where anyone with control of some Windows machine on the network can sniff critical PLC and server passwords - even when SSH is available for the servers. It is hopefully a changing situation as servers get replaced/upgraded and SSH gets more and more pervasive.
In 20 years' time, I'm pretty sure that the next iteration of us will be saying something like "When they phased out HTTP, there were only a few billion servers..."
Better will always make its way in, even if there are entrenched systems running the 'old faithful' code already out there. IE6 is being phased out, yes, VERY slowly, but we're already way closer to getting it down to an irrelevant number than we would be if there weren't pushes. Will we ever get rid of it completely? Maybe not. I'm sure there's still a gopher server out there somewhere or another, and it's not that uncommon to get Telnet access to some commodity (crappy) web hosts, but SSH is pervasive and good, and we're all better off for it.
Governments and home grown enterprise apps are my guess about who's late to the party.
> Cookies are, as the EU commission correctly noted, fundamentally flawed, because they store potentially sensitive information on whatever computer the user happens to use, and as a result of various abuses and incompetences, the EU felt compelled to legislate a "notice and announce" policy for HTTP-cookies.
> But it doesn't stop there: The information stored in cookies has potentially very high value for the HTTP server, and because the server has no control over the integrity of the storage, we are now seeing cookies being crypto-signed to prevent forgeries.
Anyone with a grain of skill is capable of using cookies as identifiers only; it's hard to see what cookies vs identifiers has to do with "notice and announce" or security. An explicit session mechanism could provide benefits over using cookies for the same purpose, but what exactly would removing cookies achieve other than breaking the world?
Unfortunately such people are evidently few and far between.
Banning cookies and having the client offer a session identifier instead solves many problems.
For starters, it stores the data where it belongs: On the server, putting the cost of storage and protection where it belongs too.
This is a win for privacy, as you will know if you have ever taken the time to actually examine the cookies on your own machine.
Second, it allows the client+user to decide if it will issue anonymous (ie: ever-changing) session identifiers, as a public PC in a library should do, or issue a stable user-specific session-id, to get the convenience of being recognized by the server without constant re-authorization.
Today users don't have that choice, since they have no realistic way of knowing which cookies belong to a particular website due to 3rd-party cookies, image-domain splitting, etc.
Network-wise, we eliminate a lot of bytes to send and receive.
One of the major improvements SPDY has shown is getting the entire request into one packet (by deflating all the headers).
But the only reason HTTP requests don't fit in a single packet to begin with is cookies, get rid of cookies, and almost all requests fit inside the first MTU.
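A quick back-of-envelope check of that claim, with every header value invented for illustration: a plausible cookie-free request sits far below the typical TCP payload of a 1500-byte Ethernet MTU, while a handful of fat tracking cookies pushes it over.

```python
# Back-of-envelope check of the claim above; every header value is
# invented for illustration. 1460 bytes is the usual TCP payload of a
# 1500-byte Ethernet MTU (minus 20 bytes IP and 20 bytes TCP header).
MSS = 1460

base = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.org\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101\r\n"
    "Accept: text/html,application/xhtml+xml\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "\r\n"
)
# Five fat tracking cookies of ~300 bytes each, a pattern ad networks use.
cookies = "Cookie: " + "; ".join(f"tracker{i}=" + "x" * 300 for i in range(5)) + "\r\n"
with_cookies = base.replace("\r\n\r\n", "\r\n" + cookies + "\r\n")

print(len(base), len(base) <= MSS)                  # fits in one segment
print(len(with_cookies), len(with_cookies) <= MSS)  # cookies blow the budget
```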
Finally, eliminating cookies improves caching opportunities, which will help both the client and server side get a better web experience.
As for breaking the world: It won't happen.
It is trivial to write a module for apache which simulates cookies for old HTTP/1 web-apps: Simply store/look up the cookies in a local database table, indexed by the session-id the client provided.
I'm sure sysadmins will have concerns about the size of that table, but that is an improvement, today the cost is borne by the web-users.
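Such a shim really would be small. Here is a toy WSGI version of the idea (the "Session-Id" request header is hypothetical, since nothing like it is standardized): it replays a server-side cookie jar into each request and intercepts Set-Cookie on the way out.

```python
# Toy WSGI version of the shim described above. The "Session-Id" request
# header is hypothetical (nothing like it is standardized); the jar is an
# in-memory dict standing in for the local database table.
jars: dict[str, dict[str, str]] = {}  # session-id -> stored cookies

class CookieSimulator:
    """Replays stored cookies into each request and captures Set-Cookie
    responses server-side, so the legacy app never notices the change."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        jar = jars.setdefault(environ.get("HTTP_SESSION_ID", ""), {})
        # Present the stored cookies to the legacy app as a Cookie header.
        environ["HTTP_COOKIE"] = "; ".join(f"{k}={v}" for k, v in jar.items())

        def capture(status, headers, exc_info=None):
            kept = []
            for name, value in headers:
                if name.lower() == "set-cookie":
                    k, _, v = value.partition("=")
                    jar[k] = v.split(";", 1)[0]  # store instead of sending
                else:
                    kept.append((name, value))
            return start_response(status, kept, exc_info)

        return self.app(environ, capture)
```

Wrapped around a legacy app, the first request stores whatever the app tried to Set-Cookie, and later requests carrying the same session-id see those cookies replayed.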
Most of the cookies I've seen are some kind of hash.
> Second, it allows the client+user to decide if it will issue anonymous (ie: ever-changing) session identifiers, as a public PC in a library should do, or issue a stable user-specific session-id, to get the convenience of being recognized by the server without constant re-authorization.
> Today users don't have that choice, since they have no realistic way of knowing which cookies belongs to a particular website due to 3rd-party cookies and image-domain splitting etc.
I don't see how this makes sense - what's the difference?
Assuming that the session identifier is different between sites (if it's not, then the user has no option to "remove cookies" for a single domain without deauthenticating everywhere, and it's harder to determine which sites are tracking you):
- There will still be third party domains involved, since advertisers will still want to correlate traffic between domains;
- Sending a new session identifier with every request won't be practical, because you won't be able to log in, but users will be able to set their browsers to send a new identifier when the window is closed or whatever... just as they could currently configure their browser to clear cookies at that time.
Also, anyone who wants to abuse cookies can just use localStorage.
> But the only reason HTTP requests don't fit in a single packet to begin with is cookies, get rid of cookies, and almost all requests fit inside the first MTU.
Surely it's still useful to deflate things (user-agent...), though, and then what does it matter?
> Finally, eliminating cookies improve caching opportunities, which will help both client and server side get a better web experience.
How so? The server is perfectly justified in sending different content based on the session identifier, so wouldn't a proxy have to assume it would?
But if you want to say the result doesn't depend on cookies, can't you just set a Vary header?
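The mechanics behind that question can be sketched in a few lines: a shared cache must include every header named in Vary in its cache key, so "Vary: Cookie" with per-user cookies means one cache entry per user (the cache-key function here is a toy, not any real proxy's logic).

```python
# Toy cache-key computation showing the trade-off: every header named in
# Vary becomes part of the key, so "Vary: Cookie" with per-user cookies
# turns a shared cache entry into one entry per user.
def cache_key(url: str, request_headers: dict, vary: list[str]) -> tuple:
    """Key = URL plus the request's values for each varied header."""
    return (url,) + tuple(
        (h, request_headers.get(h, "")) for h in sorted(v.lower() for v in vary)
    )

alice = {"cookie": "sid=alice"}
bob = {"cookie": "sid=bob"}

# Vary: Cookie -> two users, two cache entries for the same image.
print(cache_key("/logo.png", alice, ["Cookie"]) ==
      cache_key("/logo.png", bob, ["Cookie"]))  # False

# No Vary: Cookie (or no cookies at all) -> one shared entry.
print(cache_key("/logo.png", alice, []) ==
      cache_key("/logo.png", bob, []))  # True
```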
> It is trivial to write a module for apache which simulates cookies for old HTTP/1 web-apps: Simply store/look up the cookies in a local database table, indexed by the session-id the client provided.
Eh... okay. This still breaks anything that uses JavaScript to interact with the cookies.
(I work on mod_pagespeed, and our experimental framework uses cookies this way: https://developers.google.com/speed/docs/mod_pagespeed/modul...)
Cookies suck, from a technical and regulatory-compliance standpoint. Plus, I'll finally stop having to clear my cookies every month or so just to log in to my PayPal and American Express accounts. Both sites keep creating unique cookies on every login until there are so many that they pass their own web servers' max header length limits.
Because the only benefit of removing cookies is a tiny bit of simplicity which could theoretically allow removing (a small amount of) code browsers will already have to keep around for probably at least a decade to support existing websites. If cookies are mostly unused by the time HTTP/3.x rolls around, we can talk...
> Cookies suck, from a technical
Agreed, but...
> and regulatory-compliance standpoint.
I don't understand this point. Surely the need for regulation of user tracking by websites doesn't depend on whether cookies or an equivalent mechanism are being used? If people start using Not Cookies(tm), they will be unregulated at first, but the law will be changed if the effect is the same.
Edit: Similarly, any protocol that gives a website a persistent identity token without its explicitly requesting one is a bad idea - cookies do provide a modicum of visibility to the user regarding who's tracking them. Not sure exactly what Kamp is proposing.
> Plus, I'll finally stop having to clear my cookies every month or so just to log in to my PayPal and American Express accounts. Both sites keep creating unique cookies on every login until there are so many that they pass their own web servers' max header length limits.
Hah, no you won't. I strongly suspect legacy codebases will remain on HTTP/1.1 approximately forever, at least if 2.0 is backwards incompatible.
The initial line will remain the same, except for the version:
GET /page HTTP/2.0
*** extra 2.0 headers/request ***
If the server speaks 2.0, it will just carry on. If it doesn't, the server will return a 505 and the client will resubmit the request:
GET /page HTTP/2.0
505 HTTP Version Not Supported
GET /page HTTP/1.1
*** 1.1 headers / request ***
There is no reason the protocols must be backwards compatible past the first line. Hell, 2.0 could even be binary after that first line. So while they don't have to be compatible, they can still coexist.
For something as fundamental as HTTP, the author argues changes need to be radical to drive adoption, but at the same time there's not necessarily widespread impetus to do so if the burden is too high. This is engineering on a 15-year time-scale, which I feel a little young (at 24) to comprehend well!
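That downgrade dance can be sketched from the client side with raw sockets. The toy server below exists only so the example runs; the exact 505-then-retry behaviour is an assumption taken from the comment above, not from any deployed HTTP/2.0 server.

```python
# Rough client-side sketch of the downgrade dance. The toy server exists
# only so the example runs; the exact 505 behaviour is an assumption
# taken from the comment above, not from any deployed HTTP/2.0 server.
import socket
import threading

def fetch(host: str, path: str, port: int) -> bytes:
    """Try HTTP/2.0 first; on a 505, reconnect and retry with HTTP/1.1."""
    for version in ("HTTP/2.0", "HTTP/1.1"):
        with socket.create_connection((host, port)) as s:
            s.sendall(f"GET {path} {version}\r\nHost: {host}\r\n\r\n".encode())
            reply = s.recv(65536)
            if reply.split(b" ", 2)[1] != b"505":  # server spoke this version
                return reply
    raise RuntimeError("no mutually supported HTTP version")

def _serve(sock):
    """Toy 1.1-only server: rejects 2.0 requests with a 505."""
    while True:
        conn, _ = sock.accept()
        data = conn.recv(4096)
        if b" HTTP/2.0" in data:
            conn.sendall(b"HTTP/1.1 505 HTTP Version Not Supported\r\n\r\n")
        else:
            conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen()
threading.Thread(target=_serve, args=(srv,), daemon=True).start()

reply = fetch("127.0.0.1", "/page", srv.getsockname()[1])
print(reply.split(b"\r\n")[0].decode())  # HTTP/1.1 200 OK
```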
It's not just apps in the web-app sense, but user agents (all the way down to embedded systems) that would need changing to take advantage of the 'sufficient benefits'. That's a pretty massive undertaking.
What actually is the proposal to eliminate cookies? Just provide some fixed "identifier" type field?
Heh, I don't think any reasonable web apps actually depend on the value of those :)
> What actually is the proposal to eliminate cookies? Just provide some fixed "identifier" type field?
Unfortunately, I don't think there is a concrete proposal to compare to, other than:

> Given how almost universal the "session" concept on the Internet we should add it to the HTTP/2.0 standard, and make it available for HTTP routers to use as a "flow-label" for routing.

Second, there are perfectly valid legally mandated circumstances which forbid end-to-end privacy, from children in schools to inmates in jail and patients in psych hospitals, not to mention corporate firewalls and the monster that looks out for classified docs not leaking out of the CIA.
(HTTP 1 would be sufficient for these use cases for a long time to come, but resource considerations should factor into these standards discussions)
Sure all that stuff has become semi-standard as it currently exists, but it is ugly, hacky, and sometimes doesn't work, and other times opens doors for hilarious malfeasance.
This is an argument for websockets eating protocol lunch.
I wonder what the ideal web protocol would look like. If, for example, we didn't have a burden of billions of servers and Internet reliance on HTTP/1.x protocol.
What would be the ideal solution to suit emerging use cases for the Web? Are there any research papers on this topic?
If your goal is only to route HTTP requests, then you're solving only the first step of an increasingly complicated field of computer science (namely, web applications).
Cookies aren't going to go away. If you want to improve the protocol to deal with cookies better, that makes sense, but acting like they are some kind of evil on the internet that should be forgotten isn't going to work. It's a bit self-defeating to argue that some protocols failed because they provided no new benefits, and then argue against cookies in HTTP!
I think Poul-Henning Kamp is fairly well qualified to discuss what load balancers do.
(And yes, load balancers are fundamentally HTTP routers. Yes, sometimes they do content manipulation etc., but all those features are add-ons to the basic use-case.)
The proxy developers have always had their doubts about SPDY (you can see them when @mnot first proposed it as a starting point)
Terminating SPDY at the HTTP Router makes a lot of sense architecturally but I know some orgs don't terminate SSL at the load-balancer due to the licensing costs.
Ultimately we need load-balancing options and someone to develop the opensource proxies (haproxy, varnish etc.) into more sophisticated offerings.
Perhaps there'll be a SPDY module for Traffic Server.
If you wanna store stuff, there's HTML5. Cookies are really just tracking and session.
As for the router, what he says makes complete sense, but, if there is more to it, then, what are you thinking about? Personally I think the host header is the most important thing to parse, for, well, routing, termination, etc. I'm not certain what else is needed beyond that point.
It's not about killing cookies, it's about finding a better solution that solves the problem better.
Note: I work for a startup that "benefits financially" from tracking users—a feature without which we would not have a business.