I think this is sort of ignoring the whole point of the proposal. By making sites request this information rather than simply always sending it like the User-Agent header currently does, browsers gain the ability to deny excessively intrusive requests when they occur.
That is to say, "sites routinely request all of it" is precisely the problem this proposal is intended to solve.
There are some good points in this post about things which can be improved with specific Sec-CH-UA headers, but the overall position seems to be based on a failed understanding of the purpose of client hints.
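To make the "request rather than always send" point concrete, here is a minimal sketch of the browser side. The header names are the real Accept-CH and Sec-CH-UA-* names, but the policy set, the sample values, and the choice of which hints count as always-sent defaults are assumptions for illustration, not how any particular browser implements it.

```python
# Hypothetical sketch: a browser deciding which requested client hints to send.
# Assumption: these two low-entropy hints are sent without being asked for.
LOW_ENTROPY_DEFAULTS = {"sec-ch-ua", "sec-ch-ua-mobile"}


def parse_accept_ch(header_value: str) -> list[str]:
    """Split an Accept-CH value like 'Sec-CH-UA-Model, Sec-CH-UA-Arch' into lowercase names."""
    return [tok.strip().lower() for tok in header_value.split(",") if tok.strip()]


def hints_to_send(accept_ch: str, allowed: set[str], values: dict[str, str]) -> dict[str, str]:
    """Return hint headers for the next request: defaults plus requested hints the policy allows."""
    requested = set(parse_accept_ch(accept_ch))
    # A hint the site asked for is only granted if the (made-up) policy allows it.
    granted = LOW_ENTROPY_DEFAULTS | (requested & allowed)
    return {name: values[name] for name in sorted(granted) if name in values}
```

A site asking for `Sec-CH-UA-Model` gets nothing extra unless the browser's policy includes that hint, which is the denial ability described above.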
But Set-Cookie kind of proves what happens to that kind of feature. If sites first get used to being able to request it and get it, then browsers that deny anything will simply be ignored. And then those browsers will start providing everything, because they don't want to be left out in the cold.
That's what happened to User-Agent, that's what happened to Set-Cookie, and I can't see why it won't happen to Sec-CH-UA-*, which the post hints at several times. Set-Cookie was supposed to have the browser ask the user to confirm whether they wanted to set a cookie. Not many clients do that today.
To be honest, I feel the proposal is a bit naïve if it thinks that websites and all browsers will suddenly be on their best behaviour.
No worries, that's why we have laws to make the website do in the content what the browser no longer wants to do in the viewer. ;D
1. Move entropy from "you get it by default" to "you have to ask for it".
2. Add new APIs that allow you to do things that previously exposed a lot of entropy in a more private way.
3. Add a budget for the total amount of entropy a site is allowed to get for a user, preventing identifying users across sites through fingerprinting.
Client hints are part of step #1. They are not especially useful on their own, but when later combined with #3, sites have a strong incentive to reduce what they ask for to just what they need.
(Disclosure: I work on ads at Google, speaking only for myself)
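A rough sketch of how step #3 could interact with step #1, assuming a per-site budget measured in bits. The bit costs and the budget value are invented for illustration, and `EntropyBudget` is a hypothetical name, not part of any proposal.

```python
# Hypothetical privacy-budget sketch. The per-hint bit costs are made up.
ENTROPY_BITS = {
    "sec-ch-ua-mobile": 1.0,
    "sec-ch-ua-platform": 2.0,
    "sec-ch-ua-arch": 2.0,
    "sec-ch-ua-full-version": 4.0,
    "sec-ch-ua-model": 8.0,
}


class EntropyBudget:
    """Tracks how many bits of identifying information one site has been given."""

    def __init__(self, budget_bits: float) -> None:
        self.budget_bits = budget_bits
        self.spent_bits = 0.0
        self.granted: set[str] = set()

    def request(self, hint: str) -> bool:
        """Grant the hint if it fits in the remaining budget; repeats cost nothing extra."""
        if hint in self.granted:
            return True
        cost = ENTROPY_BITS.get(hint)
        # Unknown hints and hints that would exceed the budget are denied.
        if cost is None or self.spent_bits + cost > self.budget_bits:
            return False
        self.spent_bits += cost
        self.granted.add(hint)
        return True
```

With a budget like this, a site that spends its bits on the device model has nothing left for other signals, which is where the incentive to ask for less would come from.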
Of the two non-harmful pieces, one is of interest to all sites, and the other has a broken implementation in Chrome, so sites will have to use an alternative mechanism anyway. If there's any value in the idea, Google can propose it with a set of information that brings value, instead of just fingerprinting people.
So, no, it should be rejected. Entirely and severely. It doesn't mean that contextual headers are a bad practice, it's just that this one proposal is bad.
I'd love to see browser metrics absolutely devastated as an analytics source: today they are just used as an excuse to only support Chrome.
Browsers can just not send a UA header
If User Agent Client Hints become the new normal, I'm sure anyone excessively denying requests will be flagged in the same way.
However it is nice that there's now a separate header that gives a yes or no answer on whether it's a mobile device.
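For what it's worth, reading that header on the server is about as simple as it gets. A minimal sketch, assuming the headers arrive as a plain dict (the function name is hypothetical):

```python
from typing import Optional


def is_mobile(headers: dict[str, str]) -> Optional[bool]:
    """Read Sec-CH-UA-Mobile, a structured-header boolean: '?1' is true, '?0' is false.

    Returns None when the header is missing or malformed, so the caller can fall back.
    """
    for name, value in headers.items():
        if name.lower() == "sec-ch-ua-mobile":
            token = value.strip()
            if token == "?1":
                return True
            if token == "?0":
                return False
            return None  # present but not a valid boolean
    return None  # header not sent at all
```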
- The Mobile vs Desktop design differences are too great.
- The site was originally created without considering mobile, and retrofitting mobile support is unfeasible.
This is also true with respect to SNI which leaks the domain name in clear text on the wire. The popular browsers send it even when it is not required.
The forward proxy configuration I wrote distinguishes the sites (CDNs) that actually need SNI, and the proxy only sends it when required. The majority of websites submitted to HN do not need it. I also require TLSv1.3 and strip out unnecessary headers. It all works flawlessly with very few exceptions.
We could argue that sending as much unnecessary information as popular browsers do, when it is not technically necessary for the user, is user hostile. It is one-sided. "Tech" companies and others interested in online advertising have been using this data to their advantage for decades.
SNI is sent by the client in the initial part of the TLS handshake. If you don't send it, the server sends the wrong/bad cert. The client could retry the handshake using SNI to get the correct cert but:
- This adds an extra RTT, on the critical path of getting the base HTML, hurting performance.
- A MITM could send back an invalid cert, causing the browser to retry with SNI, leaking it anyway (since we aren't talking about TLS 1.3 and an encrypted SNI).
I suppose the client could maintain a list of sites that don't need SNI, like the HSTS preload list, but that seems like a ton of overhead to avoid sending unneeded SNI, especially when most DNS is unencrypted and would leak the hostname just like SNI anyways.
That list would be much larger than the list of sites that do require SNI.
Generally, I can determine whether SNI is required by IP address, i.e., whether it belongs to a CDN that requires SNI. Popular CDNs like AWS publish lists of their public IPs. I use TLSv1.3 plus ESNI with Cloudflare but they are currently the only CDN that supports it. Experimental but works great, IME.
The proxy maintains the list not the browser. The proxy is designed for this and can easily hold lists of 10s of 1000s of domains in memory. That's more domains than I visit in one day, week, month or year.
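A sketch of the decision described here, not the actual proxy configuration: check whether the server's IP falls inside a known "needs SNI" range and only then supply a server name. The ranges below are documentation placeholders, not real CDN addresses, and the function names are hypothetical.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks. A real list would be
# built from the IP ranges a CDN publishes. These are NOT actual CDN addresses.
SNI_REQUIRED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def needs_sni(server_ip: str) -> bool:
    """True if the address belongs to a network known to require SNI."""
    addr = ipaddress.ip_address(server_ip)
    return any(addr in net for net in SNI_REQUIRED_NETWORKS)


def server_name_for(hostname: str, server_ip: str):
    """Return the name to put in the ClientHello, or None to omit SNI."""
    return hostname if needs_sni(server_ip) else None
```

A membership check like this against a list held in memory is cheap, which fits the point that the proxy, not the browser, keeps the list.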
It is not a question of whether this is possible, or "how would this work". I have already implemented it. It works. It is not difficult to set up.
Why this works for me and would be unlikely to work for others:
I am not a heavy user of popular browsers; I "live on the command line". Installing a custom root certificate with appropriate SANs to suppress browser warnings is a nuisance that would likely dissuade others, since they are heavy users of those programs. However, I generally do not use those browsers to retrieve content from the web.
Honestly now - who drafts and approves these specs? Not only does it make no sense whatsoever to encode such information this way - it also results in unimaginable amounts of bandwidth going to complete waste, on a planetary scale.
This is just plain incompetence. How did we let the technology powering the web devolve into this burning pile of nonsense?
Approves: no one.
Chrome just releases them in stable versions with little to no discussion, and the actual specs remain in draft stages.
Edit: grammar
I mean sure http being plaintext is silly but that's not down to the authors of this particular rfc.
Are google.com, youtube.com, netflix.com, facebook.com, amazon.com and reddit.com going to ask for User Agent Client Hints? If they're going to (which is more than likely, let's not kid ourselves) - I don't see how your point holds?
> "Why/how does this waste bandwidth?"
Based on the current proposal - non-mobile browsers or browsers that simply do not wish to expose the specific model are somehow required to return the following header in response:
Sec-CH-UA-Model: ""
Those are 19 absolutely useless bytes. Wouldn't it make more sense to simply omit the header from the response altogether? It would convey the exact same information to the server ("my Sec-CH-UA-Model is empty"), without the overhead of sending additional data.
Possibly, but they only need them on initial request I think.
> It would convey the exact same information to the server ("my Sec-CH-UA-Model is empty"), without the overhead of sending additional data.
It doesn't convey the exact same info though. This says "the client is aware of this scheme and doesn't reply" vs. "the client is unaware of the RFC". It's the true/false/none issue.
In a sane world, there would be a much shorter way to encode that, but HTTP is a bad protocol, so you can't nest/namespace things or whatever.
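The true/false/none distinction can be sketched as a three-state check on the server. `ModelHint` and `classify_model_hint` are hypothetical names, and the last line just recomputes the 19-byte figure quoted above (header line only, not counting the line terminator).

```python
from enum import Enum


class ModelHint(Enum):
    ABSENT = "absent"    # header not sent: the client may not know the scheme at all
    EMPTY = "empty"      # header sent as "": the client knows the scheme, gives no model
    PRESENT = "present"  # header carries a model string


def classify_model_hint(headers: dict[str, str]) -> ModelHint:
    """Distinguish a missing Sec-CH-UA-Model header from an empty one."""
    lowered = {k.lower(): v for k, v in headers.items()}
    if "sec-ch-ua-model" not in lowered:
        return ModelHint.ABSENT
    value = lowered["sec-ch-ua-model"].strip()
    return ModelHint.EMPTY if value in ('""', "") else ModelHint.PRESENT


# The empty header line from the comment above, measured in bytes.
EMPTY_HEADER_BYTES = len('Sec-CH-UA-Model: ""'.encode("ascii"))
```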
Chrome came up with this? Figures. Stay evil, Google.
My secondary concern is that there would be more traffic going around the internet that isn't being used 99+% of the time.
1) JavaScript must be enabled. If it's not, then the server can't get any of the user agent data - at all.
2) The server won't get the user agent data until after it has already responded to the first request it receives from a client. That makes it a lot less useful overall. Having to load a page, then perhaps redirect the user using JS based on what the JS API says is a bit untidy.
It's one thing for the client to say "give me this resource in this format"; it's another for the server to say "oh, you're coming from version X.Y of OS Z, I know what you really want."
Serving a slightly different web app from the same URI based upon other random metadata, on the other hand, makes caching all the more complicated.
I mostly think the replacement for user agent should be a boolean of mobile or not mobile. And everything else should be dynamically handled by the client.
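On the caching point, here is a simplified sketch of why varying on more hints fragments a cache. It assumes a cache keyed on the URL plus whatever request headers the response lists in `Vary`. This is a toy model, not a full HTTP cache, and `cache_key` is a hypothetical helper.

```python
def cache_key(url: str, vary: str, request_headers: dict[str, str]) -> tuple:
    """Build a cache key from the URL plus the request-header values named in Vary."""
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    # Every header named in Vary becomes part of the key, so each distinct
    # value of that header needs its own cached copy of the same URL.
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)
```

Varying only on a mobile boolean gives at most a few variants per URL. Adding platform, version, or model multiplies the number of entries for the same URI.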
https://mozilla.github.io/standards-positions/#ua-client-hin...
Authors of new Client Hints are advised to carefully consider whether
they need to be able to be added by client-side content (e.g.,
scripts) or whether the Client Hints need to be exclusively set by
the user agent. In the latter case, the Sec- prefix on the header
field name has the effect of preventing scripts and other application
content from setting them in user agents. Using the "Sec-" prefix
signals to servers that the user agent -- and not application content
-- generated the values. See [FETCH] for more information.
As near as I can tell, the bit they're talking about in the Fetch standard is just this:
These are forbidden so the user agent remains in full control over them.
Names starting with `Sec-` are reserved to allow new headers to be minted
that are safe from APIs using fetch that allow control over headers by
developers, such as XMLHttpRequest.
But that's 100% a guess on my part.
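If that reading is right, the check a user agent applies to script-set headers would look roughly like this. The list of names is deliberately partial (the Fetch standard's list is longer), and `script_may_set` is a hypothetical name for illustration.

```python
# Partial list for illustration only; the Fetch standard forbids more names than these.
FORBIDDEN_NAMES = {"cookie", "host", "origin", "referer", "content-length"}
# Any header whose name starts with one of these prefixes is reserved for the user agent.
FORBIDDEN_PREFIXES = ("sec-", "proxy-")


def script_may_set(header_name: str) -> bool:
    """Return False for headers that page scripts are not allowed to set themselves."""
    name = header_name.strip().lower()
    return name not in FORBIDDEN_NAMES and not name.startswith(FORBIDDEN_PREFIXES)
```

Under this rule a script cannot forge `Sec-CH-UA-Model`, so a server seeing it can assume the browser itself produced the value.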
The assertion of Mozilla seems to be:
>At the time sites deploy a workaround, they can’t necessarily know what future browser version won’t have the need for the workaround. Can we guarantee only retrospective use? Do Web developers care enough about retrospective workarounds for evergreen browsers?
When there are significant numbers of users on devices like iPads that don't get updated any more, you can't rely on "evergreen browsers".
[0] - https://www.chromium.org/updates/same-site/incompatible-clie...
intentional?
Knowing the exact make and model of an Android device is a lot higher entropy than knowing the exact make and model of an iPhone.
That quote from the first comment on the issue is just a cherry on top.
Chrome 88 was released in December 2020. 7 months ago.
It's also a very good thing that Mozilla picked version 88. It had all the described problems and Chrome still shipped this draft spec with known issues enabled by default in the very next version.
v88 was the last version that had this behind a feature flag. Now that it's enabled by default, devs will rely on it and Chrome will refuse to change it because "once it's out we can't change it".
Good on Mozilla to call bullshit on Google (and not for the first time).