Just last month, a story premised on discovering AWS account IDs via buckets[0] did quite well on HN. The consensus established in the comments is that if you are relying on your account identifier being private as some form of security by obscurity, you are doing it wrong. The same concept applies here. This isn't a novel security issue, it's just another method of dorking.
In theory a 256-hex-character link (so 1024 bits) is near-infinitely more secure than a 32-character username and 32-character password, as guessing it means searching 2^1024 combinations. You'd never brute-force it
vs
https://site.com/[32chars] with a password of [32chars]
as there are only 2^256 combinations. Again you can't brute-force it, but it's far more likely than the 2^1024 case.
Imagine it's
https://site.com/[32chars][32chars] instead.
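The keyspace arithmetic above can be sanity-checked directly, assuming each hex character contributes 4 bits:

```python
# Each hex character encodes 4 bits, so a 256-hex-char path has 256 * 4 = 1024 bits.
url_bits = 256 * 4
assert 16 ** 256 == 2 ** url_bits  # 2^1024 possible links

# A 32-char username plus a 32-char password (same hex alphabet) is 64 chars total:
cred_bits = (32 + 32) * 4
assert 16 ** 64 == 2 ** cred_bits  # 2^256 possible credential pairs

# The long-link keyspace is larger by a factor of 2^768:
assert (2 ** url_bits) // (2 ** cred_bits) == 2 ** 768
```

The point being that both keyspaces are already far beyond brute force; the difference that matters in practice is how the secret is handled, not its size.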
But while guessing the former is harder than the latter, URLs leak a lot, far more than passwords.
The problem is the website administrators who are encoding authentication tokens into URL state, not the naive crawlers that find them.
How would you do that for the URLs? Five requests to site.com/[256chars] that all 404 get your IP blocked, because you don't have a real link? I guess the security relies on the fact that only a very small percentage of the total possible links are ever used? Though the likelihood of randomly guessing a link is the same as the percentage of addressable links used.
Did you do that just to upset me?
The one link won't be found quickly, but a bunch of links will. You just need to fetch all possibilities and you'll get data.
If the actual websites were configured not to use the URL as the authentication state, all this would be avoided.
And I think for me it comes down to the fact that the tokens can be issued on a per-customer basis, and access logs can be monitored to watch for suspicious behaviour and revoke accordingly.
Also, as others have mentioned, there's just a different mindset around how much it matters that the list of names of files be kept a secret. On the scale of things Amazon might randomly screw up, accidentally listing the filenames sitting in your public bucket sounds pretty low on the priority list since 99% of their users wouldn't care.
I'm not sure I grok this. Do you mean, for example, sending a token in the POST body, or as a cookie / other header?
One disadvantage to having a secret in the URL, versus in a header or body, is that it can appear in web service logs, unless you use a URI fragment. Even then, the URL is visible to the user, and will live in their history and URL bar - from which they may copy and paste it elsewhere.
Extremely different. The former depends on the existence of a contract about URL privacy (not to mention third parties actually adhering to it) when no such contract exists. Any design for an auth/auth mechanism that depends on private links is inherently broken. The very phrase "private link" is an oxymoron.
> I am not sure why you think that having an obscure URI format will somehow give you a secure call (whatever that means). Identifiers are public information.
<https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypert...>
One of those little informative pieces that stuck with me: every time I work with AWS now, all the bucket names are named <project>-<deterministic hash from a seed value>.
If it's really meant to be private then you encrypt the project-name too and provide a script to list buckets with "friendly" names.
There's always a weird tradeoff with hosted services where the technically perfect thing (totally random identifiers) is mostly an operational burden compared to the imperfect thing (descriptive names).
Is there a difference between a private link containing a password and a link taking you to a site where you input the password? Bitwarden Send gives a link that you can hand out to others. It has # followed by a long random string. I'd like to know if there are security issues, because I use it regularly. At least with the link, I can kill it, and I can automatically have it die after a few days. Passwords generally don't work that way.
Sending a GET request to a site for the password-input screen and POSTing the password will get very different treatment than sending the same amount of "authorization bits" in the URL; in the first case, your browser won't store the secret in the history, the webserver and reverse proxy won't include it in their logs, various tools won't consider it appropriate to cache, etc, etc.
Our software infrastructure is built on an assumption that URLs aren't really sensitive, not like form content, and so they get far more sloppy treatment in many places.
If the secret URL is short-lived or preferably single-use-only (as e.g. many password reset links) then that's not an issue, but if you want to keep something secret long-term, then using it in an URL means it's very likely to get placed in various places which don't really try to keep things secret.
Ex. When links.com?token=<secret> is visited, that link will be transmitted and potentially saved (search parameters included) by intermediaries like Cloudflare.
Ex. When links.com#<secret> is visited, the hash portion will not leave the browser.
Note: It's often nice to work with data in the hash portion by encoding it as a URL-safe Base64 string (i.e. JS Object ↔ JSON String ↔ URL-safe Base64 String).
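That Object ↔ JSON ↔ URL-safe Base64 round trip is a few lines; a minimal sketch (shown in Python for illustration, though in a browser you'd do the same with `JSON.stringify` and Base64):

```python
import base64
import json

def encode_fragment(data: dict) -> str:
    """Serialize a dict to compact JSON, then to URL-safe Base64 (padding stripped)."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def decode_fragment(fragment: str) -> dict:
    """Reverse: restore Base64 padding, decode, parse JSON."""
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token = {"key": "s3cret", "exp": 1700000000}
fragment = encode_fragment(token)
# Used as https://links.com/#<fragment> — the part after '#' never leaves the browser.
assert decode_fragment(fragment) == token
```

URL-safe Base64 swaps `+`/`/` for `-`/`_`, so the encoded string survives being pasted into a URL without percent-encoding.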
Note: When over HTTPS, the parameter string (and path) is encrypted so the intermediaries in question need to be able to decrypt your traffic to read that secret.
Everything else is right. Just wanted to provide some nuance.
I am not completely opposed to scripting web pages (it’s a useful capability), but the vast majority of web pages are just styled text and images: Javascript adds nothing but vulnerability.
It would be awesome if something like HTMX were baked into browsers, and if enabling Javascript were something a user would have to do manually when visiting a page — just like Flash and Java applets back in the day.
Maybe my attempt to be thorough – by making note of DNS alongside HTTP, since it's part of the browser ↔ network ↔ server request diagram – was too thorough.
Do bots that follow links in emails (for whatever reason) execute JS? Is there a risk they activate the thing with a JS induced POST?
Another way to mitigate this issue is to store a secret in the browser that initiated the link-request (Ex. local storage). However, this can easily break in situations like private mode, where a new tab/window is opened without access to the same session storage.
An alternative to the in-browser-secret, is doing a browser fingerprint match. If the browser that opens the link doesn't match the fingerprint of the browser that requested the link, then fail authentication. This also has pitfalls.
Unfortunately, if your threat model requires blocking bots that click too, you're likely stuck adding some semblance of a second factor (PIN/password, biometric, hardware key, etc.).
In any case, when using link-only authentication, best to at least put sensitive user operations (payments, PII, etc.) behind a second factor at the time of operation.
Access control on anything that is not short-lived must be done outside of the url.
When you share links on any channel that is not E2EE, the first agent to access that URL is not the person you're sending it to; it's the channel's service. That can be legitimate, like Bitwarden looking up favicons to enhance UX, or malicious, like the FB Messenger crawler that wants to know more about what you are sharing in private messages.
Tools like these scanners won't get better UX, because if you explicitly tell users that the scans are public, some of them will think twice about using the service, and that's bad for business, whether they're using it for free or paying for a pro license.
Any systems I've built that need this type of thing have used Signed URLs with a short lifetime - usually only a few minutes. And the URLs are generally an implementation detail that's not directly shown to the user (although they can probably see them in the browser debug view).
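A short-lived signed URL of this kind can be sketched with nothing but the standard library. The path, secret, and TTL below are made-up placeholders, and a real service would parse the query string with its router rather than string splitting:

```python
import hashlib
import hmac
import time

SECRET = b"server-side-signing-key"  # placeholder; keep out of source control

def sign_url(path: str, ttl_seconds: int = 300, now=None) -> str:
    """Append an expiry timestamp and an HMAC over path + expiry."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    msg = f"{path}?expires={expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={sig}"

def verify_url(path: str, expires: int, sig: str, now=None) -> bool:
    """Reject if expired, or if the signature doesn't match."""
    if int(now if now is not None else time.time()) > expires:
        return False
    msg = f"{path}?expires={expires}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = sign_url("/files/report.pdf", ttl_seconds=300, now=1_000_000)
expires = int(url.split("expires=")[1].split("&")[0])
sig = url.split("sig=")[1]
assert verify_url("/files/report.pdf", expires, sig, now=1_000_100)      # within TTL
assert not verify_url("/files/report.pdf", expires, sig, now=1_000_600)  # expired
```

Because the signature covers the expiry, a client can't extend the lifetime by editing the `expires` parameter; any tampering invalidates the HMAC.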
Meanwhile our HTTP servers happily log every URI they receive in access logs. Oh, and if you ever send a link in a non-E2EE messenger, it's likely their server generated the link preview for you.
Example 1:
Alice wants Bob to see CoolDocument. Alice generates a URL that has the snowflake in the URL and gives it to Bob. Eve manages to see the chat, and can now access the document.
Example 2:
Alice wants Bob to see CoolDocument. Alice clicks "Share with Bob" in the app, grabs the URL to the document with no authentication encoded within and sends it to Bob. Bob clicks the link, is prompted to login, Bob sees the document. Eve manages to see the chat, follows the link, but is unable to login and thus cannot see the document.
Later, Alice wants to revoke Bob's access to the document. Lots of platforms don't offer great tools to revoke individual generated share URLs, so it can be challenging to revoke Bob's access without potentially cutting off other people's access in Example 1, as that link might have been shared with multiple people. In Example 2, Alice just removes Bob's access to the document and now his login doesn't have permissions to see it. Granted, better link management tools could solve this, but it often seems like these snowflake systems don't really expose a lot of control over multiple share links.
You mean mathematically there is no difference. Functionally there is a very, very big difference.
Unfortunately, it's based on the document ID, so you can't re-enable access with a new URL.
As an example, I hit 'create link share' on a photo in my photo gallery and send someone the link to that photo. I don't want them to have to enter a password. I want the link to show the photo. It's OK for the link to do this. One of the examples they have here is exactly that, and it's fine for that use case. In terms of privacy fears, the end user could re-share a screenshot at that point anyway, even if there was a login. The security matches the use case. The user now has a link to a photo; they could reshare, but I trust they won't intentionally do this.
The big issue here isn't the links IMHO. It's the security analysis tools scanning all links a user received via email and making them available to other users in that community. That's more re-sharing than I intended when I sent someone a photo.
I think you give the most sensible summary. It's about "appropriate and proportional" security for the ease of use trade-off.
> the user now has a link to a photo, they could reshare but I trust they won't intentionally do this.
Time limits are something missing from most applications to create ephemeral links. Ideally you'd want to choose from something like 1 hour, 12 hours, 24 hours, 72 hours... Just resend if they miss the message and it expires.
A good trick is to set a cron job on your VPS to clear /www/tmp/ at midnight every other day.
> The big issue here isn't the links imho. It's the security analysis tools scanning all links a user received via email
You have to consider anything sent to a recipient of Gmail, Microsoft, Apple - any of the commercial providers - to be immediately compromised. If sending between private domains on unencrypted email then it's immediately compromised by your friendly local intelligence agency. If using PGP or an E2E chat app, assume it _will_ be compromised at the end point eventually, so use an ephemeral link.
1. User visits "private" link (Or even a public link where they re-enter their e-mail.)
2. Site e-mails user again with time-limited single-use code.
3. User enters temporary code to confirm ownership of e-mail.
4. Flow proceeds (e.g. with HTTP cookies/session data) with reasonable certainty that the e-mail account owner is involved.
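Steps 2–3 boil down to issuing a random, time-limited code and burning it on first use. A minimal in-memory sketch (names are illustrative, not any particular framework; a real service would persist codes in a datastore and rate-limit attempts):

```python
import secrets
import time

CODES = {}  # email -> (code, expiry); stand-in for a real datastore

def issue_code(email: str, ttl_seconds: int = 600, now=None) -> str:
    """Step 2: generate a short random code to e-mail to the user."""
    code = f"{secrets.randbelow(10**6):06d}"  # six digits, e.g. "042319"
    expiry = (now if now is not None else time.time()) + ttl_seconds
    CODES[email] = (code, expiry)
    return code  # in practice: sent via e-mail, never returned to the browser

def redeem_code(email: str, attempt: str, now=None) -> bool:
    """Step 3: accept the code once, within its lifetime, then discard it."""
    entry = CODES.pop(email, None)  # pop => single-use, whether it succeeds or fails
    if entry is None:
        return False
    code, expiry = entry
    now = now if now is not None else time.time()
    return now <= expiry and secrets.compare_digest(code, attempt)

code = issue_code("user@example.com", now=0)
assert redeem_code("user@example.com", code, now=60)      # valid, first use
assert not redeem_code("user@example.com", code, now=61)  # already consumed
```

Popping the code before checking it means even a failed attempt consumes it, which blunts online guessing at the cost of occasionally making a fat-fingered user request a fresh code.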
Any source for this? Do you work there? I checked their docs and they say they don't "mine user data", so I wouldn't trust anything they say, at least outside legal documents.
1) domain.com/login user: John password: 5 char random password
2) domain.com/12 char random url
If we assume both either have the same bruteforce/rate limiting protection (or none at all). Why is 1 more safe than 2?
In practice, there is.
There is a difference between something-you-have secrets and something-you-know secrets.
A URL is something you have. It can be taken from you if you leave it somewhere accessible. Passwords are something-you-know and, if managed well, cannot be taken (except for the lead-pipe attack).
There is also something-you-are, which includes retina and fingerprint scans.
(1) Requires some out-of-band information to authenticate. Information that people are used to keeping safe.
On the other hand the URLs in (2) are handled as URLs. URLs are often logged, recorded, shared, passed around. E.g. your work firewall logging the username and password you used to log into a service would obviously be bad, but logging URLs you've accessed would probably seem fine.
[the latter case is just an example - the E2E guarantees of TLS mean that neither should be accessible]
1. "Password" is a magic word that makes people less likely to just paste it into anything.
2. Username + password are two separate pieces of information that are not normally copy-pasted at the same time and don't have a canonical way of being stored next to each other.
One of the major issues is that many logging applications will log the full URL somewhere, so now you're logging 'passwords'.
Eg. Your web browser will automatically save any URLs to its history for any user of the computer to see, but will ask first before saving passwords.
Eg. Any web proxies your traffic goes through, or other software watching it, like virus scanners, will probably log URLs but probably won't log form contents (yes, HTTPS makes this one more complicated, but still).
Cookie auth only works if the CDN is on the same domain, even a subdomain can be problematic in many cases.
But in reality, nobody actually cares and just wants a "click to join" that doesn't require fumbling around - but the previous "just use the meeting ID" was too easily guessed.
With HTTP auth links you know the password is a password, so these tools would know which part to hide from public display:
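The `user:password@host` userinfo form is machine-recognizable, so a tool can redact the secret mechanically. A sketch using Python's stdlib URL parsing (the example URL and credentials are made up):

```python
from urllib.parse import urlsplit, urlunsplit

def redact_userinfo(url: str) -> str:
    """Mask the password in an HTTP-auth-style URL before displaying or logging it."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # no embedded credentials; nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

assert redact_userinfo("https://alice:s3cret@example.com/share") == \
    "https://alice:***@example.com/share"
assert redact_userinfo("https://example.com/share") == "https://example.com/share"
```

A secret buried in an arbitrary path segment has no such marker, which is exactly why scanners treat it as ordinary public URL content.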
I know there are valid reasons (the "are you sure you want to log in as usernam on example.com?" prompt for example) but this is just one of the many ways web dev has built hacks upon hacks where implementing standards would've sufficed. See also: S3 vs WebDAV.
There can be a helpful fix: make clear that the scan is public! When submitting a scan it isn't clear, as the article shows. But you have the opportunity to also tell the user that it is public during the scan, which takes time. You also have the opportunity to tell them AFTER the scan is done. There should be a clear button to delist.
urlscan.io does a bit better, but the language is not quite clear that it means the scan is visible to the public. And the colors just blend in. If something isn't catching your eye, it might as well be treated as invisible. If there is a way to easily misinterpret language, it will always be misinterpreted. If you have to scroll to find something, it'll never be found.
I'd suggest having two buttons, "public scan" and "private scan". That would contextualize the public scan and clarify that what you are scanning is publicly __listed__. And different colors. I think red for "public" would actually be the better choice.
Some information could be displayed while scanning. Idk, put something like "Did you know? Using the public scan makes the link visible to others. This helps security researchers. You can delist it by clicking ____" or something like that, and do the inverse. It should stand out. There's plenty of time while the scan happens.
> On each scan result page there is a "Report" button which will immediately de-list the scan result without any interaction from our side.
"Report" is not clear. That makes me think I want to report a problem. Also I think there is a problem with the color scheme. The pallet is nice but at least for myself, it all kinda blends in. Nothing pops. Which can be nice at times, but we want to draw the user to certain things, right? I actually didn't see the report button at first. I actually looked around, scrolled, and then even felt embarrassed when I did find it because it is in an "obvious" spot. One that I even looked at! (so extra embarrassing lol)
I think this is exactly one of those problems where when you build a tool everything seems obvious and taken care of. You clearly thought about these issues (far better than most!) but when we put things out into public, we need to see how they get used and where our assumptions miss the mark.
I do want to say thank you for making this. I am criticizing not to put you down or dismiss any of the work you've done. You've made a great tool that helps a lot of people. You should feel proud for that! I am criticizing because I want to help make the tool the best tool it can be. Of course these are my opinions. My suggestion would be to look at other opinions as well and see if there are common themes. Godelski isn't right, they're just one of many voices that you have to parse. Keep up the good work :)
For example, either a special HTTP header returned when making a HEAD request for the URL, or downloading a file similar to robots.txt that defines globs which are public/private.
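No such standard exists today, so everything below is hypothetical: the header name `X-Scan-Visibility` and the glob-file idea are invented for illustration, loosely modeled on `X-Robots-Tag` and robots.txt. A scanner honoring such an opt-out might look like:

```python
import fnmatch

def is_private(headers: dict, path: str, private_globs: list) -> bool:
    """Hypothetical check: a per-response opt-out header, or a robots.txt-style
    glob list (fetched once per site), marks a URL as not-for-public-listing."""
    # Invented header name; no scanner actually implements this today.
    if headers.get("X-Scan-Visibility", "").lower() == "private":
        return True
    # Globs would come from a site-provided file, analogous to robots.txt.
    return any(fnmatch.fnmatch(path, g) for g in private_globs)

globs = ["/share/*", "/reset/*"]  # e.g. parsed from a hypothetical site policy file
assert is_private({}, "/share/abc123", globs)
assert is_private({"X-Scan-Visibility": "private"}, "/anything", [])
assert not is_private({}, "/blog/post", globs)
```

Like robots.txt, this would only restrain well-behaved scanners; it does nothing against a crawler that ignores the convention.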
At least this would (mostly) avoid these links becoming publicly available on the internetz.
This is the part where IP filtering by country and subnet can keep your ports hidden.
Also, a stateful firewall can be crafted to only let certain IPs through after they send a specially crafted TOTP in an ICMP packet, just to open the firewall for your IP.
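The TOTP half of that port-knocking scheme is just RFC 6238 and fits in a few stdlib lines; stuffing the code into an ICMP payload and matching it firewall-side is the exotic part, omitted here. A sketch:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, for_time=None, step=30, digits=6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the 30-second time counter, dynamically truncated."""
    counter = int(for_time if for_time is not None else time.time()) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # low nibble of last byte picks the 4-byte window
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 Appendix B test vector (SHA-1, 8 digits, T = 59 seconds):
assert totp(b"12345678901234567890", for_time=59, digits=8) == "94287082"
```

The firewall side would recompute `totp(shared_secret)` for the current step (plus maybe one step of clock drift) and open the port for the source IP on a match.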
We are just getting started but so far we are loving the ergonomics.
* this obviously assumes the objects have a 1-1 mapping with users
On a side note, can someone remind me what was the name of the file? I think I have some tiny fraction of a bitcoin on an old computer.
A GET isn’t supposed to modify server state. That is reserved for POST, PUT, PATCH…
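This is exactly why link-prefetching bots are dangerous for GET endpoints that mutate state. A framework-agnostic sketch (plain Python, illustrative route and field names) of keeping the destructive action behind POST:

```python
def handle(method: str, path: str, db: dict) -> str:
    """GET stays safe (read-only); the actual state change requires POST."""
    if path == "/unsubscribe":
        if method == "GET":
            # Safe for mail scanners to prefetch: just render a confirmation page.
            return "confirm page containing a POST form"
        if method == "POST":
            db["subscribed"] = False  # the state change lives here only
            return "unsubscribed"
    return "404"

db = {"subscribed": True}
handle("GET", "/unsubscribe", db)   # a scanner prefetching the emailed link...
assert db["subscribed"] is True     # ...changes nothing
handle("POST", "/unsubscribe", db)  # the human confirming in the browser
assert db["subscribed"] is False
```

If the unsubscribe (or delete, or approve) happened on GET, every link scanner in the delivery path would trigger it before the recipient ever clicked.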