Just last month, a story premised on discovering AWS account IDs via buckets[0] did quite well on HN. The consensus established in the comments is that if you are relying on your account identifier being private as some form of security by obscurity, you are doing it wrong. The same concept applies here. This isn't a novel security issue, it's just another method of dorking.
In theory a 256-hex-character link (so 1024 bits) is near-infinitely more secure than a 32-character username and 32-character password, as guessing it means searching 2^1024 combinations. You'd never brute-force it
vs
https://site.com/[32chars] with a password of [32chars]
as there are only 2^256 combinations. Again you can't brute-force it, but it's far more likely than the 2^1024 case.
Imagine it's
https://site.com/[32chars][32chars] instead.
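The keyspace arithmetic above can be sanity-checked directly, assuming each hex character contributes 4 bits:

```python
# Each hex character encodes 4 bits, so a 256-hex-char path has 256 * 4 = 1024 bits.
url_bits = 256 * 4
assert 16 ** 256 == 2 ** url_bits  # 2^1024 possible links

# A 32-char username plus a 32-char password (same hex alphabet) is 64 chars total:
cred_bits = (32 + 32) * 4
assert 16 ** 64 == 2 ** cred_bits  # 2^256 possible credential pairs

# The long-link keyspace is larger by a factor of 2^768:
assert (2 ** url_bits) // (2 ** cred_bits) == 2 ** 768
```

The point being that both keyspaces are already far beyond brute force; the difference that matters in practice is how the secret is handled, not its size.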
But while guessing the former is harder than the latter, URLs leak a lot, far more than passwords.
The problem is the website administrators who are encoding authentication tokens into URL state, not the naive crawlers that find them.
How would you do that for the URLs? Five requests to site.com/[256chars] that all 404 get your IP blocked, because you don't have a real link? I guess the security relies on the fact that only a very small percentage of the total possible links are ever used? Though the likelihood of randomly guessing a link is the same as the percentage of addressable links used.
Did you do that just to upset me?
The one link won't be found quickly, but a bunch of links will. You just need to fetch all possibilities and you'll get data.
If the actual websites were configured not to use the URL as the authentication state, all this would be avoided.
And I think for me it comes down to the fact that the tokens can be issued on a per-customer basis, and access logs can be monitored to watch for suspicious behaviour and revoke accordingly.
Also, as others have mentioned, there's just a different mindset around how much it matters that the list of names of files be kept a secret. On the scale of things Amazon might randomly screw up, accidentally listing the filenames sitting in your public bucket sounds pretty low on the priority list since 99% of their users wouldn't care.
I'm not sure I grok this. Do you mean, for example, sending a token in the POST body, or as a cookie / other header?
One disadvantage to having a secret in the URL, versus in a header or body, is that it can appear in web service logs, unless you use a URI fragment. Even then, the URL is visible to the user, and will live in their history and URL bar - from which they may copy and paste it elsewhere.
Extremely different. The former depends on the existence of a contract about URL privacy (not to mention third parties actually adhering to it) when no such contract exists. Any design for an auth/auth mechanism that depends on private links is inherently broken. The very phrase "private link" is an oxymoron.
> I am not sure why you think that having an obscure URI format will somehow give you a secure call (whatever that means). Identifiers are public information.
<https://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypert...>
One of those little informative pieces that stuck with me: every time I work with AWS now, all the bucket names are named <project>-<deterministic hash from a seed value>.
If it's really meant to be private then you encrypt the project-name too and provide a script to list buckets with "friendly" names.
There's always a weird tradeoff with hosted services where the technically perfect thing (totally random identifiers) is mostly an operational burden compared to the imperfect thing (descriptive names).
Is there a difference between a private link containing a password and a link taking you to a site where you input the password? Bitwarden Send gives a link that you can hand out to others. It has # followed by a long random string. I'd like to know if there are security issues, because I use it regularly. At least with the link, I can kill it, and I can automatically have it die after a few days. Passwords generally don't work that way.
Sending a GET request to a site for the password-input screen and POSTing the password will get very different treatment than sending the same amount of "authorization bits" in the URL; in the first case, your browser won't store the secret in the history, the webserver and reverse proxy won't include it in their logs, various tools won't consider it appropriate to cache, etc, etc.
Our software infrastructure is built on an assumption that URLs aren't really sensitive, not like form content, and so they get far more sloppy treatment in many places.
If the secret URL is short-lived or preferably single-use-only (as e.g. many password reset links) then that's not an issue, but if you want to keep something secret long-term, then using it in an URL means it's very likely to get placed in various places which don't really try to keep things secret.
Ex. When links.com?token=<secret> is visited, that link will be transmitted and potentially saved (search parameters included) by intermediaries like Cloudflare.
Ex. When links.com#<secret> is visited, the hash portion will not leave the browser.
Note: It's often nice to work with data in the hash portion by encoding it as a URL-safe Base64 string (i.e. JS Object ↔ JSON String ↔ URL-safe Base64 String).
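That Object ↔ JSON ↔ URL-safe Base64 round trip is a few lines; a minimal sketch (shown in Python for illustration, though in a browser you'd do the same with `JSON.stringify` and Base64):

```python
import base64
import json

def encode_fragment(data: dict) -> str:
    """Serialize a dict to compact JSON, then to URL-safe Base64 (padding stripped)."""
    raw = json.dumps(data, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def decode_fragment(fragment: str) -> dict:
    """Reverse: restore Base64 padding, decode, parse JSON."""
    padded = fragment + "=" * (-len(fragment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token = {"key": "s3cret", "exp": 1700000000}
fragment = encode_fragment(token)
# Used as https://links.com/#<fragment> — the part after '#' never leaves the browser.
assert decode_fragment(fragment) == token
```

URL-safe Base64 swaps `+`/`/` for `-`/`_`, so the encoded string survives being pasted into a URL without percent-encoding.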
Note: When over HTTPS, the parameter string (and path) is encrypted so the intermediaries in question need to be able to decrypt your traffic to read that secret.
Everything else is right. Just wanted to provide some nuance.
I am not completely opposed to scripting web pages (it’s a useful capability), but the vast majority of web pages are just styled text and images: Javascript adds nothing but vulnerability.
It would be awesome if something like HTMX were baked into browsers, and if enabling Javascript were something a user would have to do manually when visiting a page — just like Flash and Java applets back in the day.
Maybe my attempt to be thorough – by making note of DNS alongside HTTP, since it's part of the browser ↔ network ↔ server request diagram – was too thorough.
Do bots that follow links in emails (for whatever reason) execute JS? Is there a risk they activate the thing with a JS induced POST?
Another way to mitigate this issue is to store a secret in the browser that initiated the link-request (Ex. local storage). However, this can easily break in situations like private mode, where a new tab/window is opened without access to the same session storage.
An alternative to the in-browser-secret, is doing a browser fingerprint match. If the browser that opens the link doesn't match the fingerprint of the browser that requested the link, then fail authentication. This also has pitfalls.
Unfortunately, if your threat model requires blocking bots that click too, you're likely stuck adding some semblance of a second factor (PIN/password, biometric, hardware key, etc.).
In any case, when using link-only authentication, best to at least put sensitive user operations (payments, PII, etc.) behind a second factor at the time of operation.
Access control on anything that is not short-lived must be done outside of the url.
When you share links on any channel that is not E2EE, the first agent to access that URL is not the person you're sending it to; it's the channel's service. That can be legitimate, like Bitwarden looking up favicons to enhance UX, or malicious, like the FB Messenger crawler that wants to know more about what you are sharing in private messages.
Tools like these scanners won't get better UX, because if you explicitly tell users that the scans are public, some of them will think twice about using the service, and that's bad for business, whether they're using it for free or paying for a pro license.
Any systems I've built that need this type of thing have used Signed URLs with a short lifetime - usually only a few minutes. And the URLs are generally an implementation detail that's not directly shown to the user (although they can probably see them in the browser debug view).
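A short-lived signed URL of this kind can be sketched with nothing but the standard library. The path, secret, and TTL below are made-up placeholders, and a real service would parse the query string with its router rather than string splitting:

```python
import hashlib
import hmac
import time

SECRET = b"server-side-signing-key"  # placeholder; keep out of source control

def sign_url(path: str, ttl_seconds: int = 300, now=None) -> str:
    """Append an expiry timestamp and an HMAC over path + expiry."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    msg = f"{path}?expires={expires}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires}&sig={sig}"

def verify_url(path: str, expires: int, sig: str, now=None) -> bool:
    """Reject if expired, or if the signature doesn't match."""
    if int(now if now is not None else time.time()) > expires:
        return False
    msg = f"{path}?expires={expires}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

url = sign_url("/files/report.pdf", ttl_seconds=300, now=1_000_000)
expires = int(url.split("expires=")[1].split("&")[0])
sig = url.split("sig=")[1]
assert verify_url("/files/report.pdf", expires, sig, now=1_000_100)      # within TTL
assert not verify_url("/files/report.pdf", expires, sig, now=1_000_600)  # expired
```

Because the signature covers the expiry, a client can't extend the lifetime by editing the `expires` parameter; any tampering invalidates the HMAC.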
Meanwhile our HTTP servers happily log every URI they receive in access logs. Oh, and if you ever send a link in a non-E2EE messenger, it's likely their server generated the link preview for you.
Example 1:
Alice wants Bob to see CoolDocument. Alice generates a URL that has the snowflake in the URL and gives it to Bob. Eve manages to see the chat, and can now access the document.
Example 2:
Alice wants Bob to see CoolDocument. Alice clicks "Share with Bob" in the app, grabs the URL to the document with no authentication encoded within and sends it to Bob. Bob clicks the link, is prompted to login, Bob sees the document. Eve manages to see the chat, follows the link, but is unable to login and thus cannot see the document.
Later, Alice wants to revoke Bob's access to the document. Lots of platforms don't offer great tools to revoke individual generated share URLs, so it can be challenging to revoke Bob's access without potentially cutting off other people's access in Example 1, as that link might have been shared with multiple people. In Example 2, Alice just removes Bob's access to the document and now his login doesn't have permissions to see it. Granted, better link management tools could solve this, but it often seems like these snowflake systems don't really expose a lot of control over multiple share links.
You mean mathematically there is no difference. Functionally there is a very, very big difference.
Unfortunately, it's based on the document ID, so you can't re-enable access with a new URL.
As an example, I hit 'create link share' on a photo in my photo gallery and send someone the link to that photo. I don't want them to have to enter a password. I want the link to show the photo. It's OK for the link to do this. One of the examples they have here is exactly that, and it's fine for that use case. In terms of privacy fears, the end user could re-share a screenshot at that point anyway, even if there was a login. The security matches the use case. The user now has a link to a photo; they could reshare, but I trust they won't intentionally do this.
The big issue here isn't the links IMHO. It's the security analysis tools scanning all links a user received via email and making them available to other users in that community. That's more re-sharing than I intended when I sent someone a photo.
I think you give the most sensible summary. It's about "appropriate and proportional" security for the ease of use trade-off.
> the user now has a link to a photo, they could reshare but I trust they won't intentionally do this.
Time limits are something missing from most applications to create ephemeral links. Ideally you'd want to choose from something like 1 hour, 12 hours, 24 hours, 72 hours... Just resend if they miss the message and it expires.
A good trick is to set a cron job on your VPS to clear /www/tmp/ at midnight every other day.
> The big issue here isn't the links imho. It's the security analysis tools scanning all links a user received via email
You have to consider anything sent to a recipient of Gmail, Microsoft, Apple - any of the commercial providers - to be immediately compromised. If sending between private domains on unencrypted email then it's immediately compromised by your friendly local intelligence agency. If using PGP or an E2E chat app, assume it _will_ be compromised at the end point eventually, so use an ephemeral link.
1. User visits "private" link (Or even a public link where they re-enter their e-mail.)
2. Site e-mails user again with time-limited single-use code.
3. User enters temporary code to confirm ownership of e-mail.
4. Flow proceeds (e.g. with HTTP cookies/session data) with reasonable certainty that the e-mail account owner is involved.
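Steps 2–3 boil down to issuing a random, time-limited code and burning it on first use. A minimal in-memory sketch (names are illustrative, not any particular framework; a real service would persist codes in a datastore and rate-limit attempts):

```python
import secrets
import time

CODES = {}  # email -> (code, expiry); stand-in for a real datastore

def issue_code(email: str, ttl_seconds: int = 600, now=None) -> str:
    """Step 2: generate a short random code to e-mail to the user."""
    code = f"{secrets.randbelow(10**6):06d}"  # six digits, e.g. "042319"
    expiry = (now if now is not None else time.time()) + ttl_seconds
    CODES[email] = (code, expiry)
    return code  # in practice: sent via e-mail, never returned to the browser

def redeem_code(email: str, attempt: str, now=None) -> bool:
    """Step 3: accept the code once, within its lifetime, then discard it."""
    entry = CODES.pop(email, None)  # pop => single-use, whether it succeeds or fails
    if entry is None:
        return False
    code, expiry = entry
    now = now if now is not None else time.time()
    return now <= expiry and secrets.compare_digest(code, attempt)

code = issue_code("user@example.com", now=0)
assert redeem_code("user@example.com", code, now=60)      # valid, first use
assert not redeem_code("user@example.com", code, now=61)  # already consumed
```

Popping the code before checking it means even a failed attempt consumes it, which blunts online guessing at the cost of occasionally making a fat-fingered user request a fresh code.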
Any source for this? Do you work there? I checked their docs and they say they don't "mine user data", so I wouldn't trust anything they say, at least outside legal documents.
1) domain.com/login user: John password: 5 char random password
2) domain.com/12 char random url
If we assume both either have the same bruteforce/rate limiting protection (or none at all). Why is 1 more safe than 2?
In practice, there is.
There is a difference between something-you-have secrets and something-you-know secrets.
A URL is something you have. It can be taken from you if you leave it somewhere accessible. Passwords are something-you-know and, if managed well, cannot be taken (except for the lead-pipe attack).
There is also something-you-are, which includes retina and fingerprint scans.
(1) Requires some out-of-band information to authenticate. Information that people are used to keeping safe.
On the other hand the URLs in (2) are handled as URLs. URLs are often logged, recorded, shared, passed around. E.g. your work firewall logging the username and password you used to log into a service would obviously be bad, but logging URLs you've accessed would probably seem fine.
[the latter case is just an example - the E2E guarantees of TLS mean that neither should be accessible]
1. "Password" is a magic word that makes people less likely to just paste it into anything.
2. Username + password are two separate pieces of information that are not normally copy-pasted at the same time and don't have a canonical way of being stored next to each other.
One of the major issues is that many logging applications will log the full URL somewhere, so now you're logging 'passwords'.
Eg. Your web browser will automatically save any URLs to its history for any user of the computer to see, but will ask first before saving passwords.
Eg. Any web proxies your traffic goes through, or other software watching it, like virus scanners, will probably log URLs but probably won't log form contents (yes, HTTPS makes this one more complicated, but still).
Cookie auth only works if the CDN is on the same domain, even a subdomain can be problematic in many cases.
But in reality, nobody actually cares and just wants a "click to join" that doesn't require fumbling around - but the previous "just use the meeting ID" was too easily guessed.
With HTTP auth links you know the password is a password, so these tools would know which part to hide from public display:
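The `user:password@host` userinfo form is machine-recognizable, so a tool can redact the secret mechanically. A sketch using Python's stdlib URL parsing (the example URL and credentials are made up):

```python
from urllib.parse import urlsplit, urlunsplit

def redact_userinfo(url: str) -> str:
    """Mask the password in an HTTP-auth-style URL before displaying or logging it."""
    parts = urlsplit(url)
    if parts.password is None:
        return url  # no embedded credentials; nothing to hide
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

assert redact_userinfo("https://alice:s3cret@example.com/share") == \
    "https://alice:***@example.com/share"
assert redact_userinfo("https://example.com/share") == "https://example.com/share"
```

A secret buried in an arbitrary path segment has no such marker, which is exactly why scanners treat it as ordinary public URL content.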
I know there are valid reasons (the "are you sure you want to log in as usernam on example.com?" prompt for example) but this is just one of the many ways web dev has built hacks upon hacks where implementing standards would've sufficed. See also: S3 vs WebDAV.
There can be a helpful fix: make clear that the scan is public! When submitting a scan it isn't clear, as the article shows. But you have the opportunity to also tell the user that it is public during the scan, which takes time. You also have the opportunity to tell them AFTER the scan is done. There should be a clear button to delist.
urlscan.io does a bit better, but the language is not quite clear that it means the scan is visible to the public. And the colors just blend in. If something isn't catching your eye, it might as well be treated as invisible. If there is a way to easily misinterpret language, it will always be misinterpreted. If you have to scroll to find something, it'll never be found.
I'd suggest having two buttons, "public scan" and "private scan". That would contextualize the public scan and clarify that what you are scanning is publicly __listed__. And different colors. I think red for "public" would actually be the better choice.
Some information could be displayed while scanning. Idk, put something like "Did you know? Using the public scan makes the link visible to others. This helps security researchers. You can delist it by clicking ____" or something like that, and do the inverse. It should stand out. There's plenty of time while the scan happens.
> On each scan result page there is a "Report" button which will immediately de-list the scan result without any interaction from our side.
"Report" is not clear. That makes me think I want to report a problem. Also I think there is a problem with the color scheme. The pallet is nice but at least for myself, it all kinda blends in. Nothing pops. Which can be nice at times, but we want to draw the user to certain things, right? I actually didn't see the report button at first. I actually looked around, scrolled, and then even felt embarrassed when I did find it because it is in an "obvious" spot. One that I even looked at! (so extra embarrassing lol)
I think this is exactly one of those problems where when you build a tool everything seems obvious and taken care of. You clearly thought about these issues (far better than most!) but when we put things out into public, we need to see how they get used and where our assumptions miss the mark.
I do want to say thank you for making this. I am criticizing not to put you down or dismiss any of the work you've done. You've made a great tool that helps a lot of people. You should feel proud for that! I am criticizing because I want to help make the tool the best tool it can be. Of course these are my opinions. My suggestion would be to look at other opinions as well and see if there are common themes. Godelski isn't right, they're just one of many voices that you have to parse. Keep up the good work :)
For example, either a special HTTP header returned when making a HEAD request for the URL, or downloading a file similar to robots.txt that defines globs which are public/private.
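No such standard exists today, so everything below is hypothetical: the header name `X-Scan-Visibility` and the glob-file idea are invented for illustration, loosely modeled on `X-Robots-Tag` and robots.txt. A scanner honoring such an opt-out might look like:

```python
import fnmatch

def is_private(headers: dict, path: str, private_globs: list) -> bool:
    """Hypothetical check: a per-response opt-out header, or a robots.txt-style
    glob list (fetched once per site), marks a URL as not-for-public-listing."""
    # Invented header name; no scanner actually implements this today.
    if headers.get("X-Scan-Visibility", "").lower() == "private":
        return True
    # Globs would come from a site-provided file, analogous to robots.txt.
    return any(fnmatch.fnmatch(path, g) for g in private_globs)

globs = ["/share/*", "/reset/*"]  # e.g. parsed from a hypothetical site policy file
assert is_private({}, "/share/abc123", globs)
assert is_private({"X-Scan-Visibility": "private"}, "/anything", [])
assert not is_private({}, "/blog/post", globs)
```

Like robots.txt, this would only restrain well-behaved scanners; it does nothing against a crawler that ignores the convention.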
At least this would (mostly) avoid these links becoming publicly available on the internetz.
This is the part where IP filtering by country and subnet can keep your ports hidden.
Also, a stateful firewall can be crafted to only let certain IPs through after they send a specially crafted TOTP in an ICMP packet, just to open the firewall for your IP.
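The TOTP half of that port-knocking scheme is just RFC 6238 and fits in a few stdlib lines; stuffing the code into an ICMP payload and matching it firewall-side is the exotic part, omitted here. A sketch:

```python
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, for_time=None, step=30, digits=6) -> str:
    """RFC 6238 TOTP: HMAC-SHA1 over the 30-second time counter, dynamically truncated."""
    counter = int(for_time if for_time is not None else time.time()) // step
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # low nibble of last byte picks the 4-byte window
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 Appendix B test vector (SHA-1, 8 digits, T = 59 seconds):
assert totp(b"12345678901234567890", for_time=59, digits=8) == "94287082"
```

The firewall side would recompute `totp(shared_secret)` for the current step (plus maybe one step of clock drift) and open the port for the source IP on a match.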
We are just getting started but so far we are loving the ergonomics.
* this obviously assumes the objects have a 1-1 mapping with users
On a side note, can someone remind me what was the name of the file? I think I have some tiny fraction of a bitcoin on an old computer.
A GET isn’t supposed to modify server state. That is reserved for POST, PUT, PATCH…
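This is exactly why link-prefetching bots are dangerous for GET endpoints that mutate state. A framework-agnostic sketch (plain Python, illustrative route and field names) of keeping the destructive action behind POST:

```python
def handle(method: str, path: str, db: dict) -> str:
    """GET stays safe (read-only); the actual state change requires POST."""
    if path == "/unsubscribe":
        if method == "GET":
            # Safe for mail scanners to prefetch: just render a confirmation page.
            return "confirm page containing a POST form"
        if method == "POST":
            db["subscribed"] = False  # the state change lives here only
            return "unsubscribed"
    return "404"

db = {"subscribed": True}
handle("GET", "/unsubscribe", db)   # a scanner prefetching the emailed link...
assert db["subscribed"] is True     # ...changes nothing
handle("POST", "/unsubscribe", db)  # the human confirming in the browser
assert db["subscribed"] is False
```

If the unsubscribe (or delete, or approve) happened on GET, every link scanner in the delivery path would trigger it before the recipient ever clicked.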