OPSEC is hard.
I'm not sure how we could explain how to avoid it - where would the explanation go? Visiting that page would be just as much of a correlation, no? It's kind of a chicken and egg problem, unless the source is already using Tor.
Avoiding the "trail of the SSL connection" also suggests we should be doing something to combat website fingerprinting, which we have discussed but do not have a clear solution for yet.
Our current thinking is that just visiting the landing page is not enough to prosecute a source. We can do better, and are working on it, but it's difficult.
1. Make the entire site available under `ssl.washingtonpost.com` (ideally without the `ssl.` prefix).
That way, the domain won't be as suspicious as it is right now. I suspect that this is more or less the only content hosted on the domain.
2. Include an iframe for all (or a random subset of) visitors, loading this particular url (hidden).
By artificially generating traffic to the endpoint it will be harder to distinguish these from other, 'real' requests.
Use a random delay for adding the iframe (otherwise the 'pairing' with the initial http request may distinguish this traffic).
3. Print the link, url and info block on the dead trees (the paper), as others have suggested.
4. Add HSTS headers (http://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security)
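For point 4, here is a minimal sketch of what adding the header could look like, assuming the site is served by a Python WSGI application. That is purely an assumption for illustration; nothing in this thread says what the Post actually runs. The `add_hsts` wrapper and the one-year `max_age` are hypothetical choices.

```python
def add_hsts(app, max_age=31536000):
    """Wrap a WSGI app so every response carries an HSTS header.

    `add_hsts` and the default max_age (one year, in seconds) are
    illustrative assumptions, not taken from any real deployment.
    Browsers only honour this header when it arrives over HTTPS.
    """
    hsts = ("Strict-Transport-Security",
            f"max-age={max_age}; includeSubDomains")

    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Drop any HSTS header the app already set, then add ours.
            headers = [(name, value) for name, value in headers
                       if name.lower() != "strict-transport-security"]
            headers.append(hsts)
            return start_response(status, headers, exc_info)

        return app(environ, start_with_hsts)

    return wrapped
```

In practice the same header is often set in the web server or load balancer configuration instead; the wrapper just shows that it is a single response header.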
Print it in the physical newspaper. The German computer magazine c't prints their PGP key fingerprints in the masthead.
You could put the instructions on pages that many people visit regularly: true security through obscurity. For example, put the instructions in abbreviated form in a box in the footer of your front page (or in the footer of every page).
This may be one of the rare cases where the use of a QR Code is justified.
I agree it would probably be a good idea to put a warning about such a problem though.
Ultimately, we can write descriptive documentation - but getting it read and understood is hard. Cryptoparties are, again, a great idea, but getting the non-technical user involved is damned hard.
IMHO these things always come down to "how do we make it easy for the public, whilst keeping it REALLY secure". How does security become a general piece of education, much akin to math, or at least history?
That should make it hard enough to correlate any data; I guess they have enough visitors.
You only need the two "leak at time X, IP Y loaded this page at time X-5" datapoints to break this.
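To make that concrete, here is a rough sketch of the correlation, assuming the adversary has a log of (IP, timestamp) page loads and knows roughly when the leak was submitted. The function name `shortlist`, the ten-minute window and the sample addresses are all made up for illustration.

```python
from datetime import datetime, timedelta


def shortlist(page_loads, leak_time, window=timedelta(minutes=10)):
    """Return the IPs that loaded the page shortly before the leak.

    page_loads: iterable of (ip, datetime) pairs (hypothetical log format).
    window: how far before leak_time a page load still counts (assumed).
    """
    return sorted({
        ip for ip, loaded_at in page_loads
        if timedelta(0) <= leak_time - loaded_at <= window
    })


# Hypothetical data: one visitor five minutes before the leak, one hours earlier.
leak = datetime(2014, 6, 5, 12, 0)
loads = [
    ("10.0.0.1", leak - timedelta(minutes=5)),
    ("10.0.0.2", leak - timedelta(hours=3)),
]
suspects = shortlist(loads, leak)
```

With low traffic to the page, the shortlist can easily come down to a handful of addresses, which is the point being made here.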
An embedded page is not fetched by someone else.
I don't think you can connect visiting the info page and the very next SecureDrop file upload.
* Your actions could put you on a shortlist of people to be more thoroughly investigated.
* Your actions could tip off the people whom your information threatens; maybe they stop communicating with you (or worse) to shut off the leak.
* Per the Snowden release, the NSA tracked the communications of people within something like 3 degrees of their targets. With standards that low, it's not a stretch to think someone would track everyone visiting the Washington Post's secure drop box.
And "a group of 100 IPs including a coffee shop near NSA employee John Smith's home" is enough.
I wrote up an analysis of exactly this problem last year: http://grugq.github.io/blog/2013/12/21/in-search-of-opsec-ma...
That wasn't proof, of course, but it didn't need to be proof, just a good lead for law enforcement to kick-start their investigation.
"Download and install the Tor browser bundle from Download and install the Tor browser bundle from https://www.torproject.org/" should be "Download and install the Tor browser bundle from https://www.torproject.org/"
"You will be provided with a codename that you will use it to log in to check for replies from The Post." should not have the word "it".
Otherwise, great work! I'm really glad that you're doing this and featuring it prominently on your home page.
The requirement for security is to make successful attacks more expensive than they are worth for the attackers. (There is no perfect security, of course.)
How much is information leaked to the WP worth? It's information that can change the course of history; it could make war or peace; it could be worth billions or even trillions of dollars; it could simply change the course of the stock market or of one stock and be worth billions to an individual.
If I ran a state intelligence service, with the fate of my nation and all my citizens in my hands, I would be irresponsible not to invest in monitoring the Washington Post (and the NY Times, and others') "secure" tip line. If I ran an unscrupulous business, it would be worth it, if only for the information relevant to the stock market. EDIT: Also, the information can change the course of elections and be a target of unscrupulous politicians.
I find it hard to believe that the Washington Post or any news organization has the resources to protect assets that valuable.
[edit] Nerdier link with exploit demo: http://resources.infosecinstitute.com/fbi-tor-exploit/
http://www.theguardian.com/technology/2014/jun/05/guardian-l...
I think this is a great concept, yet perhaps too little, too late (journalists should know PGP, and drop boxes like these should have been common already). I also worry a bit because of the Washington Post's track record with leaks. Off the top of my head:
- Washington Post was Snowden's first choice, but they put up enough demands for Snowden to move to The Guardian. [1]
- Washington Post, according to Assange, had access to the "Collateral Murder" video a whole year before WikiLeaks published their edited video. [2]
- Washington Post employs op-ed columnists who call for the assassination of "criminally dangerous" leakers like Assange. [3]
[1] http://nymag.com/daily/intelligencer/2013/06/nsa-leaker-shop...
[2] http://www.abc.net.au/foreign/content/2010/s3040234.htm
[3] http://www.washingtonpost.com/wp-dyn/content/article/2010/08...
EDIT: More information on SecureDrop: https://pressfreedomfoundation.org/securedrop and source here: https://github.com/freedomofpress/securedrop
We are continuing to discuss and debate this trade-off. Other ideas welcome!
I don't know what they're like, but if you take a list of 5000 common words and use 4 random entries for each codename, there are 625,000,000,000,000 possible combinations. Brute-forcing the entire space at 100,000 tries per second would take ~200 years.
Edit: I made a toy jsfiddle version: http://jsfiddle.net/SwWZ9/10/
The wordlist is just a random sampling of English nouns (I couldn't find a quick source of common nouns long enough). It may contain profanity, watch out!
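For anyone who wants to check the arithmetic, here is a small sketch in Python. The 5000-word list, 4 words per codename and 100,000 guesses per second are the figures from this comment; the `make_codename` helper and the tiny sample wordlist are hypothetical and are not how SecureDrop itself generates codenames.

```python
import secrets

WORDLIST_SIZE = 5000          # figure from the comment above
WORDS_PER_CODENAME = 4        # figure from the comment above
GUESSES_PER_SECOND = 100_000  # figure from the comment above

# Size of the codename space: 5000 ** 4
combinations = WORDLIST_SIZE ** WORDS_PER_CODENAME

# Time to try every codename at the assumed guessing rate.
seconds = combinations / GUESSES_PER_SECOND
years = seconds / (365.25 * 24 * 3600)


def make_codename(wordlist, n=WORDS_PER_CODENAME):
    """Pick n words with a cryptographically secure RNG (illustrative helper)."""
    return " ".join(secrets.choice(wordlist) for _ in range(n))


# Tiny stand-in wordlist; a real one would have thousands of entries.
sample_words = ["anchor", "breeze", "copper", "dune", "ember", "fjord"]
codename = make_codename(sample_words)
```

Here `combinations` comes out at 625,000,000,000,000 and `years` at roughly 198, which matches the "~200 years" above. On average an attacker would hit a given codename after searching about half the space.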
There are several exploits which have been used in the past to expose Tor hidden services, and several papers on theoretical ways to expose them. Many of these attacks can be used in reverse to expose the origin of a connection to a hidden service.
In the [not so] extreme case, the govt can always issue a National Security Letter to WaPo and scoop up any data it wants directly from the hidden service servers, similar to its Silk Road and Freedom Hosting takedowns.
The FBI TOR Exploit [ http://resources.infosecinstitute.com/fbi-tor-exploit/ ]
Heartbleed used to reveal Tor hidden services [ https://blog.torproject.org/blog/openssl-bug-cve-2014-0160/ ]
Hot or Not: Revealing hidden services by their clock skew [ http://www.cl.cam.ac.uk/~sjm217/papers/ccs06hotornot.pdf ]
Tor Hidden Service Passive De-Cloaking [ http://blog.whitehatsec.com/tor-hidden-service-passive-de-cl... ]
One would have to assume that all the traffic going to the server is logged by the NSA and anyone else who can manage it. If the traffic volume is low then timing correlation with even a large pool of suspects is simple. An active attacker can differentiate between the SSL connection from a web browser and one from a Tor node, so the background SSL traffic to the Post would not provide cover.
I think it could be improved by using a mix network (e.g. Mixminion) accessed over Tor, rather than just Tor.
Unfortunately the Mixmaster/Mixminion networks are currently too small to provide meaningful complexity. Large-scale adoption by, e.g., newspapers is not technically hard and would significantly complicate the adversary's problem.
I'd love to see more discussion of Bitmessage and Pond (https://pond.imperialviolet.org/).
Assuming you were able to avoid the "JavaScript crypto problem", would this be a good or bad idea?
No need for something as heavy as what you propose.
http://www.dhs.gov/xlibrary/assets/ns_tic.pdf
It's a national smartcard identity program that the Obama administration has been pushing for a while.
If anything, the real concern here is the implicit encouragement to use local library computers, which would be much easier for a government agency (or cybercriminal) to infect with malware and observe.
The explicit encouragement that is clearly written on the landing page is to use a personal computer (not a work computer) and a public network (e.g. a coffee shop).