I Am Releasing Ten Million Passwords

(xato.net)

594 points, m8urn, 11y ago, 216 comments

Barrett Brown was not convicted merely for linking to data on the web. He was convicted for three separate offenses:

1. Acting as a go-between for the Stratfor hacker (presumably Jeremy Hammond) and Stratfor itself, Brown misled Stratfor in order to throw it off Hammond's scent. Having intimate knowledge of a crime doesn't automatically make one liable for that crime, but it does put them in a precarious legal position if they do anything to assist the perpetrators.

2. During the execution of a search warrant, Brown helped hide a laptop. Early in the trial, in advancing the legal theory that hiding evidence is permissible so long as that evidence remains theoretically findable in the scope of the search warrant, Brown admitted to doing exactly that, and that's a crime for the same reason that it's a crime when big companies delete email after being subpoenaed.

3. Brown threatened a named FBI agent and that agent's children on Twitter and in YouTube videos.

The offense tied to Brown's "linking" was dismissed.

Brown's sentence was unjust, but it wasn't unjust because he was wrongly convicted by a trigger-happy DOJ; rather, he got an outlandish sentence because he managed to stipulate a huge dollar figure for the economic damage caused by the Stratfor hack, which he became a party to when he helped Hammond.

m8urnOP11y ago

The trafficking charges were dropped but he still was charged as an accessory after the fact. http://cryptome.org/2015/01/brown-105.pdf

tptacek11y ago

Yes; that's #1 in my list. Thanks for the link to the sentencing memo!

dmix11y ago

I never followed the case, could someone clarify how he was an accessory after the fact?

Did they explain how he misled Stratfor? Were they investigating their own breach and contacted him somehow? Or did he hide evidence?

It'd be great to have clarity on his wrongdoing related to the hacking. The parts about threats and hiding evidence seem tertiary to people's defense of him, since the major crime that he became famous for was the hacking by Anonymous.

jbapple11y ago

What were the threats against the agent and the agent's children? I'm asking because I read some of them ("ruin his life", "look into" his kids), but I'm not sure which of those are protected under the First Amendment.

Broad categories of rude speech are protected under the First Amendment, including things like, IIRC:

1. Saying if President Johnson makes you pick up a gun, he'll be the first in your rifle sight. (Watts v. United States)

2. Telling a cop "I'll kill you, you white devil" while you are in handcuffs and unable to kill him. (? v. ?)

3. Swearing "revengeance" upon the Jews. (Brandenburg v. Ohio)

jbapple11y ago

It was "White son of a bitch, I'll kill you", and it was Gooding v. Wilson.

eurleif11y ago

And as far as I can tell, it wasn't that what he said was constitutionally protected. It's that the statute he was charged under was unconstitutionally broad, because it prohibited "abusive language" in general. A more specific statute, prohibiting only threats, would have likely been ruled constitutional.

downandout11y ago

>The offense tied to Brown's "linking" was dismissed

This masks the scary reality that someone was indicted, arrested, and prosecuted for posting a link (not to mention that it was dismissed as part of a plea - not for lack of legal merit). While in this case there were other charges as well, there didn't have to be - all of the same pre-trial horrors (including possible detention without bail) could have occurred with only that charge. The fact that such a charge may eventually be dismissed/beaten at trial after your life is burnt to the ground for posting a link is little comfort.

tptacek11y ago

That's also a misleading way of framing the issue. Brown wasn't charged with "criminal linking" (an offense that does not exist). He was charged with deliberately and knowingly assisting in the breach of Stratfor, and subsequent maximization of the damage from that breach. And remember, he was convicted of doing that; they just pursued a different vector for it than the link. Keep in mind also, they didn't just work back from people who posted links. Hector Monsegur ratted Brown out.

Most criminal statutes look insane if you ignore the mens rea component and consider only the actus reus.

Probably the right way to address your comment is to acknowledge the sentiment behind it. It would be ominous if prosecutors trawled the Internet looking for the wrong kinds of links --- people RT'ing updates from Anonymous, for instance, or relaying already-public newsworthy facts from breaches --- and fit accessory liability cases around those innocuous acts. It is worth being wary about prosecutors doing that, because computer crime laws are poorly rigged and set up terrible incentive systems for prosecutors.

It's just that those concerns are not yet vindicated by the Brown case.

downandout11y ago

While "criminal linking" doesn't exist as a standalone crime, prosecutors have essentially tried to make it exist via other statutes. I don't know the disposition of the case, but a man in the UK was ordered a few years ago to be extradited to the US to stand trial for criminal copyright infringement after operating a site that offered links to copyrighted sports broadcasts [1]. In the Brown case, they tried to use the conspiracy statutes.

In both of the above examples, while not charged with "criminal linking," the actual conduct was linking to something prosecutors didn't like. The loud and clear message they are sending is "link to things we don't like, and we'll find a way to get you". That will have a chilling effect on free speech.

[1] http://www.theguardian.com/law/2012/jan/13/piracy-student-lo...

pdabbadabba11y ago

This assumes, though, that he would have been put through everything you describe even if he had only shared a link. But, as described in detail above, this was only a small piece of the government's case. I seriously doubt the government would ever have brought charges if all it had was the posting of a link.

We should also think a little bit harder, I think, about whether posting a link is never criminal. It seems to me that if someone posts a link to intentionally further a criminal conspiracy, it could plainly and unproblematically be criminal. Accomplice liability in particular makes lots of other things that would otherwise be innocent into crimes when they are done with the wrong sort of intent.

downandout11y ago

> I seriously doubt the government would ever have brought charges if all it had was the posting of a link.

If it can be included as a charge on an indictment, it can be the one and only charge in it as well.

> We should also think a little bit harder, I think, about whether posting a link is never criminal.

No, we shouldn't. Linking to and/or writing about anything (absent actual participation in a conspiracy) isn't a crime in a country protected by the right to free speech.

Slartibreakfast11y ago

I don't know, sounds like he got off pretty lightly considering he threatened an FBI agent's children. I would expect the jail time would be a lot higher, but I guess I don't know what guides the court's decisions in these kinds of cases. I suppose five is enough time for him to figure out the error of his ways.

tptacek11y ago

His sentence was dominated by the accessory charge, and the threats don't seem to have been a factor at all.

dsrguru11y ago

The threats actually accounted for 48 of the 63 months according to the EFF article that the OP linked to.

https://www.eff.org/deeplinks/2015/01/eff-statement-barrett-...

egocodedinsol11y ago

but what do you think about the big picture?

I don't know much of the specifics about Brown, but I think the wider point is worth discussing, especially with respect to the proposed change in legislation.

sarciszewski11y ago

> Barrett Brown was not convicted merely for linking to data on the web.

From the article:

     Most of us expected that those charges would be dropped and some were, although they still influenced his sentence.

I want to be generous and say that the author meant what you said. The linking was not something Brown was charged with, but it was brought up during the sentencing and probably influenced the length of his prison sentence.

So while you're correct that Brown was not charged with linking to information, it's worth noting that this was still used against him anyway.

Also, people who think the linking to hacked data was the only thing that got him arrested are being disingenuous (or are simply ignorant).

tptacek11y ago

I'm not seeing where the linking was used to enhance his accessory conviction. Is there a source for that?

higherpurpose11y ago

It's interesting that you say his sentence was "unjust" given that you always seem to defend crazy sentences as "not being the real ones anyway".

Also those three sound like incredibly weak charges, and yet you somehow defend the prosecution over them.

tptacek11y ago

Is it because I say his sentence was unjust given that me always seem to defend crazy sentences as not being the real ones anyway that you came to me?

Earlier you said I say his sentence was unjust given that me always seem to defend crazy sentences as not being the real ones anyway?

Maybe your life has something to do with this.

LeoPanthera11y ago

Fun!

    $ export LC_ALL='C'
    $ awk '{ print $2 }' 10-million-combos.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | head -n 20
    55893 123456
    20785 password
    13582 12345678
    13230 qwerty
    11696 123456789
    10938 12345
    6432 1234
    5682 111111
    4796 1234567
    4191 dragon
    3845 123123
    3734 baseball
    3664 abc123
    3655 football
    3330 monkey
    3206 letmein
    3136 shadow
    3126 master
    3050 696969
    3002 michael

Edit: I used Wordle[1] to make a wordcloud of the top 1000 passwords: http://i.imgur.com/FImcPiG.png

[1]: http://www.wordle.net
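
For anyone who would rather do this in Python, here is a rough sketch of the same pipeline. It assumes each line is whitespace-separated with the password in the second field (which is what the awk `$2` above implies); the sample rows are made up and `top_passwords` is just an illustrative name. Two small differences from the shell version: lines with fewer than two fields are skipped rather than counted as empty, and `str.lower()` folds non-ASCII letters too, so counts on the real file could differ slightly.

```python
from collections import Counter


def top_passwords(lines, n=20):
    """Return the n most common lower-cased passwords as (password, count) pairs."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip malformed lines that have no second field.
        if len(fields) >= 2:
            counts[fields[1].lower()] += 1
    return counts.most_common(n)


# Made-up sample rows standing in for 10-million-combos.txt.
sample = [
    "alice\t123456",
    "bob\tPassword",
    "carol\t123456",
    "dave\tpassword",
    "erin\tdragon",
]

for password, count in top_passwords(sample, n=3):
    print(count, password)
```

On the real file you would pass an open file object instead of `sample`, since iterating a file yields its lines.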

dvdhsu11y ago

Cool! I found the usernames interesting as well, since not many studies have been done on them. "dragon" is both a common username and password! In reply to another child post: the enormous number of "michael" passwords probably has to do with the smaller, but still large, number of "michael" usernames.

I'd run some more commands, to find out how many "michael"s use "michael" as their password, but I've got to head out now. Would be interesting -- anybody up for it?

(Ooh -- you could even juxtapose the usernames against common American names by decade [1], and probably derive some data about the ages of these users as well!)

(Furthermore -- what if we started keeping track of most common passwords by decade? That could be super interesting! I wonder if it's changed much!)

  $ export LC_ALL='C'
  $ awk '{ print $1 }' 10-million-combos.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | head -n 20
  3044 info
  2119 admin
  1323 michael
  1113 robert
  1095 2000
  1049 john
  1041 david
  967 null
  940 richard
  922 thomas
  901 chris
  866 mike
  843 steve
  832 dave
  816 daniel
  812 andrew
  797 george
  765 james
  735 mark
  730 dragon

1. http://www.ssa.gov/oact/babynames/decades/names1980s.html

jcm131711y ago

For some reason I seem to be getting different values than you. However, from what I got, there was only a single instance of a username 'michael' having a password 'michael'.

HOWEVER, of all of the people whose password is 'michael', 83 have usernames that CONTAIN the string 'michael'.

Of the set of usernames 'michael', there are 20 whose passwords contain the string 'michael'.

Of the set of usernames containing the string 'michael', there are 276 whose passwords contain the string 'michael'.

I honestly expected much more.
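
In case anyone wants to reproduce these four counts, a sketch of how they could be computed is below. It assumes the data has already been parsed into (username, password) pairs and compares case-insensitively; the function name, the dictionary keys, and the sample pairs are all made up for illustration, so the numbers it prints are not from the real dump.

```python
def name_overlap(pairs, name="michael"):
    """Count the four username/password overlaps for one name."""
    name = name.lower()
    stats = {
        "user_is_pw_is": 0,        # username == name and password == name
        "pw_is_user_contains": 0,  # password == name, username contains name
        "user_is_pw_contains": 0,  # username == name, password contains name
        "both_contain": 0,         # both username and password contain name
    }
    for user, pw in pairs:
        u, p = user.lower(), pw.lower()
        if u == name and p == name:
            stats["user_is_pw_is"] += 1
        if p == name and name in u:
            stats["pw_is_user_contains"] += 1
        if u == name and name in p:
            stats["user_is_pw_contains"] += 1
        if name in u and name in p:
            stats["both_contain"] += 1
    return stats


# Made-up pairs, not taken from the released file.
sample = [
    ("michael", "michael"),
    ("michael99", "michael"),
    ("michael", "michael1"),
    ("xmichaelx", "mymichael"),
    ("bob", "dragon"),
]
print(name_overlap(sample))
```

Note that the categories overlap on purpose (an exact match also counts as "contains"), which matches how the figures above nest inside each other.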

userbinator11y ago

In other words, supposing that this data is representative of most peoples' password practices, just trying these 20 passwords gives you a ~18% success rate for any username.

And... dragon. That's an unusual password to make the top-10 list. I think this might be a somewhat skewed sampling.

yoha11y ago

You forgot a zero:

   >>> (55893+20785+13582+13230+11696+10938+6432+5682+4796+4191+3845+3734+3664+3655+3330+3206+3136+3126+3050+3002) / 1e7
   0.0180973

That is, 1.8%. This is confirmed by http://maxmcd.com/passwords.html.

pavel_lishin11y ago

> supposing that this data is representative of most peoples' password practices

That might not be the case; not all passwords are created equal.

As an example, my password to some goofy online game that requires registration is nowhere near as strong as the password required to log into my work email account - for some things, I prioritize being able to type a password in quickly on a mobile device over the danger of someone breaking in and playing a low-scoring word in online scrabble.

crisnoble11y ago

It makes no sense to me, but I do recall a middle school phase where I used either "dragon" or "drag0n" for my passwords. I didn't particularly even like dragons and I don't recall ever hearing others use it, so it really catches me by surprise. Whenever I see it in a top passwords list I am filled with memories of after school library trips.

MarkMc11y ago

For sensitive sites, my preferred solution to this problem is to add a sequence of random characters to the User ID field. The user would then authenticate with something like this:

  User ID: John-CPE4E38J
  Password: snoopy

For extra security the code would then move the random characters to the password so the authentication library would see this:

  User ID: John
  Password: snoopy-CPE4E38J

In this way even an attacker who gains full access to the server database would be unable to read the passwords (assuming they have been hashed well).

Also, the User ID can be stored in a cookie so that the User ID field on screen is pre-populated and the user only has to type "John-CPE4E38J" when he switches to a new computer.

More details here: http://security.stackexchange.com/questions/80352/is-it-a-ba...
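
To make the transformation concrete, here is a minimal sketch of the step described above, where the random characters are moved from the User ID to the password before the authentication library sees them. The function names, the "-" separator handling, and the 8-character uppercase-plus-digits suffix are assumptions read off the "John-CPE4E38J" example, not a reference implementation.

```python
import secrets
import string

SUFFIX_ALPHABET = string.ascii_uppercase + string.digits


def generate_suffix(length=8):
    """Pick a random suffix such as 'CPE4E38J' (format assumed from the example)."""
    return "".join(secrets.choice(SUFFIX_ALPHABET) for _ in range(length))


def move_suffix(login_id, password, sep="-"):
    """Turn ('John-CPE4E38J', 'snoopy') into ('John', 'snoopy-CPE4E38J')."""
    name, found, suffix = login_id.rpartition(sep)
    if not found or not name or not suffix:
        raise ValueError("login id must look like NAME-SUFFIX")
    return name, f"{password}{sep}{suffix}"


user_id, effective_password = move_suffix("John-" + generate_suffix(), "snoopy")
print(user_id, effective_password)
```

The effective password would then go through whatever hashing the site already uses; the sketch only covers the string shuffling.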

IgorPartola11y ago

This is a horrible practice. You are trying to implement two factor auth, but with a static second factor that will not be considered private by most users. It is a huge burden on them to remember, and is providing you with dubious security at best, and actually providing a vector of attack at worst. Please don't do this.

handsomeransoms11y ago

Are you generating the User ID with the additional characters and expecting the user to remember/keep track of it? I don't think that is very user-friendly, even with the cookie trick you describe.

It seems like you are trying to force your user to remember a salt. Why not just use a proper salt and a strong password hashing function?

Also note that this protection is only useful in the case where an attacker can get a database dump but cannot perform an active attack on the server.

On the other hand, I have seen some sites (gandi.net comes to mind) do something similar to this. Wonder if they have a similar security reasoning?
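
For reference, "a proper salt and a strong password hashing function" could look roughly like the sketch below, using scrypt from Python's standard `hashlib`. The cost parameters (n=2**14, r=8, p=1), the 16-byte salt, and the helper names are illustrative assumptions, not tuned recommendations for any particular site.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("snoopy")
print(verify_password("snoopy", salt, stored))
print(verify_password("dragon", salt, stored))
```

The salt is stored next to the digest on the server, so the user never has to see or remember it.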

thaumaturgy11y ago

From a user experience standpoint, this is a bit of a nuisance. Users are already having real difficulty remembering all of their different usernames and passwords for different things. A password manager is still an alien thing to a lot of people. A lot of people still have a little text file somewhere, or they rely on messages stored deep in their mailboxes somewhere, or they have a little piece of paper they try desperately not to lose...

You're right that their browser auto-complete will usually take care of it, but once it doesn't (because they switched browsers, because they got a new computer, because it got infected with malware and they took it to a wipe-and-reinstall shop), I'd expect a significant number of your users to fall back to just doing a password reset, which is a hassle.

From a security standpoint, I'm not sure what problem you're trying to solve. I get that you want to strengthen your users' passwords, but what is the specific scenario you're imagining where this is the best prevention? If you're concerned about someone brute-forcing user accounts from the outside, just make sure you have some sane throttling code. If you're concerned about someone stealing your database and breaking user passwords, just make sure you're using a robust password storage mechanism (blah blah bcrypt scrypt etc. etc.) and the usual other internet-facing application best practices (parameterized queries for example). If you're still feeling paranoid about that situation, then probably your server code could add some value to each password without doing any harm, I dunno. If someone gets sufficient access to your server to get your database and your code, game's over anyway. If you're concerned about your user having their credentials compromised elsewhere and that being used to access their account, do the same thing that many banks, Linode, and other services do: maintain IP white, grey, and black lists, and send a challenge/response to the user by text or email if the IP is on a grey list (in addition to checking for their login cookie first).

Your approach is different, but I don't understand it yet. :-)

chias11y ago

Is this materially different from requiring the user to have some random characters in the password, but for some reason making them type these characters into the username field where it'll be cached by the browser's autocomplete feature?

It seems like this is an amusing enough hack to do on non-sensitive sites, but I wouldn't do this on anything "real". When it comes to authentication, "hey I had this really neat idea" is almost always an immediate precursor to making things worse.

pgwhalen11y ago

It makes equally little sense to me, but "dragon" is routinely high on top password lists.

burkaman11y ago

I think it's probably just a common thought process. I'll pick an animal -> dragons are the coolest animal -> nobody will ever guess dragon, this is way better than using my dog's name.

Have you ever seen those online riddle things that say pick a color, pick a tool, wow I bet you picked a red hammer! We all grow in relatively similar societies, we all have relatively similar ways of thinking.

jessaustin11y ago

That many people have noted the "dragon" phenomenon as strange, but we don't yet have an explanation, is perhaps stranger yet. In early days, one could have hypothesized that some basic "how to use passwords" resource had offered "dragon" as an example of a password, but after two decades of internet it seems unlikely that something like that could have had such a large effect.

CamperBob211y ago

So is "jesus", and that doesn't seem to be true here. I find this list highly dubious, compared to others I've seen (and, long ago, obtained myself.)

m8urnOP11y ago

And it has been for 20 years

WillNotDownvote11y ago

Computers are magic. Dragons are magic. QED.

I'm actually kinda serious.

Also, humans are monkeys. Ergo, "monkey" is popular.

maxmcd11y ago

https://github.com/maxmcd/pwd-guess

libria11y ago

I'm surprised (disappointed?) only 1 person used "correcthorsebatterystaple".

pthreads11y ago

That is terrible, he/she used the same phrase as in the example!

300bps11y ago

Don't read too much into this. My main email account is in the original list that was posted in October of 2014. My account that is listed is myname@gmail.com. The password though is not the password to myname@gmail.com but rather to my "junk" site password.

For almost any site I have an account, I use a strong, unique password. For sites that I don't care about at all AND that I suspect have security problems I use a standard common insecure password. It is that common insecure password that is paired with my gmail account.

vacri11y ago

Looks like if you know someone called Michael, chances are that you need to talk to him and his loved ones about password hygiene...

WalterBright11y ago

My name isn't Michael, but I use the password 'michael' all the time.

Edit: oh, crud

num11y ago

  10938 12345

That's the same combination I have on my luggage!

benbristow11y ago

Heh. I use 'password' for when I'm purposely trying to make things unsecure (Like being nice and sharing my unlimited data via my phone's wifi hotspot on public transport).

stinos11y ago

So this dataset seems to be limited to English-speaking, QWERTY-using users, i.e. US only, I guess?

lurkinggrue11y ago

Cool! My password hunter2 wasn't at the top of the list!

cfrs11y ago

here is the top 48K for lazy ones http://ix.io/ggh

meowface11y ago

I don't understand exactly why it's necessary to release usernames along with the passwords, or why it's ethical to do so. Stripping the domain portion of email addresses does absolutely nothing when you can find the real email, and other accounts of the victim, by Googling the unique part of the email address.

How does tying each password to its corresponding username help with password research, and does the value gained outweigh the cost of someone using this list for malicious purposes?

I'm not saying this should be illegal, but I'm struggling to understand the intent here.

a3_nm11y ago

What about research to determine to what extent usernames with words in a certain language will tend to use passwords with words for the same language? (More generally, is there any connection between the bi- or trigram distribution on usernames and the one on passwords? In fact, do they just look the same, or could you tell given a string whether it's more likely a username or a password?)

Do usernames of people with weaker passwords have something in common? How do they differ from people with stronger passwords? In France there is a practice of picking names like "foobar42" or "foobardu42", where "foobar" is a first name and 42 a "département" (country subdivision) number, which I would associate to casual users. Here I could quantify whether people with usernames of this form tend to pick weaker passwords. Insert your favorite prejudice here about lame and skilled username patterns, and quantify how the password diversity of this group fares in comparison with others.

Is it true that the most common passwords were associated to usernames that were also common? Does username frequency correlate with password frequency? Are there more people with unique usernames or people with unique passwords?

In some countries it is customary to annotate usernames with the user's year of birth. Filtering on such usernames could give insight about the correlation between age and password quality, or identify which passwords are more or less popular given the user age. You could try to check correctness of the filter using the fact that some of those people may have used their birthdate (including the year) as a password.

If a seemingly rare password in the dataset only occurs for two distinct user names, then maybe those two user names actually correspond to the same user. Do such usernames have a low edit distance? Could you use this to learn general rules to determine, given two usernames, whether they seem to correspond to the same person?

I just gave those off the top of my head, and I'm not at all working in this field, but I'd have no trouble imagining interesting applications for this data that would not have been possible with the passwords alone.
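
As a rough illustration of the "same rare password, two usernames" idea, the sketch below groups usernames by password and reports pairs whose Levenshtein edit distance is small. The threshold of 2, the function names, and the sample pairs are arbitrary assumptions chosen for the example; real data would need something more careful.

```python
from collections import defaultdict


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def likely_same_user(pairs, max_distance=2):
    """Find passwords used by exactly two usernames that are nearly identical."""
    users_by_password = defaultdict(set)
    for user, pw in pairs:
        users_by_password[pw].add(user)
    hits = []
    for users in users_by_password.values():
        if len(users) == 2:
            u1, u2 = sorted(users)
            distance = levenshtein(u1, u2)
            if distance <= max_distance:
                hits.append((u1, u2, distance))
    return hits


# Made-up pairs for illustration.
sample = [
    ("jdoe42", "x9!kQ"), ("jdoe_42", "x9!kQ"),
    ("alice", "123456"), ("bob", "123456"), ("carol", "123456"),
]
print(likely_same_user(sample))
```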

meowface11y ago

I feel like most of those research questions could be answered if it was a "username -> password strength" mapping, in addition to a hash to study duplicate trends, rather than just "username -> password". Obviously there is no objective ranking of "password strength", but a decent approximation could be provided.

There are serious risks to having your username and password in a public list. Yes, all of these usernames and passwords were already technically publicly released, but to a lazy and ignorant script kiddie, finding or even being aware of those lists can be outside their grasp.

By aggregating everything into one list, you 1) increase the search engine visibility for all credentials, which means someone Googling the username of, say, an Internet commenter who pissed them off may find a plaintext password they could use to impact the person's life with much higher probability (I work in information security and have seen that happen on many occasions), 2) encourage script kiddies and fraudsters to spend time working through the list to find working accounts that other criminals have missed in the past decade, and 3) undo any work that paste sites like Pastebin and file sharing sites like Mediafire have done to remove copies of the database dumps. 1) may not apply if it strictly remains a torrent, but it'll probably be floating around public paste sites within a few days, which would likely mean search engine visibility for every username on it.

If even 0.01% of the users on this list have accounts compromised due to its release, then I don't think that cost justifies the research benefits relative to a more redacted version of the list.
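
To sketch what such a redacted release might look like: each row keeps the username, a crude strength estimate, and a short keyed tag that only reveals whether two rows share a password. Everything here is an assumption made for illustration, including the length-times-log2(pool) strength heuristic, the choice of HMAC-SHA256 with a throwaway key (a plain unkeyed hash of a weak password would be trivial to reverse), and the 16-hex-character truncation.

```python
import hashlib
import hmac
import math
import os
import string


def rough_strength_bits(password):
    """Very crude estimate: length * log2(size of the character pools in use)."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c not in string.ascii_letters + string.digits for c in password):
        pool += 33  # roughly the printable ASCII symbols
    return round(len(password) * math.log2(pool), 1) if pool else 0.0


def redact(pairs, key):
    """Map (username, password) to (username, strength estimate, duplicate tag)."""
    rows = []
    for user, pw in pairs:
        tag = hmac.new(key, pw.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        rows.append((user, rough_strength_bits(pw), tag))
    return rows


key = os.urandom(32)  # throwaway key, discarded once the release is built
sample = [("alice", "123456"), ("bob", "123456"), ("carol", "Tr0ub4dor&3")]
for row in redact(sample, key):
    print(row)
```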

belorn11y ago

> I feel like most of those research questions could be answered if

If the person who releases this kind of information has the foresight to know what the questions are going to be, they could provide the answers directly rather than go half-way and modify the data. It would likely be less work than trying to produce anonymized data that is both useful and secure.

What I see used in cases like this is one of two options. Either full public access, or restricted access where only a few selected get the chance to do the research. The 0.01% misuse is thus balanced to that choice, rather than the theoretical case of anonymized data.

m8urnOP11y ago

As I explained in the article I seriously doubt that any more than a tiny number of these passwords are still valid. And there is no reason for them to be, having already been widely available, indexed (and cached) by every search engine, archived at archive.org, and downloaded by thousands or tens of thousands of people. Anyone who would use this data maliciously probably already has it.

Much of this data is the same data monitored by sites like haveibeenpwned.com and a dozen others. Facebook scrapes these. Lastpass will send you alerts. The risk here is minimal; the research value is much more than you realize.

pbreit11y ago

All possibly interesting questions (certainly not to me) but I fail to see how they would lead to any genuine advancements in authentication.

jMyles11y ago

A list of 10 million passwords alone answers almost no questions. In fact, it's probably possible to programmatically predict, with a depressing level of accuracy, what a great deal of such a list will look like, given the already available research about the distribution of complexity, the parts of speech and numbers commonly used and in what patterns, etc.

So, the next interesting question is: given the already plaintext-available lists of usernames and passwords, just how much coverage is there in the known space? Are your passwords known? Are your users' and clients' passwords known?

This document is perfect for a true positive on the matter of needing to deprecate particular combinations of username and password, and, as an obvious corollary, presenting evidence for consultation advice about the same. (Of course, being only a sample, it doesn't say anything about a true negative.)

yalogin11y ago

Before I go into the research aspect of it, there is no reason to hide the usernames from the passwords. They are already out there. The bad guys have them. So why not release them so that every one can look at them?

Also I am sure there are some research aspects to the usernames. At the very least behavioral deductions that can be drawn based on these combinations.

detaro11y ago

Probably to find out how many people do stuff like type their username backwards as a password, or what kind of patterns they use. Whether that is useful enough information to warrant publishing data like this is debatable, yes.
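
A check like that only works with both columns present. As a small sketch (the three patterns and their labels are just examples picked for illustration, not an established taxonomy), one could classify each pair like this:

```python
from collections import Counter


def classify(user, pw):
    """Label how a password relates to its username (case-insensitive)."""
    u, p = user.lower(), pw.lower()
    if not u:
        return "other"
    if p == u:
        return "same"
    if p == u[::-1]:
        return "reversed"
    if p.startswith(u) and p[len(u):].isdigit():
        return "username+digits"
    return "other"


# Made-up pairs for illustration.
sample = [
    ("dragon", "nogard"),
    ("Mike", "mike"),
    ("mike", "mike1978"),
    ("mike", "letmein"),
]
print(Counter(classify(u, p) for u, p in sample))
```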

exogen11y ago

Also interesting, how features of a username might correlate with password strength. Who do you think uses a stronger password, someone with the username "carguy551978" or someone with the username "w1ntermute"?

swatow11y ago

carguy followed by the 24'th n such that 1 + n + n^13 is prime, followed by the 34'th such n? I would expect a very, very strong password from someone who picks their username like that.

(see https://oeis.org/search?q=__%2C+551%2C+__%2C+978&sort=&langu...)

diminoten11y ago

I dunno if he should have said "released", because he's not releasing any new data. Everything he's posted is already available to anyone with a search engine and a bit of curiosity.

So if you're concerned that information which wasn't previously public is now public, you can be at ease -- all of this data was not only public already, but less "cleaned up".

hasenj11y ago

I'm curious to see if any of my accounts/passwords have been compromised.

chrisan11y ago

Wouldn't be surprised if one of these sites already has it

https://breachalarm.com/ https://haveibeenpwned.com/

The author does not seem like the type of person who did the hacking himself to obtain these, but rather curated leaks into his database

hasenj11y ago

Exactly why I'm curious. haveibeenpwned listed a username I often use as being pwned in a "battlefield heroes" leak, but I couldn't find the "release" for it.

presumeaway11y ago

> I'm struggling to understand the intent here.

A desire for a particular type of attention his ego seems to need.

Which, combined with either a moronic lack of appreciation for the hassle and damage he's going to cause to end-users who've already been hosed once before, or an arrogance that makes him not care, makes him difficult to fit for a white hat.

FTA:

> This is completely absurd that I have to write an entire article justifying the release of this data out of fear of prosecution

What's absurd is his assumption that stripping domain names is somehow sufficient.

Edit: I'm getting downvoted like crazy here. Which is fine, but people seem to think it's ad hominem because I'm narrowing the reasons behind why someone would release a data set with a considerable price of collateral damage attached to it, while doing very little to mitigate that damage.

Just because the likely options for why someone would do such a thing don't speak favorably of the person, doesn't make it ad hominem. An ad hominem attack is seeking to undermine someone's argument by attacking their character.

I'm saying Mark Burnett made it difficult to assume good things about him after a stunt like that. If he actually made a real argument that what he did was sufficient, or that the harm he's going to cause is more than offset by the greater good it'll do (or some such argument), then we'd have something to try to undermine (whether legitimately or fallaciously), but as it stands, he hasn't even justified his actions.

totony11y ago

>Ad hominem + ad hominem

Research requires data. If I want to do research on how best to implement my bank system, I would like to know what passwords are more likely to be contained in a dictionary attack. Usernames may have a high correlation with passwords and thus are useful. Considering all of these passwords can be obtained from obscure forums/websites and that the website where the IDs are used are not specified, I don't see why he could not release it to the public for researchers to use.

presumeaway11y ago

> Research requires data.

There's a lot of research that could be performed if we were willing to generate data without due regard for the inherent downsides.

Saying research requires data is just insufficient justification in this case.

> I don't see why he could not release it to the public for researchers to use.

Because the collateral damage doesn't justify it. That aspect of it seems to be little more than a side note to him.

He could quietly and securely give the data to established researchers.

Or, he could very publicly release a torrent for everyone's use, with almost no concern for how it'll be used.

There's a massive difference there and the likely potential reasons behind his decision to do the latter leave very little room for one to make favorable judgements about either his motives, or his ability to responsibly mitigate risk.

I'm sorry if you believe any of that to be ad hominem, but it just isn't.

> Usernames may have a high correlation with passwords and thus are useful.

And that's precisely why the likelihood of collateral damage stemming directly from his actions is much higher than it should reasonably be in this instance.

At some point what you're giving up to further research isn't worth the tradeoff. He's selling innocent bystanders up the river to further his own cause, with little evidence that he's done everything possible to limit collateral damage.

I don't understand why this line of thinking is a hard sell here.

When a government or corporation releases lightly-redacted, personally-identifying information about people, the outcry is (rightly) massive. White knight does it and, well, to question his motives is ad hominem?

Really?

zaroth11y ago

There is an annual 'Passwords' conference [1], which I attended in 2012, and was blown away by quite how much researchers are able to do with these password lists.

Unfortunately, I was equally impressed with what attackers are able to do with them as well. An important point is that attackers tend to have better lists, because they are the ones stealing and cracking them, and these lists make them increasingly better at cracking passwords. Defenders use the lists for all sorts of analysis on how exactly users pick passwords.

For example, "complex password policies" have become increasingly popular. But do they actually increase the entropy of the chosen passwords? Surprisingly little, since users will "defeat" the policy by applying easy to guess "munging rules". Humans being human and such. The thieves have the lists, and learn to apply the munging rules and defeat the policies. Researchers need these lists so they can discover the same weakness and try to react.
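To make the "munging rules" point concrete, here is a minimal Python sketch. The specific rules (capitalize, a few character swaps, a handful of suffixes) are assumptions chosen for illustration and do not come from any real cracking tool. The point is only that a policy demanding an uppercase letter, a digit, and a symbol can be satisfied by a very small set of predictable transformations of one base word.

```python
def munged_variants(base):
    """Return a few predictable variants of a base word (illustrative rules only)."""
    # Hypothetical substitution table; real rule sets are much larger.
    leet = str.maketrans({"a": "@", "e": "3", "o": "0", "s": "$"})
    stems = {
        base,
        base.capitalize(),
        base.translate(leet),
        base.capitalize().translate(leet),
    }
    suffixes = ["", "1", "!", "123", "1!"]
    return sorted(stem + suffix for stem in stems for suffix in suffixes)


variants = munged_variants("password")
print(len(variants), "candidates, for example:", variants[:5])
```

If a policy-compliant password is one of a couple of dozen variants of a dictionary word, the policy has added only a few bits of guessing work on top of the word itself.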

More recent research looks at things like how effective the password strength indicators are at actually helping users choose stronger passwords. We also learn about how users choose different strength passwords based on the sites they visit and such. This is absolutely fertile ground for research which can improve how we perform authentication.

Yet another good use of the lists is in defending against online attacks. E.g. Failed attempts that follow the general probability distribution of the lists are easier to identify as bots.

[1] - I think all the talks are posted, although I'm not sure there's a central archive, each conference is identified as Passwords^[Year], e.g. Passwords^14 https://passwordscon.org/
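The online-defense idea mentioned above (failed attempts that track the leaked-list distribution look like bots) could be sketched roughly as follows. This is an assumption-laden illustration: the function name, the tiny common-password set, and the idea of scoring by a simple ratio are all hypothetical, and a real system would combine this with rate limits and other signals.

```python
def common_guess_ratio(failed_attempts, common_passwords):
    """Fraction of failed login guesses that appear in a known common-password set."""
    if not failed_attempts:
        return 0.0
    hits = sum(1 for guess in failed_attempts if guess.lower() in common_passwords)
    return hits / len(failed_attempts)


# Hypothetical data: a few entries from the top of a leaked list.
common = {"123456", "password", "qwerty", "dragon"}
bot_like = ["123456", "password", "qwerty", "dragon"]
human_like = ["Tr0ub4dor", "tr0ub4dor&3", "Tr0ub4dor&3"]

print(common_guess_ratio(bot_like, common), common_guess_ratio(human_like, common))
```

A human mistyping their own password tends to produce near-duplicates of one string, while a list-driven attack walks down the popularity ranking, which is what the ratio is meant to pick up.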

meric11y ago

These lists were released by attackers in the first place. Attackers are always going to have the lists, and the only choice defenders can take is whether to use and distribute to the defender community, or not.

pbreit11y ago

I'd be curious what researchers were able to do with such a list (genuine, practical advances). It doesn't strike me as particularly useful.

zaroth11y ago

Not a bad place to start: http://passwords14.item.ntnu.no/program.php

dj-wonk11y ago

Forgive me for doing so, but allow me to ask some possibly ignorant questions and perhaps play the devil's advocate for a moment. What about this release will help? What are the compelling research problems in the space?

We know users pick bad passwords. It seems to me the most compelling "problem" is hardly a research question -- isn't it about finding ways to encourage users to pick strong passwords, not share them between sites, and not put them on sticky notes on their monitors?

Ok, putting my charitable hat on again... My best guess is that researchers would like some idea of how long it takes to crack some percentage of accounts, e.g. with rainbow tables or other techniques?

The author mentioned "Analysis of usernames with passwords is an area that has been greatly neglected and can provide as much insight as studying passwords alone." What directions might a researcher take this?

m8urnOP11y ago

The main reason I have always included usernames and passwords in my research is because it allows me to analyze frequency data across multiple sites. Although I could have anonymized the usernames, I thought it would be best to keep them in. There is good value there. For example, there is quite a bit of overlap between usernames and passwords. Also, how many users include all or part of their usernames in their passwords. Plus, what usernames might hackers be most likely to try out?

The main goal here is to put the data out there and let other researchers find the value in it.

pbreit11y ago

So how would you utilize such knowledge in the real world?

tfinniga11y ago

You could use it to create a password strength meter for your website, and enforce a certain strength.

Let's say it is common to include a subset of the username in passwords. Doing so would decrease the password strength and be disallowed.

Also, you could look at certain usernames and compute likelihood of certain dictionary words, and disallow them. For example, a user named Bob might be unlikely to use spanish words in a password, but a user named Jose might be more likely.

Being aware of methods/info used by crackers when designing secure systems will lead to stronger systems.
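As a rough illustration of the "disallow a subset of the username" rule described above, here is a minimal Python sketch. The function name and the four-character minimum are assumptions for the example, not part of any existing strength meter.

```python
def shares_username_fragment(username, password, min_len=4):
    """True if any min_len-character slice of the username occurs in the password."""
    user, pw = username.lower(), password.lower()
    if len(user) < min_len:
        # Short usernames: only flag the password if it contains the whole name.
        return user != "" and user in pw
    return any(user[i:i + min_len] in pw for i in range(len(user) - min_len + 1))


print(shares_username_fragment("michael", "Mich4el!"))      # fragment "mich" is reused
print(shares_username_fragment("michael", "correcthorse"))  # no shared fragment
```

A signup form could call a check like this and ask the user to pick something else, or simply subtract from the strength score. The threshold is a design choice: too short and it rejects coincidental overlaps, too long and it misses partial reuse.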

hyperion201011y ago

The main issue is that attackers already have this data. They have a giant head start when guessing passwords because just by looking at the username they can vastly reduce the search space. Whitehats and the public need to know how blackhats are reducing that search space. By making good faith publication and research on passwords risky (legally unattractive) we actively weaken security. I find it amusing that people find sharing password/username pairs questionable yet we don't seem to hold companies accountable when they lose millions of the things at once. Talk about a double standard. (RE: companies have lawyers and the little guy can get fucked for all anyone cares)

stevecalifornia11y ago

When I first got on the Internet in 1994 I used the same password for everything for the next decade before I became security conscious (now I have a random, strong, unique password for every service).

Anyways, that password is not in this list. I have found it in other password dumps before. So, I don't know what to think.

nostromo11y ago

This isn't a comprehensive list of all leaked passwords. It's a random subset of 10 million for research purposes.

6t6t611y ago

I don't think it is necessary to have a separate password for every single system; three or four tiers of passwords are enough.

And just keep in mind that there's one password to "rule them all". That is the password for the primary mail account. I use 2-factor authentication for that.

scintill7611y ago

> three or four tiers of passwords

Can you elaborate? My first thought is tiered by category of the service. No, I don't want my financial institutions to all have the same password, even if it's from the most secure tier.

pdenya11y ago

Sites require you to sign up but it won't matter much if someone gains access to your account on them. Those might as well share a password. Same with sites that share trust buckets like [goodreads, yelp], [facebook, twitter] etc.

In the real world though just memorize separate bank and email passes and use a password manager w/generated passwords for everything else.

Buge11y ago

This is 10 million out of 1 billion that he has.

So there is only a 1% chance of a leaked account getting in this list.
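The arithmetic behind that 1% can be spelled out in a few lines of Python. This is a sketch under two assumptions: that the release is a uniform random sample of the full collection, and that a person's leaked accounts land in the sample independently of one another. The function name is made up for the example.

```python
sample_size = 10_000_000     # rows in the released file
pool_size = 1_000_000_000    # approximate size of the full collection, per the comment above

p_one = sample_size / pool_size  # chance that one given leaked account is in the file


def p_at_least_one(k, p=p_one):
    """Chance that at least one of k leaked accounts shows up in the sample."""
    return 1 - (1 - p) ** k


print(f"{p_one:.2%} for one account, {p_at_least_one(10):.1%} for ten")
```

So not finding a password in the file says very little about whether it has leaked, which matches the experience described a few comments up.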

totony11y ago

From the law quoted in the article, wouldn't it be illegal to simply make a course about computer security?

The teacher willfully (and knowingly) teaches the student about "possible means of access to a protected computer."

Note: According to http://www.law.cornell.edu/uscode/text/18/1029 teaching is defined as trafficking information ("the term “traffic” means transfer, or otherwise dispose of, to another, or obtain control of with intent to transfer or dispose of; ")

avid811y ago

Even if this release has no implications for security, I think it may raise legitimate concerns for users' privacy. No doubt most users expect that their passwords will be known only to themselves. Many of the usernames contain real names, and many more could probably be traced to them. Ian Watkins was found to have "gloated" about his crimes in his password. With time and attention, I wonder whether such "dark secrets" could be found in this list.

_0vzd11y ago

Went ahead and performed a Levenshtein distance analysis from this list, and made a graph of it. Number 8 seems to be the sweet 'secure' spot that most people latch onto, though the distribution curve is interesting - or very human-like: http://pp19dd.com/2015/02/levenshtein-distance-10-million-us...
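For readers who have not met it, Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The comment does not say exactly which pairs were compared; the sketch below assumes the natural reading, username against password, and uses the standard dynamic-programming formulation in plain Python.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free when characters match
            ))
        previous = current
    return previous[-1]


print(levenshtein("michael", "michael1"))
```

Applied to every row, a histogram of these distances would show how far passwords typically sit from their usernames. A distance of 0 means the two are identical.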

uptown11y ago

How are things like Twitter accounts hacked? Are they generally brute-forced with a list like this, or how do so many of them get compromised?

charlespwd11y ago

For the lazy:

  grep -i <password> 10-million-combos.txt

me_bx11y ago

for the paranoïd lazy

    export HISTCONTROL=ignorespace
     grep -i <password> 10-million-combos.txt

(type a space before the command for it not to be logged in the history)

pavel_lishin11y ago

Just because you're paranoid doesn't mean the eyes above your ï aren't watching. Although, it might mean you're delusional.

flavor811y ago

And then history -c

vacri11y ago

... which will clear your entire history, which you probably don't want.

I don't know a shorter way, but to delete one line from history, do 'history', which shows the line numbers, then 'history -d LINE_NUM'.

Or, in bash, prepend the command with a space and it won't go into history.

akerl_11y ago

Open new terminal -> unset HISTFILE -> do your grepping -> close terminal

fletchowns11y ago

Unless somebody did a ps while your grep was running...

Don't put sensitive stuff in CLI args!

loqi11y ago

Good point, how about

    grep -f - 10-million-combos.txt
    <password>
    ^D^D
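Another way to keep the password out of both the shell history and the process list is to read it from a prompt inside a small script. The following is a minimal Python sketch, not a vetted tool: the function names are made up, the file name is the one used in this thread, and the case-insensitive match only folds ASCII letters because the file is read as raw bytes.

```python
import getpass


def count_matches(path, needle):
    """Count rows whose password column contains needle (ASCII case-insensitive)."""
    needle = needle.lower().encode("utf-8")
    count = 0
    with open(path, "rb") as handle:  # bytes, since the file's encoding is uncertain
        for line in handle:
            # Everything after the first tab is treated as the password column.
            _, _, password = line.rstrip(b"\r\n").partition(b"\t")
            if needle in password.lower():
                count += 1
    return count


def main():
    # getpass does not echo the input, and the value never appears in argv.
    secret = getpass.getpass("Password: ")
    print(count_matches("10-million-combos.txt", secret))
```

Calling `main()` from a script prompts for the password and prints the number of matching rows. Unlike the plain grep, this looks only at the password column.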

brianshaler11y ago

Depending on your system and configuration, couldn't you prepend a space to the command to prevent it from being saved into your history?

edit: Looks like vacri mentioned this in a peer comment an hour ago. Whoops!

nmjohn11y ago

That works if you are using bash, but if you are, for example, using zsh, you would first have to run "setopt histignorespace" which would enable hiding lines prepended with a space in the history (it's off by default).

tarblog11y ago

For the lazier, -i means case insensitive.

failed_ideas11y ago

This is great, but if you use a password manager, it's very difficult to determine which, if any, of your accounts would be compromised. For myself, this would just be doing a dump and looping a few greps. But for family and friends, does anyone have any ideas for a less technical audience?

jpatokal11y ago

If you're using a password manager and thus -- I hope -- using a different password for every service, it doesn't really matter if one service gets compromised. The compromised service in question will (hopefully) force password resets for all affected users, and the compromised password is useless elsewhere.

saraid21611y ago

Instead of responding to breaches, I would recommend an annual (more frequent is better, obviously, but I think annual is fine) cycle of rotating passwords. Just pick a day and spend it replacing passwords. As a side effect, you get a mental update on exactly what identities you're managing and whether or not you want to modify or close them.

This should be fairly straightforward even for non-technical people, if they've got a grasp on actually using the password manager itself. The hard part is (1) getting the list of identities, which isn't too hard if you're hand-holding, and (2) actually remembering to do it. (Which is why annual is nice. You can peg it to a holiday you already celebrate, or substitute it for one you don't. Halloween, for instance, because breaches are scary? Or something.)

Bonus: if a breach happens that actually feels scary, just do the rotation ritual ahead of time. Not that big of a deal.

querulous11y ago

1password has a limited ability to warn you of compromised passwords. they maintain a database of breaches that they warn you about in their client. the warning, however, is much less prominent than it probably should be

yeukhon11y ago

http://security.stackexchange.com/questions/46625/is-it-lega...

I thought of exactly the same thing. I was motivated by the password strength meters out there. How can you actually tell whether a password is strong, or whether it is already known to attackers? Ideally you could ask privately (I was thinking along the lines of private information retrieval) and get a probability rather than a yes/no, based on all the known stolen credentials out on the Internet (there are many multi-GB files you can download)...

jammycakes11y ago

Just a thought here. As far as I can tell, many bona fide security researchers seem to be independent consultants. Would they be less at risk of prosecution if they were handling sensitive data such as user names and passwords under the coverage of universities and/or similar accredited institutions operating under protocols as to who can and cannot access the data?

It would probably be more security theatre than actual security, but I'd imagine that it would at least keep the FBI happy.

hueving11y ago

I wish there was an origin with these. A username/password combo I use on a ton of sites I don't care about is on here. It would be nice to know which one leaked it.

srcole11y ago

What sorts of analyses are you guys planning? Maybe:

- clustering of passwords: are aspects of the username biased towards certain clusters?
- distribution of alphanumeric characters at each position of a password (e.g. 1 is a disproportionately common final character)
- differences in password strength between usernames with male and female names
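The per-position character distribution suggested above is straightforward to compute. Here is a minimal Python sketch; the function names and the four-password sample are invented for illustration, and final characters are counted separately because passwords differ in length.

```python
from collections import Counter, defaultdict


def position_counts(passwords):
    """Map each character position to a Counter of the characters seen there."""
    counts = defaultdict(Counter)
    for password in passwords:
        for index, char in enumerate(password):
            counts[index][char] += 1
    return counts


def final_char_counts(passwords):
    """Counter of last characters, skipping empty passwords."""
    return Counter(p[-1] for p in passwords if p)


sample = ["dragon1", "monkey1", "letmein", "abc123"]  # made-up sample rows
print(final_char_counts(sample).most_common(1))
```

Run over the full password column, `final_char_counts` would directly test the guess that "1" is a disproportionately common final character.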

camhenlin11y ago

Man, I hope my password isn't in there.

cwarrior11y ago

What's your password? I could check the file to see if it's there. I found one of mine. Does anybody know where these passwords are from?

stephentmcm11y ago

hunter2

nadaviv11y ago

For those not familiar: http://bash.org/?244321

Libbum11y ago

Oh, the memories!

m8urnOP11y ago

Public dumps mostly from the last 5 years, but some as old as ten years

totony11y ago

>What's your password?

m8urnOP11y ago

Actually three of my own passwords are on there, I left them in

untog11y ago

Read the actual article. None of this data is new:

All data currently is or was at one time generally available to anyone and discoverable via search engines in a plaintext

aceperry11y ago

My thoughts exactly. I'm amazed that I can download the file, but at least I get to see if any of my passwords are there.

chisleu11y ago

I'm not!

#successkid

tomkinstinch11y ago

To save a moment of time, here's a quick check that won't save the password string to your command history:

read -e -s -p "Password: " password && grep -i -- "$password" 10-million-combos.txt | wc -l && password=""

gayprogrammer11y ago

Woah you are REALLY optimistic about law enforcement agencies wanting to focus on real criminals.

But Barrett Brown is not the first or only example.

Aaron Swartz is the only example I need to understand what to expect from the various US law enforcement agencies.

ryanlol11y ago

Barrett Brown intentionally did everything in his power (including, but not limited to publicly threatening named FBI agents and their families) to get targeted by LE, and succeeded.

Swartz? Swartz knowingly did several obviously illegal things (breaking-and-entering?) and then acted shocked when he got charged.

His actions may have been morally defensible, but not legally. Law enforcement did their job there.

nyolfen11y ago

it's kind of hilarious that it takes a case as transparently self-serving as aaron swartz to calcify a population as privileged and inured to the justice system as programmers to go "woah hey this shit might be kind of fucked up!!!"

ternaryoperator11y ago

Aaron S. was not at all the first time programmers recognized the problem and acted on it. Perhaps it was the first time that you became aware of the issue. But there were large-scale campaigns as far back as Robert Morris's worm in 1988. Even then, programmers were rightly concerned with unfair punishment for hacking and were outspoken about their concern. Similarly with the Randal Schwartz case in 1995, and many times since.

forgotX211y ago

Could someone describe the dataset for me? Is it just two columns with one for usernames and another for passwords? Or is there any other info included? I'm on mobile right now or else I'd grab it myself.

m8urnOP11y ago

The first column is username, followed by a tab, followed by the password.
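Given that layout, a small reader can be sketched in Python as below. The encoding handling is an assumption: the thread does not settle what encoding the file uses, so this tries UTF-8 per row and falls back to Latin-1, which can decode any byte sequence. The function name is hypothetical.

```python
def read_combos(path):
    """Yield (username, password) pairs from a tab-separated dump."""
    with open(path, "rb") as handle:
        for raw in handle:
            line = raw.rstrip(b"\r\n")
            if b"\t" not in line:
                continue  # skip malformed rows rather than guess at their meaning
            # Split on the first tab only, so tabs inside a password are preserved.
            username, password = line.split(b"\t", 1)
            try:
                pair = username.decode("utf-8"), password.decode("utf-8")
            except UnicodeDecodeError:
                # Assumption: fall back to Latin-1 for rows that are not valid UTF-8.
                pair = username.decode("latin-1"), password.decode("latin-1")
            yield pair
```

Reading in binary and decoding row by row avoids having one bad row abort the whole import, which may be relevant if the file mixes encodings from many different dumps.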

elchief11y ago

Know what encoding it's in? Postgres is choking on UTF8 and Latin1

marksomnian11y ago

Are you sure it's not choking just because it's postgres?

rinon11y ago

yep, just 2 columns

20kleagues11y ago

Way to go buddy! This research is indeed necessary and releasing such a dataset will be beneficial. Maybe it will also bring light to how outdated password based authentication really is.

levlaz11y ago

If anyone wants to check their username, I have a searchable DB up now. https://levlaz.org/passwords

sarciszewski11y ago

> He was close to Anonymous and was in fact their spokesman.

Err, no he wasn't. He just managed to get a modest amount of attention.

igonvalue11y ago

Is there an http download link that would allow downloading from the browser (or with curl)?

rplnt11y ago

There are services that will download a torrent for you. This one worked for me without registration http://www.direct-torrents.com/

bikamonki11y ago

It seems very useful for research and also practical uses, like how about a REST API with this dump? get <password> will not only return true if it exists but how common and how weak it is, or will return a false for unique. Is there such a service out there?

farresito11y ago

With all due respect, I think this is a horrible idea. Isn't it just better to simply download the dump and filter the information with the command line? Why would someone even want to write a program that connects to an API to get info like this? You don't really need to know too much to be able to filter values like those, and it's way more flexible.

akerl_11y ago

This seems a bit like testing if your parachute was packed properly by deploying it. Once I've sent my password at a 3rd party API, it doesn't much matter what the API says: my password is no longer secure.

bikamonki11y ago

Correct, but every site where you sign up does that and I do not think anyone cares. Maybe such an API would not be for end users but for other apps to run signup forms against it and help users choose a better password. In any case, the whole password deal is broken. I now use my own offline pwd generator for the "important" sites but I guess I am not the average Internet user.

akerl_11y ago

What site out there is sending my plaintext passwords to a 3rd party service to validate their strength?

benbristow11y ago

Nice idea. Working on a simple Rails API now that will return a JSON response. Will take a while to import all the passwords though.

Currently got it returning this JSON: {"found":true,"password":"test","count":117}
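The same lookup can be done locally, which sidesteps the objection about sending a password to a third party. The Python sketch below returns a dictionary shaped like the JSON quoted above; the function names are invented, the three-entry list stands in for the real password column, and counting case-insensitively is an assumption about what the "count" field means.

```python
import json
from collections import Counter


def build_index(passwords):
    """Count case-folded occurrences of each password."""
    return Counter(p.lower() for p in passwords)


def lookup(index, password):
    """Return a response shaped like the JSON quoted above."""
    count = index.get(password.lower(), 0)
    return {"found": count > 0, "password": password, "count": count}


index = build_index(["test", "Test", "hunter2"])  # stand-in for the real password column
print(json.dumps(lookup(index, "test")))
```

Ten million short strings fit comfortably in memory on an ordinary machine, so a signup form's backend could hold such an index itself rather than calling out to an external service.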

pbreit11y ago

How is that even slightly useful?

lumpypua11y ago

Go make it! :)

bikamonki11y ago

Hahaha sure, if I did not already have enough side projects of the side projects of the side projects ;)

joshmn11y ago

I'm on it.

ttty11y ago

>Many companies, such as Facebook, also monitor public data dumps to identify user accounts in their user base that may have been compromised and proactively notify users.

That is smart!

vaibhavmule11y ago

Is your password and username in that list?

kbart11y ago

Which one? I'm sure most of "hackers" here use different credentials for different purposes. Haven't found any of mine there though.

m8urnOP11y ago

Yes, three of my accounts are on the list.

Kenji11y ago

I could be relieved that my favourite password isn't in there but it's already been leaked by stupid, stupid engineers working for Riot (League of Legends video game) who stored it in plaintext and a hacker got it. It is a good practice to regularly change passwords anyways: If you're worried that your password is in there, you're doing it wrong in the first place.

ssully11y ago

You're doing it wrong if you have a favorite password. Use a password manager; there are more than a handful out there that are multiplatform and easy to set up. If that isn't your thing then there are plenty of techniques for generating unique, easy to remember passwords.

jrochkind111y ago

> As a final note, be aware that if your password is not on this list that means nothing. This is a random sampling of thousands of dumps consisting of upwards to a billion passwords. Please see the links in the article for a more thorough check to see if your password has been leaked. Or you could just google it.

ommunist11y ago

I thank the post author for releasing this data. I found one of my accounts there and changed the password to a more secure one.

osw11y ago

awk '{ print $2 }' 10-million-combos.txt | grep 1234 | wc -l

only 180896 people have 1234 in their password, thought there would be more

jcm131711y ago

Hunter6 is used as a password 9 times...

pbhjpbhj11y ago

So this guy found a zero-day that works across different unzip binaries, or what ...!?

jacobsimon11y ago

_ everyone frantically searches for their own usernames _

julianpye11y ago

Everyone knows the whole email/password concept is broken. I believe that overall OAuth is needed, but it needs a much stronger consumer-facing view.

StavrosK11y ago

I'm not sure how OAuth can help. Does it allow you to choose whom to authenticate with, or does it tie you to one specific provider? I much prefer Persona, but Mozilla has abandoned it, and most resources around it are dead links. What a colossal shame.

sarciszewski11y ago

I'm personally looking forward to something like SQRL.

https://www.grc.com/sqrl/sqrl.htm

StavrosK11y ago

That's also a nice protocol, but I think it requires too many extra things (mobile phone, net connection, etc). Plus, what if your key gets stolen?

sarciszewski11y ago

A well-implemented OAuth implementation is wonderful. Sadly, many implementations are just crappy.

throwawaykf0511y ago

What's worse than crappy implementations is that every provider has their own version of implementation-specific crappiness that is inconsistent with everyone else's.

sandworm11y ago

It was a mistake to release this today.

Everyone knows that legally questionable moves should always be made on a Friday. That allows everyone in government to cool down for a couple days. By the time the weekend is over all the news outlets have moved on to whatever war just started up. You don't want some hothead prosecutor tweeting out a threat, forcing himself to follow through later in the week. Nobody picks a fight when 15 minutes away from a weekend.

Watch the NSA/CIA/MIB admissions. They always stage their spying/torturing mea culpas on Friday afternoons.

naradaellis11y ago

In the show West Wing they called it taking out the trash.

jbapple11y ago

Broad categories of rude speech are protected under the First Amendment, including things like, IIRC:

1. Saying if President Johnson makes you pick up a gun, he'll be the first in your rifle sight. (Watts v. United States)

2. Telling a cop "I'll kill you, you white devil" while you are in handcuffs and unable to kill him. (? v. ?)

3. Swearing "revengeance" upon the Jews. (Brandenburg v. Ohio)

jbapple11y ago

It was "White son of a bitch, I'll kill you", and it was Gooding v. Wilson.

eurleif11y ago

downandout11y ago

>The offense tied to Brown's "linking" was dismissed

tptacek11y ago

Most criminal statutes look insane if you ignore the mens rea component and consider only the actus reus.

It's just that those concerns are not yet vindicated by the Brown case.

downandout11y ago

[1] http://www.theguardian.com/law/2012/jan/13/piracy-student-lo...

pdabbadabba11y ago

downandout11y ago

> I seriously doubt the government would ever have brought charges if all it had was the posting of a link.

If it can be included as a charge on an indictment, it can be the one and only charge in it as well.

> We should also think a little bit harder, I think, about whether posting a link is never criminal.

No, we shouldn't. Linking to and/or writing about anything (absent actual participation in a conspiracy) isn't a crime in a country protected by the right to free speech.

Slartibreakfast11y ago

tptacek11y ago

His sentence was dominated by the accessory charge, and the threats don't seem to have been a factor at all.

dsrguru11y ago

The threats actually accounted for 48 of the 63 months according to the EFF article that the OP linked to.

https://www.eff.org/deeplinks/2015/01/eff-statement-barrett-...

egocodedinsol11y ago

but what do you think about the big picture?

I don't know much of the specifics about Brown, but I think the wider point is worth discussing, especially with respect to the proposed change in legislation.

sarciszewski11y ago

> Barrett Brown was not convicted merely for linking to data on the web.

From the article:

     Most of us expected that those charges would be dropped and some were, although they still influenced his sentence.

So while you're correct that Brown was not charged with linking to information, it's worth noting that this was still used against him anyway.

Also, people who think the linking to hacked data was the only thing that got him arrested are being disingenuous (or are simply ignorant).

tptacek11y ago

I'm not seeing where the linking was used to enhance his accessory conviction. Is there a source for that?

higherpurpose11y ago

It's interesting that you say his sentence was "unjust" given that you always seem to defend crazy sentences as "not being the real ones anyway".

Also those three sound like incredibly weak charges, and yet you somehow defend the prosecution over them.

tptacek11y ago

Is it because I say his sentence was unjust given that me always seem to defend crazy sentences as not being the real ones anyway that you came to me?

Earlier you said I say his sentence was unjust given that me always seem to defend crazy sentences as not being the real ones anyway?

Maybe your life has something to do with this.

LeoPanthera11y ago

Fun!

    $ export LC_ALL='C'
    $ awk '{ print $2 }' 10-million-combos.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | head -n 20
    55893 123456
    20785 password
    13582 12345678
    13230 qwerty
    11696 123456789
    10938 12345
    6432 1234
    5682 111111
    4796 1234567
    4191 dragon
    3845 123123
    3734 baseball
    3664 abc123
    3655 football
    3330 monkey
    3206 letmein
    3136 shadow
    3126 master
    3050 696969
    3002 michael

Edit: I used Wordle[1] to make a wordcloud of the top 1000 passwords: http://i.imgur.com/FImcPiG.png

[1]: http://www.wordle.net

dvdhsu11y ago

I'd run some more commands, to find out how many "michael"s use "michael" as their password, but I've got to head out now. Would be interesting -- anybody up for it?

(Ooh -- you could even juxtapose the usernames against common American names by decade [1], and probably derive some data about the ages of these users as well!)

(Furthermore -- what if we started keeping track of most common passwords by decade? That could be super interesting! I wonder if it's changed much!)

  $ export LC_ALL='C'
  $ awk '{ print $1 }' 10-million-combos.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr | head -n 20
  3044 info
  2119 admin
  1323 michael
  1113 robert
  1095 2000
  1049 john
  1041 david
  967 null
  940 richard
  922 thomas
  901 chris
  866 mike
  843 steve
  832 dave
  816 daniel
  812 andrew
  797 george
  765 james
  735 mark
  730 dragon

1. http://www.ssa.gov/oact/babynames/decades/names1980s.html

jcm131711y ago

For some reason I seem to be getting different values than you. However, from what I got, there was only a single instance of a username 'michael' having a password 'michael'.

HOWEVER, of all of the people whose password is 'michael', 83 have usernames that CONTAIN the string 'michael'.

Of the set of usernames 'michael' there are 20 whose passwords contain the string 'michael'

Of the set of usernames containing the string 'michael' there are 276 passwords that contain the string 'michael'

I honestly expected much more.
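The four counts above can be reproduced with one pass over the (username, password) pairs. This Python sketch is an illustration only: the function and key names are invented, matching is case-insensitive by assumption, and the categories are nested (every row counted as "both equal" is also counted in the three looser categories), which may or may not be how the figures in this thread were computed.

```python
def name_overlap_counts(pairs, name):
    """Count four kinds of username/password overlap for one name."""
    name = name.lower()
    counts = {
        "both_equal": 0,              # username == name and password == name
        "pw_equal_user_contains": 0,  # password == name, username contains name
        "user_equal_pw_contains": 0,  # username == name, password contains name
        "both_contain": 0,            # both contain name
    }
    for username, password in pairs:
        user, pw = username.lower(), password.lower()
        counts["both_equal"] += user == name and pw == name
        counts["pw_equal_user_contains"] += pw == name and name in user
        counts["user_equal_pw_contains"] += user == name and name in pw
        counts["both_contain"] += name in user and name in pw
    return counts


# Made-up rows, not taken from the dump.
rows = [("michael", "michael"), ("michael99", "michael"), ("michael", "michael1"),
        ("xmichael", "1michael!"), ("bob", "michael")]
print(name_overlap_counts(rows, "michael"))
```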

userbinator11y ago

In other words, supposing that this data is representative of most peoples' password practices, just trying these 20 passwords gives you a ~18% success rate for any username.

And... dragon. That's an unusual password to make the top-10 list. I think this might be a somewhat skewed sampling.

yoha11y ago

You forgot a zero:

   >>> (55893+20785+13582+13230+11696+10938+6432+5682+4796+4191+3845+3734+3664+3655+3330+3206+3136+3126+3050+3002) / 1e7
   0.0180973

That is, 1.8%. This is confirmed by http://maxmcd.com/passwords.html.

pavel_lishin11y ago

> supposing that this data is representative of most peoples' password practices

That might not be the case; not all passwords are created equal.

crisnoble11y ago

MarkMc11y ago

For sensitive sites, my preferred solution to this problem is to add a sequence of random characters to the User ID field. The user would then authenticate with something like this:

  User ID: John-CPE4E38J
  Password: snoopy

For extra security the code would then move the random characters to the password so the authentication library would see this:

  User ID: John
  Password: snoopy-CPE4E38J

In this way even an attacker who gains full access to the server database would be unable to read the passwords (assuming they have been hashed well).

Also, the User ID can be stored in a cookie so that the User ID field on screen is pre-populated and the user only has to type "John-CPE4E38J" when he switches to a new computer.

More details here: http://security.stackexchange.com/questions/80352/is-it-a-ba...
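The "move the random characters to the password" step described above amounts to a small string transformation before the normal authentication call. Here is a minimal Python sketch of that step as I read it; the function name, the "-" separator, and the error handling are assumptions for the example, not details taken from the linked post.

```python
def split_login(user_id_field, password_field, sep="-"):
    """Move the random suffix from the user ID field onto the password."""
    name, found, suffix = user_id_field.rpartition(sep)
    if not found or not name or not suffix:
        raise ValueError("user ID must look like name-SUFFIX")
    return name, f"{password_field}{sep}{suffix}"


print(split_login("John-CPE4E38J", "snoopy"))
```

The authentication library then hashes the longer combined string, so the suffix behaves like extra password material that the user did not choose.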

IgorPartola11y ago

handsomeransoms11y ago

Are you generating the User ID with the additional characters and expecting the user to remember/keep track of it? I don't think that is very user-friendly, even with the cookie trick you describe.

It seems like you are trying to force your user to remember a salt. Why not just use a proper salt and a strong password hashing function?

Also note that this protection is only useful in the case where an attacker can get a database dump but cannot perform an active attack on the server.

On the other hand, I have seen some sites (gandi.net comes to mind) do something similar to this. Wonder if they have a similar security reasoning?
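For comparison, the "proper salt and a strong password hashing function" alternative mentioned above might look roughly like this in Python, using only the standard library. It is a sketch under assumptions: PBKDF2-HMAC-SHA256 is one reasonable choice among several (bcrypt, scrypt, and Argon2 are common alternatives), and the iteration count and function names are illustrative rather than a recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: tune to your hardware and current guidance


def hash_password(password, salt=None):
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a per-user random salt."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("snoopy")
print(verify_password("snoopy", salt, stored), verify_password("snoopy1", salt, stored))
```

The salt is stored next to the digest on the server, so the user has nothing extra to remember. The trade-off against the user-held suffix is that a server-side salt does not add secret material: it only makes each hash unique and precomputed tables useless.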

thaumaturgy11y ago

Your approach is different, but I don't understand it yet. :-)

chias11y ago

pgwhalen11y ago

It makes equally little sense to me, but "dragon" is routinely high on top password lists.

burkaman11y ago

I think it's probably just a common thought process. I'll pick an animal -> dragons are the coolest animal -> nobody will ever guess dragon, this is way better than using my dog's name.

jessaustin11y ago

CamperBob211y ago

So is "jesus", and that doesn't seem to be true here. I find this list highly dubious, compared to others I've seen (and, long ago, obtained myself.)

m8urnOP11y ago

And it has been for 20 years

WillNotDownvote11y ago

Computers are magic. Dragons are magic. QED.

I'm actually kinda serious.

Also, humans are monkeys. Ergo, "monkey" is popular.

maxmcd11y ago

https://github.com/maxmcd/pwd-guess

libria11y ago

I'm surprised (disappointed?) only 1 person used "correcthorsebatterystaple".

pthreads11y ago

That is terrible, he/she used the same phrase as in the example!

300bps11y ago

vacri11y ago

Looks like if you know someone called Michael, chances are that you need to talk to him and his loved ones about password hygiene...

WalterBright11y ago

My name isn't Michael, but I use the password 'michael' all the time.

Edit: oh, crud

num11y ago

  10938 12345