Password Rules Are Bullshit

Oh… yeah… ok. I’ve actually done that myself. I had a client whose
DNS server crashed and I needed to change their DNS settings, but the lady
in charge of the domain settings wasn’t available, so we couldn’t log into
the domain provider to change the DNS. I ended up going into the company’s
old intranet system, pulling her account’s MD5 “encrypted” password and
putting it into a variety of MD5 decoders. And, surprise surprise, she
used the same password. We got right in and changed it.

Since then, this client actually outsourced their system to some group of
independent foreign “programmers” (who seem to all be teenagers). (I told
the client of the danger and then cancelled our contract to avoid any legal
repercussions.)


And yet your “examples” say…

dictionary file containing about 26 million words, combined with programming rules that greatly extend its effectiveness by adding numbers, punctuation, and other characters

and

It combines each word in a dictionary with every other word in the dictionary

That’s a total of two words combined and some mutations, absolutely not

polling all books in existence and many websites and forums for phrases including combinations of nonsensical words

Combining two words and mutating seems reasonable, let’s say most people have 20k words they regularly use, 20k times 20k is 400 million, but a damn far cry from “every phrase in every book ever written”, like you said.

Now this Ars article… is very interesting. It is what you described. In fact, here’s a long word list that guy cracked and it is fascinating reading. There is some potential there, but realize this guy is custom generating 1TB worth of phrases every time based on the target:

Plucking long word groupings out of books and articles and turning them into working cracking dictionaries is no trivial undertaking. For one thing, it requires huge amounts of disk space. Dustin works around the challenge mostly by filling up his 1TB hard drive with a list, using it to generate guesses against his uncracked hashes, wiping the drive clean, and starting all over with a new list of phrases.

One of the highlighted Ars commenters at the bottom of that article answered your question, too:

So is “correct horse battery staple” still the right type of password to use? It doesn’t seem like just stringing together a few dictionary words is sufficient any more. Surely putting together random, but common, dictionary words is in the cracker’s arsenal as well.

There are ~750,000 words in the English language. Even without substitutions, capitalizations, or weird spacing, that represents about 10^23 combinations if you picked 4 at random. You could test a billion combinations a second and finish sometime in the next 4 million years. But you said common words…

Average adult vocabulary is 20,000-35,000 words. Let’s assume that people who voluntarily test their vocabulary are probably on the high end of the bell curve in terms of word usage, and cut that low number in half. That leaves us with 10,000 words, and 10^16 ways to combine them (again if we picked just 4 at random to make our random passphrase). Generating a million hashes per second (pretty damn fast), it would take our cracker about 120 days to go through the combinations, and consume 284PB if he decides to store it as a lookup table. And that’s just from choosing 4 random commonly used words. If you went to 5, or decided to capitalize the first and last letters, or the first letter of every word, or put a random space in there, or included a “word” made up from the first letters of all the other words (i.e., “correct horse battery staple chbs”)… well, the numbers get astronomical very quickly.

The commenter was lowballing a hell of a lot on that hashes per second figure, though. Per the GRC haystack page:

  • Offline fast attack: 100 billion guesses per second
  • Nation state: 100 trillion guesses per second

One million guesses per second is pretty… quaint by today’s standards.

1,000,000 one million
1,000,000,000 one billion

So what he calls 120 days at that one million hashes/sec rate, let’s reduce by 1000 to 2.88 hours; that seems pretty realistic on today’s hardware.
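The back-of-the-envelope numbers in this exchange are easy to check. Here is a minimal sketch, assuming each word is picked uniformly at random and the attacker has to exhaust the whole space in the worst case. The function names are made up for illustration; the guess rates are the ones quoted above (one million, one billion, and the 100 billion per second from the GRC page).

```python
SECONDS_PER_DAY = 86_400


def keyspace(vocabulary_size: int, words_picked: int) -> int:
    """Number of phrases when each word is picked independently at random."""
    return vocabulary_size ** words_picked


def days_to_exhaust(space: int, guesses_per_second: float) -> float:
    """Worst-case time to try every candidate, in days."""
    return space / guesses_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    space = keyspace(10_000, 4)  # 10^16, the commenter's "common words" case
    for label, rate in [
        ("one million/sec", 1e6),
        ("one billion/sec", 1e9),
        ("100 billion/sec (GRC offline fast attack)", 1e11),
    ]:
        print(f"{label}: {days_to_exhaust(space, rate):,.1f} days")
```

If this arithmetic is right, 10^16 combinations take roughly 116 days at a billion guesses per second, so the commenter’s 120-day figure seems to fit that rate rather than a million per second. At 100 billion per second the same space falls in a little over a day.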

Also, consider the sheer number of hashes (passwords) you have. It seems reasonable that you’d beat about 50% of them with short passwords alone (assuming an 8 char average password, which is nothing these days), common wordlists, common mutations, and a little brute force.

But those are small lists. You would go from:

  • each password has a few billion possible words + mutations

to

  • each password has hundreds of trillions of words, phrases, and mutations

So even if you had that 1TB hard drive full of custom phrases derived from… books? magazines? movies? TV? what’s the target again? you have expanded the amount of work by many, many orders of magnitude.

You are thinking like a person that wrote an algorithm instead of a code breaker. I would suggest reading https://www.amazon.com/Codebreakers-Comprehensive-History-Communication-Internet/dp/0684831309. It can be dry and it’s long and very detailed but at the end one of the many lessons it teaches is that history is replete with algorithms that failed because of exactly the hubris you displayed in your response. “They’ll never crack my fancy algorithm because the sheer number of combinations…”

The missing variable in your equation is human habit. You missed the part about machine learning, leveraging past password cracks and general language structure. Humans have tendencies that vastly reduce that address space. For example, our tendency would be to use a clever phrase or something from a movie or book or website (e.g. ‘correct horse battery staple’) or to use a sentence that has structure. E.g., in ‘correct horse battery staple’, I noticed it only contains words that are 5-7 letters of which 3/4 are nouns and the first is a verb. It wouldn’t surprise me if other people using this approach showed a similar habit.

20k times 20k is 400 million, but a damn far cry from “every phrase in every book ever written”, like you said.

Again, I’ll point out that those posts are from four to five years ago. The computing power and storage capabilities are much greater. In addition, in the second link, only one year after the quoted 26MM word file, you will notice they used one that contained one billion words.

IMHO, you are swimming upstream with human-generated passwords. There is simply too much computing power, too much motivation, and the machine learning tools are too good to think they cannot rip through your all-lowercase, all-English-word passwords in weeks if not days or hours.

However, how do we test this hypothesis? Much like new encryption algorithms, the only way to know for sure is to let a group of professional crackers that compete at black hat conferences have a go at your password algorithm. Only then will you know how secure your approach is.

It’s an interesting discussion about password length, complexity, entropy, etc., and there are real enthusiasts out there who crack passwords as a hobby, but I think an important point is being missed. At the end of the day, discussions around password complexity exist because of security. If I’m an attacker, I want your password for one of two main reasons: 1) I want to authenticate as you to some system, or 2) I want to test for password reuse so that I can authenticate as you to some other system. I don’t particularly care how I do it. With Windows maybe I pass the hash. With web apps maybe I steal your session token, or backdoor the login page, or crack your password. As an attacker, as long as I can access the target data under your security context, I have won. With this in mind, I would argue that it’s more realistic to strive to make password cracking so difficult that the attacker has to resort to other, “risky” attacks. By risky I mean more likely to be caught. In forcing the attacker to take risks, the hope is that you catch them, understand the scope of the attacker’s access, and remove that access. Afterwards you would reset passwords. So again the hope, I would argue, is that passwords are resilient enough not to be cracked within the amount of time that entire process takes.

Secondly, the whole discussion around computing power is relevant for unsalted MD5, but slower algorithms completely change the picture. Consider Jeff’s new post. 1,600 hashes per second on the latest and greatest hardware is pretty darn slow.
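To make “slow and salted” concrete, here is a minimal sketch using PBKDF2 from Python’s standard library. The post referenced above isn’t quoted here, so the algorithm choice, the iteration count, and the function names are all assumptions for illustration, not what that post used.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration only; tune it to your own hardware.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using PBKDF2-HMAC-SHA256 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every stored password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The salt means identical passwords no longer produce identical hashes, and the iteration count is what drags the attacker’s guesses per second down from billions to something far smaller.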

On top of using a slow & salted hash, there is still your point about humans who choose bad passwords. Using a blacklist, even a small one, can be effective, I think. All you really need is to discourage the top X% of weak passwords. Those would be the “low hanging fruit” passwords that an attacker would crack first. If an attacker spends 5 days cracking, with little to show for it, I am more than willing to bet they will switch to a different attack.
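A blacklist check along those lines can be very small. This is only a sketch: the five entries are a stand-in, and a real list would be loaded from a file of commonly used or previously leaked passwords. The names are hypothetical.

```python
from typing import FrozenSet

# Hypothetical, tiny blacklist for illustration; a real one would be loaded
# from a list of commonly used or previously leaked passwords.
COMMON_PASSWORDS: FrozenSet[str] = frozenset(
    {"password", "123456", "qwerty", "letmein", "iloveyou"}
)


def is_blacklisted(candidate: str, blacklist: FrozenSet[str] = COMMON_PASSWORDS) -> bool:
    """True if the candidate matches a blacklisted entry, ignoring case and outer whitespace."""
    return candidate.strip().lower() in blacklist
```

For example, `is_blacklisted("Password")` is true while `is_blacklisted("correct horse battery staple")` is false.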


Well, I just want to share my strategy, cause I didn’t see it mentioned.
I have 3 levels of passwords.

  1. unsecure - can be shared with friends without harm, like the password to a coupon site… No one will buy coupons in my name, even if they hack it. Yet my friends can benefit from a good deal without me exposing my stronger passwords. This is something quick to type and tell to others.
  2. medium - something that requires some caution, like online gambling sites, utilities, etc. I generally don’t care if these got hacked, as credit cards are not stored here, but could hurt me a little if abused.
  3. hard - email, dropbox, banking, paypal. all with 2 factor auth

This way I only have to remember 1 hard password, and even if that’s stolen, I’ve got good old 2 factor auth there. It’s not optimal, but it’s humanly manageable. So please, site hosters, don’t tell me how strong my password should be. I don’t care if my Discourse account is hacked. So what? Someone will shitpost in my name. Boom. I want quick access, and NOT to be forced to use my hard password, ’cause I will never remember 23423423 different passwords for the 2342345 different sites I use. Let me choose my security level.

I had a bit of a sad when I realized that we were perfectly fine with users selecting a 10 character password that was literally “aaaaaaaaaa”. In my opinion, the simplest way to do this is to ensure that there are at least (x) unique characters out of (y) total characters.
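As a rough sketch of that rule (the function name and the threshold are hypothetical, not what any particular site uses), the check is a one-liner:

```python
def has_enough_unique_chars(password: str, min_unique: int) -> bool:
    """True if the password contains at least `min_unique` distinct characters."""
    return len(set(password)) >= min_unique


if __name__ == "__main__":
    for candidate in ("aaaaaaaaaa", "correct horse battery staple"):
        distinct = len(set(candidate))
        verdict = "ok" if has_enough_unique_chars(candidate, 5) else "rejected"
        print(f"{candidate!r}: {distinct} distinct characters -> {verdict}")
```

With a threshold of 5, “aaaaaaaaaa” (one distinct character) is rejected, while a passphrase with 13 distinct characters passes.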

It’s worth it to consider how the expected entropy will change by adding this rule. I would imagine it would remove a lot of search space right off the bat, esp. if your magic number were known. Consider: ‘aardvarkafrikaansbazaar’.

Perhaps all but the most secure sites should offer something like, “Hey,
your password kinda sucks. Just click this ‘it’s not our fault if you get
hacked’ agreement, and you can use it.”


You may have overlooked a rule that goes hand in hand with password length. Namely would it really make a big difference if you upped the number of tries before the system locks you out to 15 or so?

The only way I see to solve this problem is for people to decide to live right, and then we won’t need passwords.

Like Einstein said: a problem cannot be solved from the same level of consciousness that created it.
I now understand what he was talking about.

We fight with our reflection in a mirror and try to protect ourselves from our own shadows.

I got tipped off about your article, having written something similar myself, where instead of encouraging my users to use characters, I encourage them to use sentences. The vocabulary of the English language alone is roughly 150,000 words, before we start adding slang words and other languages (I am Norwegian myself), and most people can create some simple phrases in a whole range of different languages, such as Spanish (Hasta la vista), Arabic (Allahu akbar), Yoda speak (The force in you, truly is strong my son), etc. So I concluded that increasing the length even further, to 25 characters, while encouraging my users to use complete sentences, would probably result in even larger entropy.

If you’d like to read my ramblings, feel free to check it out here

Now of course, the idea is that even a sentence with an astonishingly high amount of entropy is still dead simple for the user to remember, without having to write it down. At the same time, the probability that he’ll need to reuse passwords is significantly reduced - at least passwords he has used previously, since these would historically, for the most part, have been 8-10 character passwords.

In addition, creating a unique password for each service would be easy, since the user could use his own personal associations. For your site, for example, I could have chosen: “Holy mother of blip, I am so deadly scared now, that my hair stands straight up into the air”

All in all, significantly increasing the entropy, literally exploding it in size, while still making sure that the human brain is easily capable of remembering the actual password. This would also encourage users to make sure they use the “Remember me” checkbox when logging in, resulting in sending the password over the wire fewer times, arguably further increasing its security …

Do you think it’s possible that you may be addressing the wrong audience?

I mean most of the reason that dumb password rules exist in the wild is because the software behind the password box allowed for the dumb rules to be defined… If the developers who wrote the rule definition software didn’t allow for dumb rules to be created in the first place, then the rule enforcers would be forced to come up with better rules.

For Example:

  • not allow the admin to drop the maximum password length below 20.
  • if the entered password matches some known good standard form, then ignore the BS rules set by the admin
    ** 128 to 256-Bit Base64: (?i)^[a-z0-9+/]{22,43}={0,3}$
    ** 128 to 256-Bit Ascii85: (?i)^[a-z0-9!#$%&()*+\-;<=>?@^_`{|}~]{20,40}$
  • not allow the admin to disable copy/paste into the password box
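The “known good standard form” idea from the list above can be sketched in Python. The patterns are adapted from that list, with the hyphen in the 85-character class escaped so it is matched literally rather than forming a range. The function name is hypothetical, and using `base64.b85encode` to produce sample values is an assumption about which 85-character encoding is meant.

```python
import base64
import os
import re

# Patterns adapted from the list above. In the 85-character class the hyphen
# is escaped so it is matched literally instead of forming a range.
BASE64_TOKEN = re.compile(r"(?i)^[a-z0-9+/]{22,43}={0,3}$")
BASE85_TOKEN = re.compile(r"(?i)^[a-z0-9!#$%&()*+\-;<=>?@^_`{|}~]{20,40}$")


def looks_machine_generated(password: str) -> bool:
    """True if the password has the shape of a 128- to 256-bit encoded random value."""
    return bool(BASE64_TOKEN.match(password) or BASE85_TOKEN.match(password))


if __name__ == "__main__":
    samples = [
        base64.b64encode(os.urandom(16)).decode("ascii"),  # 128-bit, Base64
        base64.b85encode(os.urandom(32)).decode("ascii"),  # 256-bit, base85
        "hunter2",
    ]
    for sample in samples:
        print(sample, "->", looks_machine_generated(sample))
```

One caveat: shape is not proof of randomness. Any run of 22 to 43 letters and digits also passes the Base64 pattern, so a check like this only says “skip the admin’s composition rules”, not “this password is strong”.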

Built-in password shaming may be a nice feature as well:
short password -> “My cell phone could randomly guess your password in x seconds”
password in a dictionary -> “I just used a thesaurus to guess your password in 0.00x seconds”

Whoa, for the record, ‘correct horse battery staple’ is not a human-generated phrase. It’s randomly generated by rolling dice and looking up words in a list big enough to be secure, then taking what you rolled and writing a story to help you remember the random sample. (Unfortunately, with Diceware’s word list you need 20 words to get 258.496… bits of randomness.)
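The 258.496 figure follows from the size of the Diceware list: five dice rolls select one of 6^5 = 7,776 words, so each word contributes log2(7776) bits. A small sketch (function names are made up for illustration):

```python
import math

DICEWARE_LIST_SIZE = 6 ** 5  # five dice rolls per word -> 7,776 words


def passphrase_bits(word_count: int, list_size: int = DICEWARE_LIST_SIZE) -> float:
    """Entropy in bits of a passphrase of independently, uniformly chosen words."""
    return word_count * math.log2(list_size)


def words_needed(target_bits: float, list_size: int = DICEWARE_LIST_SIZE) -> int:
    """Smallest number of words whose combined entropy reaches the target."""
    return math.ceil(target_bits / math.log2(list_size))


if __name__ == "__main__":
    print(f"4 words:  {passphrase_bits(4):.3f} bits")
    print(f"20 words: {passphrase_bits(20):.3f} bits")
    print(f"words for 128 bits: {words_needed(128)}")
    print(f"words for 256 bits: {words_needed(256)}")
```

That is roughly 12.9 bits per word, which is why 256 bits takes 20 words while a 128-bit target needs 10.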

Not only are the rules bullshit. It’s also bullshit if you cannot keep the password in your brain. When you cannot remember it, you will have to write it down or save it somewhere. That is the catch, isn’t it!

For example, BitLocker: I can never remember the 48-digit number, so I have to write it down and bring it with me. So if I lose my laptop, I have probably also lost the written key I carry with me at the same time. You don’t have to prove whether BitLocker has a back door or not. The back door is you, the user!!


Keyboard patterns in passwords

Ok, some really good points here, but I’m not James Bond. I’m old, tired, and boring. Why do I need a password at all to open up my phone voicemail and other similar low-key apps? I don’t have a secret life to hide, I don’t care if my wife checks anything for me. Is all this really necessary? I get the point for banking and the like, but where does it end? Will I need a password to get the toaster to work or a chip in my finger for the toilet to flush? My friends claim Big Brother is watching and listening to me through the TV too. Poor bastard, hope he’s got some strong coffee.

You are so right!
I also hope that you did not use any “bullshit” algorithms either, like AES or SHA, those get attacked all the time!

Way better to make your own algo! I like “double XOR” for super protection!

Ever thought that your bank may leave a voicemail?

Mostly you need to protect your email (because it is the de facto skeleton key for “forgot password” everywhere) and bank-related online accounts.

That is, unless you just hate having money, and would like to assist others in removing said money from your possession so you don’t have to be bothered by alllll that pesky cash :money_mouth_face:


What would be the backup plan for someone whose phone ran out of juice before the “ping” arrived?


Why are we trying to make our passwords so “secure” when our data can still be read by the company that stores it, and can be accessed without any password of ours?
Currently, I see passwords as virtually useless given the way they are used and the way systems are designed. They only help prevent a “point-to-point” attack. But most large-scale attacks come from within the company that stores your data.

Before making your password beautiful, we need a better-designed system first. For example:

Google, Hotmail, and Yahoo Mail should encrypt your email with your password without storing your password in any form in their systems. Then no one can read your email except a person who knows your password. When a new email comes into the system, the company may not be able to encrypt it under your password right away. But as soon as you log in with your password, it gets encrypted under your password. So if anyone at Google wants to obtain your email content, they have to contact you for permission. As soon as you give permission, the content is decrypted, and a copy is encrypted with the company’s own password.

This way, one can implement another layer of security, like a blockchain, to trace each transaction, so that if you want to claim your data has been stolen, you can trace it back to the original person or company who took your data and shared it with others not covered by your original agreement for giving out your data.

You can trace the chain of decryption and encryption, which shows where data has been transferred or shared with others. Then it is truly the DNA of sharing with caring. And a person who shares your data now has to take much more responsibility, under the protection of the law and blockchain traceability.

Another example: medical data should only be accessible by you. When your doctor wants to use your data for other research, he asks you for the transaction; then you supply your password to decrypt, and the doctor supplies his to encrypt. He now has the responsibility to share it only with the right persons or organizations, and they can be traced. Now you can sue the doctor if he abuses your data.
