The Dirty Truth About Web Passwords

While I agree that passwords are not the best idea (I’d rather use OpenID authentication, but only out of laziness), I wouldn’t agree that OpenID is the best (or even a better) solution.

Let’s think about the current situation: we can safely assume that the culprits didn’t download some magic /etc/passwd file but got hold of a smaller or bigger portion of Gawker’s shared database. Right now they have access to (as Gawker says) 1.2 million records consisting of username, e-mail, and password (probably somehow encrypted/hashed).

So you now have this kind of DB. What now? First you need to link this data to a real identity. Most data-sensitive companies I know (and whose services I use) have numerical logins, which are never sent to the user over the internet but usually by traditional mail. You can’t link those, unless you try “password” for every client number you can imagine. Good luck with that.

The same goes for pretty much everything else, as long as someone set one password for e-mail and a different password for the site he is/was using. That way the identity can’t be linked and the hack goes nowhere (maybe someone is going to post bulls**t on a site you forgot you even registered on sometime in the future, but who cares).

On the other hand, if someone gets access to the whole tree (he doesn’t even need a username/password, just linked data), then you have a real problem. I showed a friend of mine how his bank compromised him by asking every identity confirmation question over the phone so he could log in to the phone service. I called the bank back, introduced myself as him, gave the responses I remembered hearing, and changed the password. And I am just an average developer, nowhere close to an expert identity thief.

The point is that with 1.2 million such records, you don’t care about the dead ends where the e-mail password leads nowhere. You care only about those you can follow, and researching identities is a painful and expensive process. The chances that you’ll be the target of such identity mining are similar to the chances of someone picking on you specifically (which is unlikely unless you are a really hard-core troll offending everyone, or some kind of celebrity).

Gawker made a smart move by exposing its problem. But as written in the article, the chances that your account will be compromised rise with every site you’ve registered on. So do the chances that you’ll never know it happened…

I do agree that a global internet identity system would be really convenient, and I am almost sure that in the future (maybe ten years) it will be like that for everybody. If the biggest companies (say Microsoft, Google, Facebook, …) come to an agreement and implement a common system, it could happen quite fast.

But let’s talk about reality. When I think about a basic user, I always think about my father, a 70-year-old man with little knowledge of technology. Would my dad be able to use my website if I used something like OpenID, for instance? Answer: NO, it’s way too complicated.

So, for blogs like this one, it’s OK to use that kind of standard identification system, but for websites like mine, I had no option but to implement my own identification system.

As some commenters have already pointed out, this post contains incorrect information. The UNIX crypt routine computes exactly what you are advocating - salted hashes - even though it relies on the symmetric DES algorithm. Of course, that does not mean that using it is actually a good idea, because of its low computational cost and its short 2-char salt. bcrypt is generally recommended as a viable approach these days… At the end of the day, though, with passwords like ‘123456’ and ‘password’ chosen by so many users [1], no amount of smart hashing is going to help you if the database is compromised.
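To make the “salted and deliberately slow” idea concrete, here is a minimal sketch. It is not bcrypt itself: it swaps in PBKDF2-HMAC-SHA256 from Python’s standard library as a stand-in, and the iteration count and function names are assumptions chosen for illustration, not a recommendation from the post.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it for your own hardware.
ITERATIONS = 100_000


def hash_password(password: str):
    """Return (salt, digest) using a fresh random 16-byte salt per password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Because each password gets its own long random salt, two users who both pick ‘123456’ end up with different stored digests, and the iteration count makes every guess cost more than a single fast hash would.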

As there is enough confusion about password security in the blogosphere already, and your blog enjoys fairly high visibility, it would be nice to see that point fixed in the post.

[1] http://www.duosecurity.com/blog/entry/brief_analysis_of_the_gawker_password_dump

Kind of nitpicking here, but I recently discovered (http://security.stackexchange.com/q/379/33#440) that Rainbow Tables are NOT just large lists of hashes of all possible passwords.

It would actually be more correct to call them “Hash Chains”, rather than the “Hash Tables” we are all familiar with and that most of us associate with Rainbow Tables.
Yes, it seems even some of the tools call it by its wrong name…
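For anyone who wants to see the “chain” idea in code, here is a toy sketch. The four-letter lowercase password space, the use of SHA-256, and the particular reduction function are all assumptions picked to keep the example small; real rainbow table tools use their own parameters.

```python
import hashlib
from typing import Dict, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PASSWORD_LENGTH = 4  # toy password space, assumed for illustration


def hash_fn(password: str) -> bytes:
    return hashlib.sha256(password.encode("ascii")).digest()


def reduce_fn(digest: bytes, position: int) -> str:
    """Map a digest back into the password space.

    The mapping depends on the column position; using a different
    reduction per column is what makes the table a "rainbow" table.
    """
    value = int.from_bytes(digest, "big") + position
    chars = []
    for _ in range(PASSWORD_LENGTH):
        value, index = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)


def build_chain(start: str, length: int) -> str:
    """Walk hash -> reduce `length` times; only (start, end) gets stored."""
    password = start
    for position in range(length):
        password = reduce_fn(hash_fn(password), position)
    return password


def lookup(target: bytes, table: Dict[str, str], length: int) -> Optional[str]:
    """Find a password hashing to `target`; `table` maps chain end -> start."""
    for position in range(length - 1, -1, -1):
        # Assume the target hash sits in column `position`, then finish the chain.
        candidate = reduce_fn(target, position)
        for later in range(position + 1, length):
            candidate = reduce_fn(hash_fn(candidate), later)
        start = table.get(candidate)
        if start is None:
            continue
        # Replay the stored chain from its start to recover the password.
        password = start
        for step in range(length):
            if hash_fn(password) == target:
                return password
            password = reduce_fn(hash_fn(password), step)
    return None
```

The table stores only one start and one end per chain, so a chain of length 50 covers 50 passwords with two stored strings. That is the time/memory trade-off, and it is why a full list of every hash is not needed.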

User @Crunge on SE really opened my eyes to this one, even though I knew what Rainbow tables were - or thought I did, as it turns out I didn’t. And I am also one of those considered by others to be an “expert” (in security, though not specifically cryptography).

Seriously, it’s an awesome post, one of the best answers I’ve seen on SE - go check it out: http://security.stackexchange.com/q/379/33#440

Biometrics…'nuff said.

This is yet another example of why passwords should never be stored in a site’s database (in clear text or encrypted). The only correct way to store passwords is as a salted hash.

I knew this two years ago when I created www.my-msi.net. Even if a hacker were able to get a list of emails and ‘passwords’, the ‘passwords’ would be useless: since more than one password can hash to the same value, there is no way (on earth or heaven) to go from a hash to clear text!

Nor do we store credit card info in the database. We let Amazon handle all credit card transactions, and only the cookie that Amazon returns is stored in the database.

I dare anyone to try SQL Injection on the site. It is written in ASP.Net and those controls neuter SQL!
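As a general illustration (not a description of how the site above is built), what usually defeats SQL injection is passing user input as a bound parameter instead of splicing it into the query string. Here is a sketch using Python’s built-in sqlite3 module; the `users` table and the function names are hypothetical.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("alice@example.com",), ("bob@example.com",)],
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, email: str):
    # DO NOT do this: the input becomes part of the SQL text itself.
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, email: str):
    # The ? placeholder sends the input as data, never as SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()
```

With an input such as `nobody' OR '1'='1`, the unsafe version returns every row, while the parameterized version just looks for a user with that odd literal e-mail address and finds none.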

The only problem I have with the concept of an “Internet Drivers’ License”, as you put it, is only half-covered by the centralization of risk. I understand that it means fewer attack points for the same password, which is better than giving a hacker twenty different places to find the thing. However, a successful attack on the site you log in from does more than grant access to the other sites you visit. Like a fake Drivers’ License in the real world, it can be used for access in your name to sites you never visit as well as the ones you do, allowing criminals everything from easy defamation to - if your payment credentials are accessible from the login site - use of your money to buy just about anything from anywhere that accepts the login information.

It’s sad, but for true security, you need a unique password for every site you visit that requires one. And, yes, that requires a level of memorization beyond the capabilities of most people. My best recommendation is a highly self-critical evaluation of risk: determine what you are willing to live with and what you should live with (someone had five tiers of password security above, for example), and implement that as best as possible. In areas where low security is fine, a common login site may be a good idea, so long as that login does not link to any payment information whatsoever. For anything involving money, use a password unique to that site and, if plausible, refuse to store payment data with the site. If you must store payment data, make sure you trust that site with your life (because you are), and use the strongest password that site will allow. If that site doesn’t allow strong passwords, don’t trust it with your life.

“123456”, seriously? An actual editor uses that effing password? OK, I use some weakish passwords for sites I don’t care too much about, but at least they have letters and numbers. For sites I really care about, I use special chars, capital/lowercase, numbers, and letters.

Seeing all these leaked editor accounts just saddens me.

At first I was kind of amused by the stupid people using 123456, but the more I thought about it, the more this really is the way to go for utterly worthless sites like lh, gawker, and other sites where someone impersonating me can really have no important effect. If EVERYONE on gawker had treated the sites and comments as more or less worthless, no one would care about this breach, because we would all always use 123456 or something equally ‘stupid’ that we would never use for something actually important, like a site where I purchase something.

I’m changing all my passwords on these sites to that !@#$%^ which is just shift+‘drag finger across 1-6’ and also looks a little like I’m giving them the finger.

So this puts all these sites on notice: our passwords are this way because you’ve shown that your ability to keep my passwords safe and your ability to lie about how safe they are (the head of gawker posted that the encryption they used was 100% unbreakable) are at odds.

Pretty soon there will be a new exploit, because gawker will require your password to be unique from others. See where that’s going?..

Way too many comments to read them all. Here’s my 2¢: I use a strong unique password for a few critical websites that I use (like my email, paypal, bank, etc.) and a common password for all those dozens upon dozens of websites where I simply don’t care if they get hacked or not.

This, I believe, is way more secure than a centralized OpenID or whatever other identity provider. For one thing, anything can be hacked; for another, I really don’t want to entrust my bank account access to any 3rd party except my wife.

Oh, and I forgot to mention - a quick one-step registration for a website beats OpenID-based registration any day. I groan in pain every time I need to use my OpenID to register somewhere, and seriously reconsider whether I want to go through all that bother. Note: I picked Verisign as my OpenID provider. Pretty much at random, though theirs is the name I trust most in security. And their process isn’t complicated or anything - it’s just way lengthier than a typical one-step registration.

Great one.

One common pain is that different sites have different “password strength” definitions. One site wants a 6-digit PIN, another asks for an 8-10 character password, and yet another wants a password with uppercase, numeric, and special characters. This makes it very hard for the user to manage all the credentials.

  • How does one know if the site is safe (storing a salted hash) and not a gawker?
  • For existing sites on the web, how does one classify the passwords to have different levels of criticality?

"This is yet another example of why passwords should never be stored in a sites database (in clear text or encrypted). The only correct way to store passwords is a salted hash.

I knew this two years ago when I created www.my-msi.net. Even if a hacker were able to get a list of emails and ‘passwords’ the ‘passwords’ would be useless since more than one password can hash to the same value, there is no way (on earth or heaven) to go from a hash to clear text!"

It wasn’t in cleartext.

Unfortunately, they were using DES with a two-character salt. The same technique that UNIX was using 30-40 years ago. The same technique that no Linux/BSD system worth its salt has used for at least 10 years because of how quickly it can be broken.

Heck, the Debian Linux machine I was using 10 years ago was using an MD5-based scheme with an 8-character salt, and even that’s no longer used by a modern Linux/BSD system because it’s too weak.
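Some back-of-the-envelope arithmetic shows why the old scheme ages so badly. The sketch below relies on two facts about traditional DES crypt that are not stated above: salt characters come from a 64-symbol alphabet, and only the first 8 characters of the password are used. The function names and any guess rate you plug in are assumptions for illustration.

```python
# Traditional crypt salts are drawn from [a-zA-Z0-9./], i.e. 64 symbols.
CRYPT_SALT_ALPHABET = 64


def salt_count(salt_length: int) -> int:
    """Number of distinct salts of the given length."""
    return CRYPT_SALT_ALPHABET ** salt_length


def candidates(alphabet_size: int, max_length: int) -> int:
    """Number of passwords of length 1..max_length over an alphabet."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


def hours_to_exhaust(total: int, guesses_per_second: float) -> float:
    """Time to try every candidate at an assumed guess rate."""
    return total / guesses_per_second / 3600


two_char_salts = salt_count(2)    # 4096 possible salts for DES crypt
eight_char_salts = salt_count(8)  # salt space of the MD5-based scheme
```

With only 4096 salts, an attacker can afford to precompute guesses for every salt value, and with passwords cut off at 8 characters, `candidates(95, 8)` is the entire printable-ASCII search space no matter how long the user’s real password was.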

This article seems like good knowledge, in-depth analysis, and bad advice. There’s no easy solution, but this implementation would be a disaster.

Gawker and J. Random aren’t necessarily the real problems (especially with options such as a cascade of passwords, LastPass, KeePass, Password Safe, SuperGenPass, a home-made transform using a hash algorithm, etc.). Examples of real problems: using the same password everywhere, passwords that can be reset via email or challenge questions, and sites that don’t allow strong passwords. The idea in this article would be a real big problem.
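For the “home-made transform” option, a rough sketch of the idea is to derive a different password for every site from one master secret and the site’s domain. This is an illustration only: it is not the algorithm SuperGenPass actually uses, and the function name, the HMAC-SHA256 choice, and the 16-character length are all assumptions.

```python
import base64
import hashlib
import hmac


def site_password(master: str, domain: str, length: int = 16) -> str:
    """Derive a per-site password from a master secret and a domain name."""
    digest = hmac.new(
        master.encode("utf-8"),
        domain.lower().encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # URL-safe base64 keeps the result typeable; truncate to the wanted length.
    return base64.urlsafe_b64encode(digest).decode("ascii")[:length]
```

A leak at one site then exposes only that site’s derived password. The obvious weakness is that everything still hangs on the master secret, so it has to be strong.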

7 ways, just off the top of my head, to pwn someone with this “centralized risk:”

http://sophware.posterous.com/even-coding-horror-may-not-have-a-good-answer

I agree with the suggestion that passwords are pre-historic and need to be discarded altogether.

Yet, the internet driver’s license is not going to stop hackers from hacking information stored on the server.

I will try to explain why…

There are many sites out there that store user information such as name, address, email, phone number, etc. in plain form, i.e. as text on the database server.

The authentication using user name / password or internet driver license (IDL) only provides a gateway to access / edit that information by the owner.

This does not in any way prevent someone from downloading everyone’s personal information as described above, using SQL injection or by hacking the database admin password.

Therefore, it is important not just to hash/secure the user credentials but also even more important to encrypt a user’s personal information before storing on the server.

Now about passwords. Having a password or IDL (which again is authenticated using a password) in the first place is a security risk.

What if there are no passwords at all… Then there is nothing to hack. If the information stored on the server is encrypted using a seed that is not stored on the server, then hackers face a nightmarish scenario: they first have to find out the seed (which, again, is unique for each user) and then use it to decrypt the user information one user at a time.

I feel the current method of authentication is crap and needs a complete overhaul, doing away with passwords or IDL altogether.

Instead, leave the seed/key with the user (in the form of device authentication), or use an identity server that does not store user credentials and yet authenticates users by generating keys in real time from a combination of the user credential plus their device identity.

All this sounds crazy, but I have been able to devise such a method to do away with passwords completely. You can try it out at www.0pass.com

Though there is a password field, you can sign in from a registered device without entering a password.

I am considering doing this on my website: no user-selected passwords, and account creation and login would look the same. The user gives an email address and then needs to prove they own that email address by clicking on a link in an email sent to it. The link sets a cookie, and the user is logged in while the cookie is valid. There is no need to rely on a 3rd-party authentication service, and it is more secure if the link in the email is only valid for a short time.
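One way the short-lived link could work is a signed token that carries the email address and an issue time. This is only a sketch under assumptions: `SERVER_KEY`, the 15-minute lifetime, and the token layout are all made up for illustration, and a real deployment would also want each token to be usable only once.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical server-side secret; in practice it must persist across restarts.
SERVER_KEY = secrets.token_bytes(32)
TOKEN_TTL_SECONDS = 15 * 60  # assumed lifetime of the emailed link


def _sign(payload: str) -> str:
    return hmac.new(SERVER_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_login_token(email: str, now=None) -> str:
    """Build the token that goes into the emailed link."""
    issued = int(time.time() if now is None else now)
    payload = f"{email}|{issued}"
    return f"{payload}|{_sign(payload)}"


def verify_login_token(token: str, now=None):
    """Return the email if the token is authentic and fresh, else None."""
    try:
        email, issued_text, signature = token.rsplit("|", 2)
        issued = int(issued_text)
    except ValueError:
        return None
    expected = _sign(f"{email}|{issued_text}")
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    current = time.time() if now is None else now
    if current - issued > TOKEN_TTL_SECONDS:
        return None
    return email
```

The server would email a URL containing `make_login_token(address)` and set the session cookie only when `verify_login_token` returns that address. Note that anyone who can read the mailbox can log in, so the email account remains the single point of failure.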

“Sure, we’re centralizing risk here to, say, Google, or Facebook – but I trust Google a heck of a lot more than I trust J. Random Website, and this really is no different in practice than having password recovery emails sent to your GMail account.”

That’s the heart of it. Trouble is, while I might trust Google’s ethics or tech savvy more than J. Random’s, I can never trust them fully. They aren’t invincible, and centralizing the risk is not worth it (and you are not simply centralizing your own risk, you are advocating centralizing everyone’s risk). If I don’t know how to keep my sensitive identities completely separate (including separate email accounts) from my more trivial, throw-away identities, then educate me. Don’t force me to centralize my risk. I loathe the day when I have to use my Google account or iDriver’s License to log into other web sites. I would much rather have the freedom to set up a separate account with each entity.

While I have an OpenID account and I see it as a good alternative, it is far from a perfect solution. There are better solutions out there, as some of the commenters already mentioned.

I blogged about it: http://www.cromis.net/blog/2010/12/your-passwords-security-on-the-internet/

@Nishith Prabhakar: “How does one know if the site is safe (storing a salted hash) and not a gawker?”

The most reliable indication is whether they let you “retrieve” your password or only allow you to reset it. Most of the time, passwords are stored reversibly precisely so the site can send them out to users who forget them.

Of course, I’m fairly certain Gawker didn’t allow password retrieval either. They were the special kind of incompetent that didn’t have a requirement for password retrieval yet stored them insecurely anyway.

@Adam Rosenfield (and others): “Gawker did NOT store passwords. You are flat-out wrong there, Jeff. They stored the standard DES hashes of passwords as computed by crypt($password, “xy”), where “xy” is a random two-character salt (http://php.net/manual/en/function.crypt.php).”

DES is reversible encryption. Yes, ‘crypt’ uses the password as the seed rather than as the encrypted value, so there are theoretically multiple possible decryptions for each “hash”, but the entropy there is incredibly low. Almost all of the Gawker password hashes have been reversed at this point. DES reversal is just a matter of time and computing cycles, then throw out the ones with characters the keyboard can’t easily produce.

That having been said: yes, if you are just going to brute-force, you can guess the passwords of the strength seen in Gawker’s database just as easily had they been storing 5xMD5 hashes or similarly non-reversible storage approaches. No matter how well a site stores passwords, if your common password is “123456” or, apparently, “monkey” (???), it will get guessed and verified with such a database dump.
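Here is a small sketch of that point: even with a proper per-user salt, a leaked dump lets an attacker test a short list of common passwords against every account. The wordlist, the user names, and the single-round SHA-256 scheme are assumptions for illustration only, not what Gawker used.

```python
import hashlib
import os

# Tiny assumed wordlist; real attacks use lists with millions of entries.
COMMON_PASSWORDS = ["123456", "password", "monkey", "qwerty", "letmein"]


def salted_hash(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def make_record(password: str):
    """Simulate what a site would store: a random salt and the salted hash."""
    salt = os.urandom(16)
    return salt, salted_hash(password, salt)


def crack(dump: dict, wordlist: list) -> dict:
    """Try every wordlist entry against every (salt, digest) in the dump."""
    cracked = {}
    for user, (salt, digest) in dump.items():
        for guess in wordlist:
            if salted_hash(guess, salt) == digest:
                cracked[user] = guess
                break
    return cracked


# Hypothetical leaked dump with one weak and one strong password.
dump = {"alice": make_record("123456"), "bob": make_record("tiaucpw4ts")}
```

Running `crack(dump, COMMON_PASSWORDS)` recovers the weak password and leaves the strong one alone. The salt only stops the attacker from reusing one precomputed table across users; it does nothing for a password that sits in everyone’s wordlist.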

What Gawker did was expose the guy who has the non-common brute-force-resistant password like “tiaucpw4ts” (this is an un-crackable password 4 this site). That guy’s doing everything “right”, but if he went to the trouble to do things “right” with that password he is probably using it on more than one site; Gawker just gave away that password everywhere.

I haven’t really given much thought to using an “internet driver’s license” type log-in. However, I do see your point and think it’s a very good one. I do wonder though…

Is it wise to give one company that much control over all of the sites you visit? What happens if one of the companies gets hacked? Goes down? Decides to sell? Etc.