We've read so many sad stories about communities that were fatally compromised or destroyed due to security exploits. We took that lesson to heart when we founded the Discourse project; we endeavor to build open source software that is secure and safe for everyone by default, even if there are thousands, or millions, of them out there.
Great post, but I have a comment on the idea of putting 8 GPU cards in a machine, within reach of a small business budget… Please have a look at the smallest Dutch supercomputer, the Little Green Machine, which has 4 machines with 4 GPU cards each. It looks a little more complicated to build than a wealthy business person could manage. Still, impressive performance. Link to the Little Green Machine. I have read another site with even more technical details, but do not have the URL at hand, sorry. I still remember that the InfiniBand interconnect is one of the main cost factors.
A hash type table? You don’t need to reinvent the wheel. The crypt hash format already specifies type, salt, and difficulty as determined by the algorithm. Just use bcrypt, and when you need to strengthen passwords, rehash on login and raise the rounds parameter by 1 (it’s 2^rounds). There are plenty of other crypt algorithms as well – $5$ (Linux SHA-256) and $S$ (Drupal) to name a few. Passwords aren’t hard unless you’re unwilling to use libraries other people have made.
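A rough sketch of the rehash-on-login idea, for anyone who wants to see it spelled out. This assumes Python and, to stay within the standard library, swaps in PBKDF2-HMAC-SHA256 as a stand-in for bcrypt; the `$pbkdf2-sha256$...` string layout and the function names are made up for illustration, not a real crypt format. The point is only that the stored string carries its own scheme, cost, and salt, so the cost can be raised whenever the user logs in.

```python
import base64
import hashlib
import hmac
import os

SCHEME = "pbkdf2-sha256"  # hypothetical label, stand-in for bcrypt's "$2b$"


def hash_password(password: str, log_rounds: int) -> str:
    """Return a self-describing string: $scheme$log_rounds$salt$digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 2 ** log_rounds)
    return "$".join([
        "",
        SCHEME,
        str(log_rounds),
        base64.b64encode(salt).decode(),
        base64.b64encode(digest).decode(),
    ])


def verify_and_upgrade(password: str, stored: str, target_log_rounds: int):
    """Check the password; if it matches and the cost is stale, rehash it.

    Returns (ok, value_to_store). The plaintext is only available at login,
    which is why the upgrade has to happen here.
    """
    _, scheme, log_rounds, salt_b64, digest_b64 = stored.split("$")
    if scheme != SCHEME:
        raise ValueError(f"unsupported scheme: {scheme}")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 2 ** int(log_rounds))
    if not hmac.compare_digest(actual, expected):
        return False, stored
    if int(log_rounds) < target_log_rounds:
        return True, hash_password(password, target_log_rounds)
    return True, stored
```

With a real bcrypt library the parsing step goes away, since the library reads the cost and salt out of the hash string itself.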
You are right in that these numbers show that the security should be further improved. But the hacking example clearly shows that, no matter how many password requirements you implement, most of the password length is wasted on common words and names, and the extra chars are often something like “123”. If we switched to automatically generated random passwords, we could do with much shorter ones. According to An Administrator’s Guide to Internet Password Research, also published in CACM (Nov 2016) and presented at Usenix (2014), it is sufficient to have a password space of 10^6, i.e. 5 chars, combined with throttling for password-guessing hackers.
Having a strong algorithm is nice, but why not just wrap the hashes?
It’s much harder to guess a hash’s algorithm if it is wrapped in several other algorithms, and an attacker then needs custom tools to brute-force the hashes instead of just using hashcat.
You might want to add a few canary users to the database. These would be users that don’t actually exist, with passwords that are easy to crack. If anyone ever tries to log into one of those, you know your database has been breached.
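A minimal sketch of what that check could look like at the login endpoint, assuming Python. The canary usernames and the `alerts` list are placeholders; a real setup would page someone or write to an audit log instead.

```python
# Hypothetical canary accounts: they exist in the users table with weak
# passwords, but no real person ever logs in as them.
CANARY_USERNAMES = frozenset({"canary_alice", "canary_bob"})


def check_login_attempt(username: str, alerts: list) -> bool:
    """Return True if the attempt may proceed to normal authentication.

    Any attempt on a canary account is treated as a sign that the user
    table has leaked, so it is recorded and rejected outright.
    """
    if username.lower() in CANARY_USERNAMES:
        alerts.append(f"canary login attempted: {username}")
        return False
    return True
```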
The nightmare scenario explored here is when the database is leaked though, whereas the paper is talking about throttling the login endpoint. If an attacker has the DB, the only “throttling” that can be done is via a more time-consuming hashing algorithm.
Mark, I’m no database expert (I’m about 2/3 of my way through a bachelor’s degree in computer science), but I’m thinking your idea is actually a great one - with a tweak: don’t have a few canary users… have many canary users. As in pow(2, n) - 1 canary users for every real user.
If the username lookup is a binary search, this adds n steps to the lookup time (which is going to be trivial), at the cost of bloating the database somewhat. But disk space is getting pretty inexpensive these days…
(I admit, maintenance of this would be a pain… some analysis could separate fake users from real ones if it isn’t done right.)
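To put numbers on the lookup cost above, here is a quick back-of-the-envelope sketch in Python. It assumes an idealized binary search over all rows, which is a simplification of what a real database index does; the function name is made up.

```python
import math


def lookup_steps(real_users: int, n: int) -> int:
    """Worst-case binary search steps over real users plus canaries.

    Each real user comes with 2**n - 1 canaries, so the table holds
    real_users * 2**n rows, and log2 of that is log2(real_users) + n.
    """
    total_rows = real_users * 2 ** n
    return math.ceil(math.log2(total_rows))


# With a million real users, going from no canaries to 15 canaries per
# real user (n = 4) multiplies the table size by 16 but adds 4 steps.
extra_steps = lookup_steps(1_000_000, 4) - lookup_steps(1_000_000, 0)
```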
One aspect that you left out here: GPUs might not be your worst enemy. FPGAs can also be used to crack passwords efficiently, which is not to be underestimated if you are expecting a dedicated attacker. And while bcrypt has been designed to be inefficient on GPUs, it can still be cracked rather efficiently on FPGAs. That’s the issue that scrypt is meant to address.
Also, the comparison in your post is misleading: the number of hashes per second means nothing if you don’t mention the corresponding number of iterations. All algorithms allow tweaking this, to make the task more computationally intensive as the available hardware improves over time.
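To illustrate the iterations point with a small Python sketch: the figures in the comments are invented for the example, not measurements of any real hardware or algorithm. What matters to an attacker is password guesses per second, which is the raw rate of the underlying primitive divided by the iteration count.

```python
def guesses_per_second(primitive_rate: float, iterations: int) -> float:
    """Password guesses per second, given the raw rate of the underlying
    hash primitive and how many iterations one password hash costs."""
    return primitive_rate / iterations


def seconds_to_exhaust(keyspace: int, primitive_rate: float, iterations: int) -> float:
    """Time to try every password in the keyspace at that guess rate."""
    return keyspace / guesses_per_second(primitive_rate, iterations)


# Illustrative only: the same hypothetical 10**10 primitive hashes/sec
# yields very different guess rates at 1 iteration versus 100,000.
fast = guesses_per_second(10**10, 1)
slow = guesses_per_second(10**10, 100_000)
```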
Is argon2 considered proven, battle tested, and ready for wide scale adoption? The Wikipedia page on it is unclear. I don’t think crypto people like to use things that are too new for that reason…
I think you misunderstood what I meant. To change the hash we use at Discourse we have to support both the old and new formats in the Discourse code (as well as any future formats).
I guess that is sort of true but it feels to me like security through obscurity. Just increase the work factor.
Definitely a neat idea, but it’d need to be a unique canary user per site.
That’s fine, and overlap with bitcoin specific hardware is definitely bad, as in, there may be some monster hashing hardware out there. Do you have any links to where to buy it, numbers, etc? The main takeaway I have is “don’t pick any password hash that remotely resembles what bitcoin uses” as that is a rich vein of madness.
This is covered several times in the post. Search for “there are two factors that go into password hash strength” if you don’t believe me.