One of the biggest concerns around managing the passwords of an organization’s employees lies in how to store those passwords on a computer.
Keeping every user’s password in a plain text file, for example, is too risky. Even if there are no bugs to recklessly leak the passwords to the console, there’s little to stop a disgruntled systems administrator taking a peek at the file for pleasure or profit. Another line of defense is needed.
Let’s hash it out
Back in the 1970s, Unix systems began to ‘hash’ passwords instead of keeping them in plain text. A hash function is used to calculate a value (like a number) for each password or phrase, in such a way that, while the calculation itself may be easy, carrying out ‘in reverse’ – to find the original password – is hard.
By way of illustration, suppose we take an English word, and assign each letter a value: i.e. A=1, B=2, C=3 and so on. Each adjacent pair of letters in the word is then multiplied together, and added up. The “hash” of the word is this total so, using this method, the word BEAD has a hash value of (BxE) (ExA) (AxD) = (2×5) (5×1) (1×4) = 19. FISH scores 377, LOWLY scores 1101, and so on.
Using this system, the password file would store a number for each user, rather than the password itself. Suppose, for example, the password file entry for me has the number 2017. When I log in, I type in my password, the computer carries out the calculation above and, if the result is 2017, it lets me in. If, however, the calculation results in another value, access is denied.
As all that’s stored in the password file is the value 2017, and not my actual password, it means that if a hacker steals the entire contents of the file, there is still a puzzle to solve before they can log in as me.
Although hashed passwords may be more secure than plaintext, there still remains a problem. The aim of a dictionary attack is to obtain a list of all English words and calculate their hash values, one by one; if my word is in there, it will be found eventually. However, while this may sound like a painful amount of work, the point is that it won’t just crack my password – it will crack every password.
An index is created in such an attack, which is then sorted by hash value, with individual words added to the index as their hash values are calculated: BAP goes on page 18, for example, BUN goes on 336, and CAT on page 23. ‘Reversing’ the hash function is then just a matter of looking up the word in the index – simply turn to page 2017 and you’ll find my password.
During World War II, the cryptanalysts at Bletchley Park did literally that: they worked out every possible way in which the common German word ‘eins’ could be enciphered using the Enigma machine, and recorded the Enigma settings as they went. The results were then sorted alphabetically into the so-called ‘eins catalogue’ meaning that, if the codebreakers could guess which encrypted letters represented the plaintext ‘eins’, they were then able to simply rummage through a battered green filing cabinet and pull out the key.
Salt in the wound
The next layer of defense against a dictionary attack is to use what’s called salt. A random variation to the calculation is applied differently for each user’s password in a salted hash scheme. One user could have A=17, B=5, C=13, and so on, for example, and another could have A=4, B=22, C=17. The password file would then store the salt (the A, B, C values) and the hash result. The computer could still carry out a quick calculation to check the password, but the variation means that the same password would have a different hash value for a different user.
It would therefore be impossible to compile a single dictionary that could successfully reverse the hash for everyone.
Finally, the best modern systems use a so-called iterated hash. The idea of this is to make the hash function itself harder to calculate by re-hashing the data thousands of times. This does slow down the computer checking the passwords, but anyone trying to search for a password will also be slowed by the same factor. The end result is essentially a computing power arms race between system administrators and hackers although, if you’re Amazon or Microsoft, it’s a fight you’re well placed to win.
Protecting user passwords is critical to the security of an organization’s confidential files and information. It’s vital therefore that steps are taken to protect passwords, encrypting them to such a degree that even the most determined criminal will find it impossible to decipher.