Part seven of the series detailing the OWASP top 10 web application vulnerabilities with a focus on password hashing. (See intro)

"Insecure cryptographic storage" relates to a number of aspects, but I think that it can be broken down to two main areas: Encryption and Hashing.

As these are similar in some respects and are often both used together, there's a bit of confusion around what they are.

Firstly, encryption uses a mathematical formula to transform human readable data into an unreadable form by means of a key. Encryption is often a symmetric process: the same key (or one trivially derived from it) is used to lock (encrypt) the data and to unlock (decrypt) it. This differs from asymmetric (or public-key) encryption, where two different keys are employed, one for locking and the other for unlocking. One constant in encryption is that there is a key which must be kept safe. The key is simply a sequence of data, and it may be stored in a file on the server if a computer program needs it continuously. This implies that anyone who can break into the server and get access to the key can unlock all the data.
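To make the "same key locks and unlocks" idea concrete, here is a toy sketch of symmetric encryption using a one-time XOR key. The function name xor_bytes and the sample message are my own illustrations, and this is not something to use in production; a real system would use a vetted cipher from a maintained library.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching byte of key (toy cipher)."""
    if len(key) != len(data):
        raise ValueError("this toy scheme needs a key as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # the secret that must be kept safe

ciphertext = xor_bytes(message, key)     # lock (encrypt)
recovered = xor_bytes(ciphertext, key)   # unlock (decrypt) with the SAME key

print(recovered == message)
```

Anyone holding the key can run the second call, which is exactly why a key sitting in a file on a compromised server exposes all the data.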

Hashing is similar to encryption in that it transforms data from a human readable form into an unreadable form via a mathematical function. The primary difference between the two is that hashing is a one-way function. In other words, given the hash (or resultant) code, nobody should be able to work out the original data. An example of a hash value for the entered phrase "test" using the MD5 hash algorithm is: 098f6bcd4621d373cade4e832627b4f6
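The MD5 value quoted above can be reproduced with Python's standard hashlib module. The helper name md5_hex is just for illustration, and MD5 is used only because it is the algorithm in the example.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as a hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


print(md5_hex("test"))  # 098f6bcd4621d373cade4e832627b4f6
```

Note that the function only goes one way: there is no corresponding call that turns the digest back into "test".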

If you can't ever retrieve the original data, what use is this? One of the common uses is for securing passwords. The way in which it works can be explained by means of an account registration and login example. Upon account creation, the password is hashed, giving a block of unreadable data. This is then stored in the database as the "password". When the user enters their password during the login process, the entered password is hashed again and the two hashed values are compared. If they are the same, the user entered a valid password. Note how the password in human readable form is never needed to determine if the user has entered the correct password. So, even if a hacker got access to the system's source code and the hashed passwords, a hashed password can't be reversed, and it should therefore be impossible to crack someone's password. Not quite...
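The registration and login flow just described might look roughly like the sketch below. The users dictionary stands in for a database, the function names are hypothetical, and SHA-256 is my own choice of hash. The passwords here are deliberately unsalted, matching the description so far, so this is not yet a safe design.

```python
import hashlib
import hmac

# Hypothetical in-memory "database": username -> stored password hash.
users = {}


def hash_password(password: str) -> str:
    """Hash the password; the readable form is never stored."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    users[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the entered password again and compare the two hash values.
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "s3cret")
print(login("alice", "s3cret"))  # True
print(login("alice", "wrong"))   # False
```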

There are a number of techniques employed in cracking passwords. Firstly, dictionary attacks take a dictionary of words and try each one sequentially until a match is found. They also try combinations of words, or words with prefixed and/or appended numbers. As it is much simpler to remember a name or word, people invariably choose simple passwords, and therefore the dictionary attack is amazingly effective. This highlights the importance of having a minimum strength password policy in place, forcing a user to select a password that combines uppercase and lowercase letters, digits and punctuation.
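A dictionary attack of the kind described above can be sketched in a few lines. The word list, the suffix range and the function names are made up for illustration, and MD5 is used only to match the earlier example.

```python
import hashlib
from typing import Iterable, Iterator, Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def candidates(words: Iterable[str], max_number: int = 99) -> Iterator[str]:
    """Yield each word, then the word with a number appended or prefixed."""
    for word in words:
        yield word
        for n in range(max_number + 1):
            yield f"{word}{n}"
            yield f"{n}{word}"


def dictionary_attack(target_hash: str, words: Iterable[str]) -> Optional[str]:
    """Try every candidate until one hashes to the target value."""
    for guess in candidates(words):
        if md5_hex(guess) == target_hash:
            return guess
    return None


print(dictionary_attack("098f6bcd4621d373cade4e832627b4f6", ["apple", "test"]))
# test
```

A password that is not a dictionary word plus a simple number never appears among the candidates, which is the point of a strength policy.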

The other widely used approach is the use of rainbow tables. Basically, a hacker has a stored table of data which in essence contains two things: passwords and the hashed value for each password. Additionally, these hashed values are indexed in the database, which makes it very quick to look up a given hash value and determine the corresponding password. Strictly speaking, a true rainbow table stores chains of hashes rather than every pair, which reduces the storage needed, but the principle is the same. This approach uses the time/memory trade-off: the tables are very large but allow much quicker cracking of hash values.
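The lookup idea can be illustrated with a plain precomputed dictionary. This is a simplified stand-in rather than a real rainbow table, and the password list and function name are invented for the example.

```python
import hashlib
from typing import Dict, Iterable


def build_lookup_table(passwords: Iterable[str]) -> Dict[str, str]:
    """Precompute hash -> password once; every later lookup is near-instant."""
    return {hashlib.md5(p.encode("utf-8")).hexdigest(): p for p in passwords}


# Built once, ahead of time, at the cost of memory.
table = build_lookup_table(["password", "letmein", "test", "qwerty"])

# "Cracking" a stolen hash is now a single dictionary lookup.
print(table.get("098f6bcd4621d373cade4e832627b4f6"))  # test
```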

As an example of how easy this can be, using the hash example above, the hash value 098f6bcd4621d373cade4e832627b4f6 can be "broken" in seconds using an online web-based tool at: http://www.md5rainbow.com/

The way to defeat rainbow tables is to add a salt to the password before it is hashed. A salt is a random set of data that is combined with the given password, which makes cracking the password by means of lookup tables unlikely.

As an example, a 3 letter password containing only letters could have 17576 different possibilities (26 x 26 x 26). If another 3 letters were added before hashing, such that the final string = salt + password, there would be 308915776 permutations. The space required to precompute all the possibilities for all salts grows exponentially as a longer salt is used. This renders generating a pre-compiled table infeasible.
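The arithmetic above is easy to check. The sketch below assumes, as the example does, an alphabet of 26 letters; the function name keyspace is just illustrative.

```python
ALPHABET_SIZE = 26  # letters only, as in the example above


def keyspace(length: int) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return ALPHABET_SIZE ** length


print(keyspace(3))      # 17576      (3 letter password)
print(keyspace(3 + 3))  # 308915776  (3 letter salt + 3 letter password)

# Each extra salt character multiplies the precomputation work by 26.
print(keyspace(6) // keyspace(3))  # 17576
```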

Once the concatenated string has been hashed, both the salt and the hash are stored in the data store. The salt is used again later to validate an entered password: the same salt is combined with the entered password, the resultant string is hashed, and the result is compared to the stored value. Storing the salt with the password may sound counterintuitive, but its sole purpose is to eliminate the possibility of using a pre-compiled table to crack passwords.
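Storing and checking a salted hash could be sketched as follows. The function names, the 16 byte salt length and the use of SHA-256 are assumptions of mine rather than anything prescribed here.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_with_salt(password: str, salt: bytes) -> str:
    """Hash salt + password, matching 'final string = salt + password'."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def create_record(password: str) -> Tuple[bytes, str]:
    """Return the (salt, hash) pair that would be written to the data store."""
    salt = secrets.token_bytes(16)  # random salt, stored alongside the hash
    return salt, hash_with_salt(password, salt)


def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    """Re-hash the entered password with the stored salt and compare."""
    return hmac.compare_digest(hash_with_salt(password, salt), stored_hash)


salt, stored = create_record("test")
print(verify("test", salt, stored))   # True
print(verify("guess", salt, stored))  # False
```

The salt is not a secret. It only has to be different enough that no ready-made table already covers it.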

One more point about salts is that every salt must be unique. If all your records are hashed with the same salt, then a determined hacker would only need to generate a single rainbow table for that salt and could then look up any password. If every salt is unique, the hacker would have to regenerate the table for every hashed password, making the attack much more difficult.
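The effect of shared versus unique salts can be seen by hashing the same password twice. This is a small sketch with made-up names, again assuming SHA-256 and 16 byte random salts.

```python
import hashlib
import secrets


def salted_hash(password: str, salt: bytes) -> str:
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


# Two users with the same password and one shared salt: identical hashes,
# so one precomputed table for that salt covers every record.
shared_salt = secrets.token_bytes(16)
same_a = salted_hash("test", shared_salt)
same_b = salted_hash("test", shared_salt)

# The same password with a fresh salt per user: the hashes no longer match,
# so a table built for one record is useless against the other.
unique_a = salted_hash("test", secrets.token_bytes(16))
unique_b = salted_hash("test", secrets.token_bytes(16))

print(same_a == same_b)      # True
print(unique_a == unique_b)  # False
```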

Finally, there are a number of different hashing algorithms and some are better suited to particular jobs than others.

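Some of the more common ones include SHA-256, MD5 and WHIRLPOOL. The SHA family of hash algorithms (the SHA-2 variants such as SHA-256 in particular) is probably one of the better general purpose choices, but check that any hash algorithm you choose is still considered secure. MD5, for instance, is now regarded as broken for security purposes and appears in this article only as an illustration.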
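As a quick way to compare algorithms, the sketch below prints the digest size of a few hashes available in Python's hashlib. The function name digest_bits is illustrative, and WHIRLPOOL is left out because hashlib does not guarantee it on every platform.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the output size, in bits, of the named hash algorithm."""
    return len(hashlib.new(name, b"test").digest()) * 8


for name in ("md5", "sha1", "sha256", "sha512"):
    print(f"{name}: {digest_bits(name)} bits")
# md5: 128 bits
# sha1: 160 bits
# sha256: 256 bits
# sha512: 512 bits
```

A longer digest is not the whole story of whether an algorithm is secure, but it is one visible difference between them.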

