Key derivation using hash functions

By Martin McBride, 2017-04-09
Tags: cryptography key derivation hashing
Categories: cryptography

The purpose of a key derivation algorithm is to take a password (which is usually a variable length string of ASCII characters) and convert it into an encryption key (a binary string of a specific length).

A [[data formats:cryptography:Cryptographic hashes|hash]] algorithm performs exactly this job - it creates a fixed length binary signature for an arbitrary input message. In this case the password is used as the message and the resulting signature is used as the encryption key.

For example, the password "aardvark" yields the hash code (using the MD5 hash algorithm):

88571E5D5E13A4A6F82CEA7802F6255

This is quite acceptable for use as a 128 bit key, although it is not advisable to choose a dictionary word (like aardvark) as a password. Such a password is susceptible to a [[data formats:cryptography:Dictionary attacks on keys|dictionary attack]] or just a wild guess.
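As a rough illustration, the conversion from password to 128 bit key might be sketched in Python as follows. The function name `derive_key_md5` is made up for this example, and the choice of UTF-8 to turn the password into bytes is an assumption (the article does not specify an encoding).

```python
import hashlib


def derive_key_md5(password: str) -> bytes:
    """Hash the password with MD5 and return the 16 byte digest as the key.

    Hypothetical helper for illustration; UTF-8 encoding is an assumption.
    """
    return hashlib.md5(password.encode("utf-8")).digest()


key = derive_key_md5("aardvark")
print(len(key) * 8)       # key length in bits
print(key.hex().upper())  # the key written as hexadecimal digits
```

The same call works unchanged for a PIN or a longer passphrase, because the hash accepts input of any length and always returns a digest of the same size.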

If your system is based on a PIN code (even less secure than a password, from the point of view of dictionary attacks), this can just as easily be converted to a binary key, for example "4578" yields:

C289D44D06BAFB6C7B4AA194857CCBC

For a more secure system, where the user is permitted to enter an entire phrase, the phrase "Oliver thinks that passphrases are ... more secure!" gives us the key:

3664C1D79B6845FB8464E35B4ECD8

This is a lot more difficult to crack. All the words in the phrase are dictionary words or common names, but an attacker would have to get them all in the right order, and get the punctuation correct.

Matching an exact key size

There are many different hash algorithms, and they generate various different hash lengths. You might be able to find a hash algorithm which generates the exact key size you need. But what if you need a specific size and there is no algorithm which creates that size hash? And perhaps you don't want to keep changing to a different hash algorithm each time you need a different key.

As a general rule, it is safe to choose a hash which is longer than the key size you need, and then truncate the hash signature so that it matches the required key size.

For example, suppose you are using 3DES encryption, which requires a 192 bit key. You might decide to use SHA256 as the hash in your key derivation function. SHA256 produces a 256 bit signature - it is considered safe to truncate this to 192 bits, by taking the first 192 bits and discarding the remainder.

Of course, there is nothing special about the first 192 bits. You could use the final 192 bits, or the middle 192 bits, and so on; there is no particular benefit or disadvantage in terms of security. In practical terms, it is often easiest just to take the first 'n' bits and discard the rest.
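The truncation described above could be sketched like this in Python. The name `derive_key`, its `key_bits` parameter, and the UTF-8 encoding of the password are assumptions made for the example, not part of any standard.

```python
import hashlib


def derive_key(password: str, key_bits: int) -> bytes:
    """Hash the password with SHA256 and keep the first key_bits bits.

    Hypothetical helper for illustration; UTF-8 encoding is an assumption.
    """
    if key_bits <= 0 or key_bits % 8 != 0:
        raise ValueError("key_bits must be a positive multiple of 8")
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 bytes
    key_bytes = key_bits // 8
    if key_bytes > len(digest):
        raise ValueError("key is longer than the SHA256 hash length")
    # Keep the leading bytes and discard the remainder
    return digest[:key_bytes]


key = derive_key("aardvark", 192)
print(len(key) * 8)  # 192
```

Taking the final 192 bits instead would only mean changing the slice to `digest[-key_bytes:]`.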

Unfortunately, it is not so straightforward to produce a key which is longer than the hash length. With this approach, you need to choose a hash algorithm whose output is at least as long as the required key. This is not usually a problem, because the SHA-2 family provides hash lengths of up to 512 bits (RIPEMD goes up to 320 bits), which is more than enough for any current mainstream encryption algorithm.
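One way to apply this rule in code is to pick the shortest hash whose output is at least as long as the key. The sketch below assumes a candidate list of three SHA-2 variants; both the list and the name `pick_hash` are choices made for this example.

```python
import hashlib

# Candidate hashes, shortest first (an illustrative selection)
CANDIDATES = ("sha256", "sha384", "sha512")


def pick_hash(key_bits: int) -> str:
    """Return the name of the shortest candidate hash that covers key_bits."""
    for name in CANDIDATES:
        # digest_size is the hash length in bytes
        if hashlib.new(name).digest_size * 8 >= key_bits:
            return name
    raise ValueError("no candidate hash is long enough for this key")


print(pick_hash(192))  # sha256
print(pick_hash(512))  # sha512
```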

If you did need a longer key for any reason, there is an alternative technique, based on [[data formats:cryptography:Key derivation using random number generators|pseudo-random number generators]].
