Salting
Categories: cryptography
The purpose of salting is to make dictionary attacks more time consuming, and if possible infeasible, to an attacker.
Salting
To implement salting, a system needs to store a unique, random string for every user or account on the system. This string is called a salt, and every time the system derives a key it appends the salt onto the password first. For example, suppose the system has assigned a salt "texuxosaceby" to User1. Whenever the system needs to derive a key for User1, it will append "texuxosaceby" to the password. If User1 enters a password of "aardvark", the key will be derived from "aardvarktexuxosaceby".
Similarly, if User2 has been assigned a salt "udawyfijagok", and types in a password of "abacus", the key will be derived from "abacusudawyfijagok".
It is important to realise that the user does not need to know anything about the salt value. All the user needs to do is enter the password. A system which implements salting will usually create and maintain salt values behind the scenes, and store them in some kind of database so that they are available wherever they are needed to derive a key.
The purpose of salting is to ensure that an attacker cannot use the same dictionary to attack every user - he must derive a new dictionary for every user (because every user has a different salt value). This can make the attack very time consuming. For example, suppose it takes 1 second to derive a key on a normal PC - that is not an unreasonable time to wait to log on, in most cases. For an attacker, to generate a dictionary of 1 million keys would take almost 12 days (or less if he uses a bank of PC's, or a more poweful computer). Without salting, this dictionary can be used to attack every user on the system (or indeed, every user on any system which uses the same key derivation technique). This is probably a good investment of 12 days of computing power, as far as the attacker is concerned.
With salting, the attacker must repeat this calculation for every single user, just in the hope that he might eventually find a user with a weak password. This is clearly a much less attractive option, especially combined with a user base who are encouraged to use strong passwords.
An additional advantage of salting is that the attacker needs to find out the salt values for each user before any attack is possible. This presents an additional hurdle for any attacker. However, ultimately these salt values are stored on a computer database, and so can never be totally secure. You cannot rely on the salt values being kept secret, but it certainly does no harm to protect them as far as possible.
Iterated calculations
Salting relies on the fact that it is reasonably difficult to create a dictionary of around a million entries. We previously gave the example that if a single pass of the key derivation algorithm takes 1 second (on a normal PC), then it would take 12 days for an attacker to create a dictionary on a similar PC.
In fact, it doesn't take anything like 1 second to calculate a simple hash on a modern PC. A current PC could easily compute thousands of hashes per second, bringing the time to create a million entry dictionary down to just a few minutes. This more or less defeats salting, and computers can only get faster as time goes on.
To get around this, we modify the algorithm to make it more time consuming to calculate. We combine the salt and the password, and calculate the hash as normal. Then we calculate the hash of the hash, and use that as the key. In this way, the key derivation takes almost twice as long, so the attackers job is twice as difficult.
Of course there is no need to limit ourselves to 2 iterations, we could calculate the hash of a hash of a hash ... a thousand times or more if necessary, until we get a key derivation function which is slow enough to cause a problem for the hacker (but still fast enough to be useable by ordinary users).
There are a couple of assumptions about the hash function used here. First, we are assuming that applying the hash over and over will not result in any form of convergence (ie, all passwords end up create one of a small number of keys). We are also assuming that there is no shortcut available to allow the result of many iterations of the hash to be calculated more easily. As far as is known, with hashes such as MD5, SHA and RIPE, these assumptions are true.
Of course, whatever iteration count you choose now might be inadequate in several years time as computing power improves. It is a good idea to include the iteration count in a database, along with the salt value, for each user. This means that in the future, it will be possible to increase the iteration count, if necessary to keep up with increasing computer speeds. Normally, a change of salt and iteration values would take place at the same time as the user changed their password as part of routine security.
See also
- Symmetric encryption
- Applications of symmetric encryption
- Symmetric block ciphers
- Symmetric encryption algorithms
- Cryptographic modes
- Block padding methods
- Attacks on symmetric ciphers
- Cryptographic hashes
- Strong hashing functions
- Applications of hashes
- Common hash algorithms
- Attacks on hash algorithms
- Iterative hashes
- Message authentication codes
- Common MAC algorithms
- HMAC algorithm
- Key derivation
- Dictionary attacks on keys
- Key derivation using hash functions
- Key derivation using random number generators
- Key derivation standards
Sign up to the Creative Coding Newletter
Join my newsletter to receive occasional emails when new content is added, using the form below:
Popular tags
555 timer abstract data type abstraction addition algorithm and gate array ascii ascii85 base32 base64 battery binary binary encoding binary search bit block cipher block padding byte canvas colour coming soon computer music condition cryptographic attacks cryptography decomposition decryption deduplication dictionary attack encryption file server flash memory hard drive hashing hexadecimal hmac html image insertion sort ip address key derivation lamp linear search list mac mac address mesh network message authentication code music nand gate network storage none nor gate not gate op-amp or gate pixel private key python quantisation queue raid ram relational operator resources rgb rom search sort sound synthesis ssd star network supercollider svg switch symmetric encryption truth table turtle graphics yenc