Dictionary attacks on keys
Categories: cryptography
Search Spaces
In a previous section we mentioned the possibility of using a 4 digit PIN to generate a key. This works perfectly well, for example for example "4578" yields a key of
C289D44D06BAFB6C7B4AA194857CCBC
if we use MD5 as the hash. This key looks impressively random and uncrackable, but of course there is a serious problem. A user is only allowed to select a 4 digit PIN, and there are only 10,000 possible 4 digit PIN's (0000 through to 9999). Of all the possible 128 bit codes available, only 10,000 will actually ever be used, because these are the only ones which can possible be created from a 4 digit PIN.
Worst still, a smart attacker can quite easily calculate all 10,000 possible keys. If he then intercepts a message, he only needs to try decoding it using each of the 10,000 possible keys and he is guaranteed to break the encryption. Trying 10,000 keys is absolutely trivial using an ordinary PC. The set of possible keys which can be generated from the available passwords is called the search space. If the search space is too small, the key is not secure.
If a system allows longer passwords, and more variety of characters, the search space is increased. For example an 8 character password, which allows upper and lower case letters, plus numerals, has a much larger search space. Each character has 62 possible values (a-z, A-Z and 0-9), and there are 8 characters so the total number of possible passwords is 62 to the power 8, or 218,340,105,584,896.
Clearly this is a much larger search space, equivalent to a 48 bit key, but even this is not anywhere near as large as the number of possible 128 bit keys. The lesson here is that to obtain true 128 bit security you need a search space which is comparable to the number of possible keys. This requires a long password (about 21 characters, if they are selected randomly).
It should be no surprise to learn that very few security systems insist on a password of 21 totally random characters - many are happy to accept a 6 character password. That is fine in some cases, where the consequences of a hacker breaking the encryption are not serious. If 128 bit security is really needed, then it is important to ensure that your rules for acceptable passwords create a large enough search space. If not, you are simply deluding yourself about the security of your encryption.
Dictionary Attacks
Even, if you allow people to use long, random passwords, some users of a system are likely to choose easily guessable passwords. These include everyday words, names of people, places or pets, dates of birthdays etc.
If an attacker creates a list of potential passwords, he can then apply the key derivation algorithm to every item in the list to create a corresponding list of the binary keys associated with each of the passwords. The first few entries of the list might look like this:
aardvark 88571E5D5E13A4A6F82CEA7802F6255 aaron 449A36B6689D841D7D27F31B4B7CC73A abacus 13F27B1072BBF7719DD267B83FF91C etc...
A small list of, for instance 10,000 words could contain most commonly used English words. A larger list, eg a million entries, could contain virtually every word, name or date, as well as common variations. Anything other than a genuine random password could well be included on such a list, and would therefore be vulnerable to a dictionary attack.
Now, whenever the attacker intercepts an encrypted message, he just needs to try decrypting it with every key in the dictionary, and if the user was foolish enough to use a guessable password, the message will be decoded. With just a million keys to try, the task is equivalent to breaking a 20 bit cipher - far, far easier than breaking a 128 bit algorithm such as Rijndael.
Notice that this attack is subtly different from the search space problem. If the search space is small, then an attacker can create a dictionary which will guarantee that he can decrypt every message. In a dictionary attack, assuming the search space is large, the dictionary only includes the most likely keys. There is no guarantee of success, but the attack can be repeated against every single intercepted message in the hope of finding one from a user with a weak password.
The best defence against a dictionary attack is to choose a non-guessable password, preferably a long password which is not based on any real word or name. Better still is a passphrase, consisting of several words. However, within an organisation, it is necessary to ensure that everyone uses a secure password, because if an attacker is able to intercept messages and perform a dictionary attack, any vulnerable messages will be easily cracked.
The technique of salting, discussed in the next section, goes some way towards making life a little more difficult for the would-be attacker, but it does not remove the necessity to use secure passwords.
See also
- Symmetric encryption
- Applications of symmetric encryption
- Symmetric block ciphers
- Symmetric encryption algorithms
- Cryptographic modes
- Block padding methods
- Attacks on symmetric ciphers
- Cryptographic hashes
- Strong hashing functions
- Applications of hashes
- Common hash algorithms
- Attacks on hash algorithms
- Iterative hashes
- Message authentication codes
- Common MAC algorithms
- HMAC algorithm
- Key derivation
- Key derivation using hash functions
- Salting
- Key derivation using random number generators
- Key derivation standards
Sign up to the Creative Coding Newletter
Join my newsletter to receive occasional emails when new content is added, using the form below:
Popular tags
555 timer abstract data type abstraction addition algorithm and gate array ascii ascii85 base32 base64 battery binary binary encoding binary search bit block cipher block padding byte canvas colour coming soon computer music condition cryptographic attacks cryptography decomposition decryption deduplication dictionary attack encryption file server flash memory hard drive hashing hexadecimal hmac html image insertion sort ip address key derivation lamp linear search list mac mac address mesh network message authentication code music nand gate network storage none nor gate not gate op-amp or gate pixel private key python quantisation queue raid ram relational operator resources rgb rom search sort sound synthesis ssd star network supercollider svg switch symmetric encryption truth table turtle graphics yenc