If you are reading this, chances are that you are wise enough not to use “123456” or “dragon” as your password. Even folks who are somewhat paranoid about online security and believe they are using “strong” passwords still have room for improvement, given a better understanding of the science behind password cracking. This post scratches the surface of state-of-the-art research on choosing a secure password. The idea is to document the key lessons and provide pointers to academic journals for a deeper dive.
Our assumption for the “threat model”: the attacker has obtained a list of hashed passwords and can launch offline attacks at their leisure in an attempt to crack them. Also, most users tend to reuse passwords across different accounts, which helps an adversary maximize the possible damage.
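To make the threat model concrete, here is a minimal sketch of an offline attack. It assumes the leaked list was hashed with unsalted SHA-256, which is an assumption made purely for illustration; the leaked list and the word list are made up as well.

```python
import hashlib

def sha256_hex(password: str) -> str:
    """Hash a password with unsalted SHA-256 (assumed scheme, for illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def offline_attack(leaked_hashes, candidates):
    """Return {hash: plaintext} for every leaked hash matched by a candidate guess."""
    targets = set(leaked_hashes)
    cracked = {}
    for guess in candidates:
        digest = sha256_hex(guess)
        if digest in targets:
            cracked[digest] = guess
    return cracked

# Hypothetical leaked list: two weak passwords and one random one.
leaked = [sha256_hex(p) for p in ["123456", "dragon", "cmMs&59my@6Ue*T"]]
# Hypothetical attacker word list.
wordlist = ["password", "123456", "qwerty", "dragon"]

cracked = offline_attack(leaked, wordlist)
print(sorted(cracked.values()))
```

Nothing here touches the website itself: once the hashes are in hand, the attacker only needs CPU time and a good list of guesses. The two weak passwords fall immediately, while the random one survives this particular word list.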
How strong is your password?
A strong password is not easily “guessable”. A password’s strength can therefore be quantified by the time the most efficient password-cracking algorithms need to reveal its plaintext.
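As a rough baseline for "time to guess", the sketch below computes how long an exhaustive search would take. The guess rate is a hypothetical figure chosen for illustration, and exhaustive search is only an upper bound: smarter attacks usually need far fewer guesses.

```python
def brute_force_seconds(alphabet_size: int, length: int, guesses_per_second: float) -> float:
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length / guesses_per_second

RATE = 1e10  # hypothetical attacker speed, in guesses per second

# (alphabet size, password length): digits only, lowercase only, all printable ASCII.
for alphabet, length in [(10, 6), (26, 8), (94, 8), (94, 15)]:
    seconds = brute_force_seconds(alphabet, length, RATE)
    print(f"{alphabet} symbols, length {length}: {seconds:.3g} s")
```

The search space grows exponentially with length, which is why length matters so much, but only if the attacker really has to search all of it.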
Here are some findings from the security community that you can use for a quick self-assessment. We discuss each of them in more detail below.
- Any dictionary-based password is inherently insecure.
- As a rule of thumb, a longer password is more secure. That said, a user with the best of intentions can still choose a long, non-dictionary password that is easy to break.
- Passwords that follow a “predictable” structure are less secure.
- Many websites enforce a password expiration policy. When forced to choose a new password, most users derive it from the old one with a few simple tweaks. (Almost everyone does this, right?)
So how do the bad guys exploit the weakness?
In an ideal world, every user chooses a perfectly random password, perhaps something like “cmMs&59my@6Ue*T” or “3%RaKYn5zMBB&9P”. Randomness is indeed our friend. There is just one small flaw: these passwords are incredibly hard to remember.
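For what it is worth, generating such a password is the easy part. A minimal sketch using Python's standard `secrets` module follows; the 15-character length and the alphabet are choices made here for illustration, not a recommendation from the research.

```python
import secrets
import string

# Letters, digits and punctuation: the 94 printable ASCII symbols (excluding space).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 15) -> str:
    """Draw each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(random_password())
```

Because every character is drawn uniformly and independently, there is no structure for a cracking tool to exploit, which is exactly what makes the result so hard for a human to memorize.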
Not surprisingly, dictionary-based passwords, and therefore dictionary-based attacks, are the most common. Any attacker will start by guessing from a set of dictionary words. Even lengthy non-dictionary passwords usually have a predictable structure, with a non-uniform (in simple words: predictable) distribution of characters. Thus the most efficient cracking tools rely on heuristics rather than exploring every possible password (brute force).
Instead, they use clever techniques such as the following:
Markov Models
“For example, in English, ‘w’ is more likely to be followed by ‘h’ than by another ‘w’. The concept can be extended to character sequences of arbitrary length, called n-grams; the 3-gram (or tri-gram) ‘thr’ is far more likely to be followed by a vowel than by a consonant. By training the Markov chain on known password lists, dictionaries, or both, these distributions can be estimated and used to generate a list of possible passwords that is significantly more effective than random guessing or the publicly available programs like John the Ripper.”
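The idea in the quote can be sketched with a tiny character-bigram model. This is a simplified illustration, not the algorithm from the cited research: the training list is made up, the boundary markers and the smoothing constant are assumptions, and real tools train higher-order n-grams on very large lists.

```python
from collections import Counter, defaultdict
import math

START, END = "^", "$"  # hypothetical boundary markers, assumed absent from passwords

def train_bigrams(passwords):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for pw in passwords:
        chars = [START, *pw, END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts

def log_probability(counts, password, alphabet_size=95):
    """Log-probability of a password under the bigram model, with add-one smoothing.

    alphabet_size is an assumed number of possible next symbols.
    """
    chars = [START, *password, END]
    total = 0.0
    for prev, nxt in zip(chars, chars[1:]):
        seen = counts.get(prev, Counter())
        total += math.log((seen[nxt] + 1) / (sum(seen.values()) + alphabet_size))
    return total

# Made-up training list standing in for a leaked password corpus.
training = ["whisper", "whatever", "where123", "three33", "throne1"]
model = train_bigrams(training)

for candidate in ["when", "wwxq"]:
    print(candidate, round(log_probability(model, candidate), 2))
```

An attacker would enumerate candidates in decreasing order of probability, so a "natural-looking" string such as `when` gets tried long before an unlikely one such as `wwxq`, even though both have the same length.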
Probabilistic Context Free Grammars
“This algorithm is based on the observation that passwords tend to have predictable ‘structures’. The structure of a password is defined as the way in which the password can be broken into strings (or tokens) of letters, digits, and symbols (e.g. S1U1L6D2 represents a special character followed by an uppercase letter, then 6 lowercase letters, and ending with two digits). PCFG attacks are highly effective because when a password composition policy requires that a digit or symbol be included in a password, users are far more likely to append the required character to an existing password rather than place it in the middle. Similarly, uppercase letters are predominantly used at the beginning of an alphabetic string. By tabulating the number of occurrences of each distinct structure found in a training set, an attacker can gain valuable insight into the likely distribution of structures of the passwords.”
Thus, even with strict password policies enforced by websites, PCFG-based attacks can be highly effective because they exploit the biases users have when choosing passwords.
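The first step of such an attack, extracting and tabulating structures, can be sketched in a few lines. The class letters follow the notation in the quote (S, U, L, D); the function names and the training list are hypothetical, and a full PCFG attack would go on to fill each structure with likely tokens.

```python
from collections import Counter
from itertools import groupby

def char_class(ch: str) -> str:
    """Map a character to the class letters used in the quoted notation."""
    if ch.isupper():
        return "U"
    if ch.islower():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"

def structure(password: str) -> str:
    """Collapse runs of the same class, e.g. '$Monkeys12' -> 'S1U1L6D2'."""
    return "".join(
        f"{cls}{len(list(run))}" for cls, run in groupby(password, key=char_class)
    )

# Made-up training set standing in for a leaked password list.
training = ["Password1", "Dragon12", "Monkey99", "letmein!", "Summer2019"]
structure_counts = Counter(structure(pw) for pw in training)

for struct, count in structure_counts.most_common():
    print(struct, count)
```

Even in this toy list, "one capital, some lowercase letters, digits at the end" dominates, which is the kind of bias the attacker guesses first.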
Transformation based Algorithmic Attacks
When targeted at an individual user, these attacks make small modifications to that user’s previously leaked passwords. They exploit the incremental way people update their passwords, which makes the attacks far more efficient.
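A minimal sketch of the idea: start from one leaked password and apply a handful of small edits. The four rules below are hypothetical examples chosen for illustration; they are not the transformation set used in the research.

```python
import re

def tweak_candidates(old_password: str) -> list:
    """Return guesses derived from an old password by small, hypothetical edit rules."""
    candidates = []

    # Rule 1: bump a trailing number (Summer2019 -> Summer2020).
    match = re.search(r"(\d+)$", old_password)
    if match:
        digits = match.group(1)
        bumped = str(int(digits) + 1).zfill(len(digits))
        candidates.append(old_password[: match.start()] + bumped)

    # Rule 2: append a common suffix.
    for suffix in ["1", "!", "123"]:
        candidates.append(old_password + suffix)

    # Rule 3: flip the case of the first character.
    candidates.append(old_password[:1].swapcase() + old_password[1:])

    # Rule 4: common character substitutions.
    leet = str.maketrans({"a": "@", "o": "0", "s": "$"})
    candidates.append(old_password.translate(leet))

    # Drop duplicates and the unchanged password, keeping rule order.
    return [c for c in dict.fromkeys(candidates) if c != old_password]

print(tweak_candidates("Summer2019"))
print(tweak_candidates("dragon"))
```

A few rules like these yield only a handful of guesses per leaked password, so if your "new" password is the old one plus a tweak, it falls almost immediately.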
Choosing a strong password has quite a few associated nuances. It is simply not a choice between a shorter and a longer passphrase. It takes a little more thought and effort to ensure our passwords are indeed strong and resistant to offline attacks.