How Are Passwords Securely Stored?

Password Storage

When you register an account on a website, a SQL database stores the password. These passwords aren’t securely stored too often, and journalists have caught some big companies storing passwords in plain text. Some sites are marginally better because they encrypt passwords before storing them in a database. This doesn’t provide much additional security if the decryption key and database are on the same server.

Anyone worth their salt (pun intended) will use more robust security methods, of which there are two: hashing and hashing with a salt. Hashing involves using a function that maps the password to a fixed-size array called the hash digest. Such a function has the property of being one-way. This means it’s nearly impossible to invert the function to reach the initial input space from the digest. This is stronger than the NP-hard complexity class because the problem is practically impossible in the average case, not just the worst case. The weakness of password hashing is that attackers can use potentially an index of keys and their associated digests, called a rainbow table, to find a given password. An attacker can take the digest of a password they want to crack and search for it in a rainbow table and they can gain access if they find a match.

Hashing and salting is a more secure version of hashing that is immune to rainbow tables. A salt is a random string that the algorithm appends or prepends password before hashing. Adding a unique salt to every password makes it extremely difficult to find passwords using a rainbow table. It also means that two users using the same password have unique hashes when people use salts.

Algorithms

Many hashing algorithms are woefully inadequate for use in password hashing because they’re too fast. This includes algorithms like MD5 and SHA-1, which haven’t been widely used for password hashing for more than 15 years. There are a few hashing algorithms in wide use for password hashing

SHA-2

These produce considerably longer hash digests than SHA-1 (224-512 bits vs 160 bits). The SHA-2 (256) algorithm works like this:

  1. It takes an input string $ M $ (your password) and partitions it into 512 bit blocks
  2. It pads the input string with information. This includes
    • An appended ‘1’ bit on the input with a certain number of zeroes $ k \in \mathbb{Z}^{+} $, where k is the solution to $ \ell + 1 + k \equiv 448 :\text{mod}: 512 $
    • The length of the input string $ \ell $, with $ 0 \le \ell \le 2^{64} $
  3. It initializes a hash $ H^{(0)} $ with eight 32-bit hexadecimal words that are the first 32 bits of the fractional portion of the square roots of the first eight prime numbers
  4. It processes the blocks a schedule array for each individual block, which has 64 entries that contain 32-bit words (variables)
  5. It processes the variables with the compression function shown below:

sha-2_compression

  1. It determines the hash digest by adding the variables $ A $ to $ H $ (mod 32) (the compressed chunks) to the corresponding initial hash value, iterating this process over all message blocks. Successively appending each block in order after the completion of this process gives the final digest.

Most linux distributions have used SHA-512 for password hashing since at least 2007.

Bcrypt

Bcrypt is a slow hashing algorithm. This means it can weaken the efficacy of brute-force attacks by increasing the iteration count of the function. Unlike SHA-2 and SHA-3, bcrypt adds a salt to the hashed password. This is a two-part algorithm that hashes, then salts a given password. It works as follows:

  1. It starts with a setup function takes three inputs
    • The password
    • salt (random 128-bit string)
    • cost (a number between 4 and 31)
  2. It outputs an array of 18 subkeys associated with 4 substitution boxes $ S_1 \ldots S_4 $ that the hexadecimal digits of $ \pi $ initialize
  3. It mixes the password in the subkey area with the exclusive or (XOR) function.
  4. It splits the salt into two parts (lower and upper) and initializes a 64-bit zeroed-out block. It mixes this block with the lower and upper salts for 8 iterations with the XOR function, alternating each time.
  5. It uses the state structure (the subkey array and substitution boxes) to encrypt the block with salt and are stores the results in the subkey array.
  6. It mixes the now encrypted block into the internal substitution boxes of the state.

This part is the key derivation function and key expansion part of bcrypt. The former derives subkeys from a primary key (our password) and the latter hardens the password against brute force attacks by increasing entropy (making it more complex). Now for salting:

  1. The blowfish cipher encrypts the string “OrpheanBeholderScryDoubt” 64 times with the state obtained from the first part. This returns the digest, which is the 128-bit salt, cost, and resulting cipher text from the now finished encryption concatenated together. It takes the form
$2b$[cost]$[salt base64 encoded to 22 characters][192-bit hash base64 encoded to 31 characters]

This is a rough visual diagram of the process elucidated above:

bcrypt

Many operating systems use bcrypt for password hashing, notably OpenBSD. Unfortunately, while more robust in some respect than PBKDF2, it lacks sufficient memory-hardness. A 2014 paper detailed an attack on bcrypt using ARM based FPGAs.

PBKDF2

PBKDF2 is a key derivation function that accepts 5 variables:

  • A password
  • Desired number of iterations $ c $
  • A salt
  • A pseudorandom function (such as an HMAC, which is a piece of information verifying the authenticity of a message coupled with a hash function)
  • Output key bit length (klen)

The key derivation function subsequently obtains a derived key, expressed as

$$ \text{DK} = T_1 \Vert T_2 \Vert \ldots \Vert T_{\text{klen/hlen}} $$

where hlen is the output length of the pseudorandom function, the bars are concatenation symbols, and every $ T_i $ is a block of length hlen. The value of hlen is found by chaining together the outputs of the pseudorandom functions c times.

pbkdf2

Like bcrypt, PBKDF2 is a slow algorithm. One concern of using is that while it’s taxing on a CPU, it’s relatively easy for a modern GPU, which can take advantage of parallelization.

Argon2

Argon2 is a family of key derivation functions. For the sake of brevity, we’ll only look at Argon2id, which is the mainline variant. It takes a balanced approach by protecting against both side-channel attacks and time-memory tradeoff attacks fairly well. Argon2i is nearly impervious to side-channel attacks, but it’s more vulnerable to time-memory tradeoff attacks; Argon2d is the inverse of Argon2i. Argon2id runs Argon2i over the first half of the initial iteration over the memory, then as Argon2d for the remaining duration. Argon2 function accepts the following input parameters:

  • The amount of memory to be used $ m $ (in kilobytes)
  • Salt length $ s $ (128-bit is usually the default)
  • Number of iterations $ c $
  • Number of threads to be utilized $ t $
  • Password $ p $
  • Key length $ L $

It consists of two main functions:

  • An internal compression function $ G $ that takes two 1 KB inputs and outputs a single 1 KB value
  • The BLAKE2b hashing function $ H $

Argon2 works as follows:

  1. It initializes a 512 bit block called $ H_0 $ by concatenating the inputs together
  2. It encrypts the block with encrypted with BLAKE2b.
  3. It allocates memory into 1 KB blocks; the number of blocks used is denoted by

$$ m’ = 4t\lfloor \frac{m}{4t} \rfloor $$

  1. It allocates the blocks in an array $ B[i][j] $ with the number of rows being equal to utilized threads $ t $ and columns being the number of blocks divided by the number of threads
  2. It computes the columns of each row. Argon2 repeats this process $ c $ times
  3. It computes the first two blocks as follows, which yield two separate 1 KB digests:

$$ B[i][0] = H\left( H_0 , \Vert , 0 , \Vert , i ,, 1024\right) $$

$$ B[i][1] = H\left( H_0 , \Vert , 1 , \Vert , i ,, 1024\right) $$

For the remaining columns of each row,

$$ B[i][j] = G \left(B[i][j-1], B[l][z]\right) $$

This is where Argon2 trifurcates: the choice of indices $ l $ and $ z $ for each $ i $ and $ j $ depends on which variant of Argon2 you choose.

Argon2 partitions the matrix into 4 slices to enable parallel block computation. Over successive iterations, it identically repeats steps, but the new blocks are XORed with the old value.

Argon2 computes the final block $ C $ by XORing the last column in each row. $ C $ is then used to produce the final digest:

$$ T = H \left(C, T\right) $$

Pertinent Features

When deciding on a hashing (and salting) algorithm, there are some security features that you must consider. These include

  • Preimage resistance
  • Mitigation against pre-computation attacks (i.e rainbow tables)
  • Resistance to birthday attacks
  • Resilience against length extension attacks

Let’s parse these now. Preimage resistance is the concept that given the entire possible input space for a given hashing function (its preimage), it should be computationally improbable to obtain any particular input for a given output. There is a related concept called second-preimage resistance:

Given some input message $ m $, it should be extremely difficult to find some other input $ m’ $ such that $ h(m) = h(m’) $

where $ h $ is some cryptographic hash function. These hashing algorithms, as well as old hashing algorithms such as MD5, are all preimage resistant.

Algorithms that use random salts can thwart pre-computation attacks. This includes bcrypt, PBKDF2, and Argon2.

Birthday attacks and other similar attacks leveraging hash collision leverage the pigeonhole principal: If we have $ m $ objects and $ n $ receptacles, where $ m \gt n $, then at least one receptacle will have more than one object. If a cryptographic hash function has a preimage that is larger than it’s image (the entire possible output space for a function), it’s vulnerable to a collision attack. This is because two or more passwords would have to map to one hash digest. The birthday attack involves trying to find some collision

$$ h(m) = h(m’), ,m \ne m’ $$

There are perfect hash functions (one that is injective), but they’re practically impossible to create in the context of password hashing algorithms because the input space (the set of all keys) isn’t a fixed size. Thus, such a collision is at least theoretically possible in most instances, though not always feasible. The attacker will randomly or pseudorandomly generate two inputs ($ m $ and $ m’ $) and evaluate them for $ h $.

The nature of the birthday problem makes this method plausible for finding a collision. Simply stated, the birthday problem is the idea that even though the probability of some person $ A $ in a group of 23 people sharing a birthday with one of them is approximately 6%, the probability of any 2 people in the group sharing a birthday is slightly over 50% and reaches 99.9% with a group size of 70. There are already good write-ups, so I won’t elaborate further.

Going back to the birthday attack, it’s now easy to see that if we take random pairs rather than attempting to match some input to a fixed input, the process of finding a collision can be counter-intuitively fast. SHA-1 and MD5 have active exploits that take advantage of hash collision because of their relatively small fixed digest sizes.

Length extension attacks are attacks where an attacker takes the digest of a given message $ m $ and its length to ascertain $ h(m \Vert m’) $ for some attacker determined message $ m’ $. This attack doesn’t require an attacker to know the value of $ m $. Any hash function using Merkle–Damgård construction is susceptible to this attack. This includes MD5, SHA-1, and SHA-2 (excluding truncated versions like SHA-256, SHA-384, and SHA-512). Below is a diagram of a Merkle–Damgård construction

Merkle–Damgård

Recommendations

Argon2 is the gold standard of hashing and salting algorithms. If possible, it’s best to implement Argon2id. Bcrypt still works well, but is less than optimal relative to Argon2. PBKDF2 has the advantage of easy implementation, but is far easier to break on a GPU than something such as bcrypt. Regardless of what you use, implement it well. Any of these three are reasonably secure if a competent system administrator sets them up correctly.