密码安全：什么是加盐和加胡椒？

1. Introduction

In this tutorial, we’ll examine the Salt and Pepper concept used in password hashing. We’ll learn what they are, how to use them, and their benefits.

As always, with secure password storage, we shouldn’t attempt to implement our algorithm. Standard algorithms exist, are heavily tested, and support everything we need already. These will already have support for Salt and Pepper built into them as appropriate, so we don’t ever need to implement these ourselves on top of what’s already provided.

2. Password Hashing

When storing passwords, there are two important requirements that we need to take into mind:

We can prove that a given password matches
An attacker who compromises the stored passwords can’t determine what the passwords are

In order to achieve this, we store the passwords as cryptographic hashes. These are mathematical functions such that the same input will always provide the same output, but it’s impossible to determine the input based solely on the output.

As long as it’s impossible to reverse the hash – that is, determine the original password based only on the hashed value – then an attacker can’t determine the original password if they’ve only got the hashes. As such, to determine if a given password is correct, we need to generate the same hash again and then compare the hashes – if they match, then the original inputs must have been the same.

For example, if we use the (insecure) MD5 hashing algorithm, the password “Baeldung” will be stored as “d58b26d9f7600672080ebf3851115e45“. Every time we hash this password, we’ll get the same output, but if the input changes slightly, we get a totally different output.

2.1. Brute Force Attacks

If we can’t reverse the hashing algorithm, an attacker who has compromised the list of hashed passwords only has one way to determine the original passwords—brute force. This involves iterating through every possible password, generating the hash for it, and comparing it to the target hash. We’ll determine the original password when we find the corresponding hash.

For example, if we know that we’re using an MD5 hash and that the passwords are 8 characters, then we might brute force as follows:

“aaaaaaaa” = “3dbe00a167653a1aaee01d93e77e730e” = No match
“aaaaaaab” = “2125ea8b81bc0ab7a16e47ca82c06735“= No match
…..
“Baeldunf” = “3d0e01e3f9d9562e48fa87ec1866eb2a” = No match
“Baeldung” = “d58b26d9f7600672080ebf3851115e45” = Match

We’ll work out the original password once we’ve generated the same hash.

The time that it takes to brute force a password depends on many factors, including the hashing algorithm and the complexity of the password. For example, Hive Systems recently published Passwords Table suggests that brute forcing a password could take between seconds and billions of years, depending on the length of the password and the set of possible characters used.

2.2. Rainbow Tables

The limiting factor in how quickly a password can be brute-forced is generating the hashes. Password hashing algorithms are deliberately designed to be expensive to run, so brute-force attacks are a slow process.

However, we can mitigate this by generating all the possible hashes ahead of time. If we know every possible password – based on the hashing algorithm, lengths, and possible characters – then we can calculate every possible hash ahead of time. This is known as a Rainbow Table.

These tables are large, but manageable on modern computers. For example, a table of every possible 8-character alphanumeric password will contain 218,340,105,584,896 rows. Whilst this is a very large number, it’s not out of limits for a large enough database to manage this number of rows.

Once we’ve done this, we can reverse the hashing algorithm simply by looking up the desired hash in this table, and we’ll have the original password. This means that we can crack even the most complicated passwords in seconds, instead of years.

If we can attack a single password with these techniques, we can do exactly the same to attack multiple passwords. Given that the same input will produce the same hash, we know that if we have multiple occurrences of the same hash, then they must correspond to the same input password. We also know that if we’ve determined the input for one of those, then we’ve actually determined it for all of them.

3. Salting Passwords

The main factor making rainbow tables an effective attack is that the same input always produces the same hash. To protect against this, we need some way to cause the same password to produce different hashes while still allowing us to compare hashes when verifying passwords.

Salting is a technique that can be used for this. This involves generating an additional, random value – known as a Salt – to combine with the input password before hashing it. Every stored password will have a different Salt value. This will then mean that the exact same passwords are assigned different salts, and therefore generate different hashes:

algorithm HashPassword(password):
    salt <- generateRandomString()
    result <- hash(password + salt)

    return (result, salt)

We’ll then need to store both the hashed password and the salt in our database. For example:

Password = “Baeldung“, Salt = “123“, Hash = “f78bc83644f9ae7b45e9ad936f1c70f7“
Password = “Baeldung“, Salt = “987“, Hash = “586eccfc7ca25e77823556491021a6d7“

Here, we can clearly see that the exact same password with different salts produces totally different hashes.

We can then verify the password by combining the provided password with the stored salt, generating a hash for this, and comparing it with the stored hash:

algorithm VerifyPasword(password, storedSalt, storedHash):
    newHash <- hash(password + storedSalt)
    return newHash = storedHash

Since every stored password will have a different salt, this means that the same password will have a different hash. In particular, this also means that rainbow tables are significantly less effective. We’d need to have a table with every possible password + salt combination. With a large enough salt, the rainbow table becomes impossibly large – a 64-bit salt means that our rainbow table would need 1.845 * 10^1^9 rows for every single password, meaning that a rainbow table of every possible 8-character alphanumeric password now needs 4.028* 10^3^3 rows.

4. Peppering Passwords

Salting passwords is effective against rainbow table attacks, but it still doesn’t stop attackers from brute-forcing our passwords. If an attacker compromises our password database, they’ll have access to both the hashes and salts, so they can still brute-force attack exactly as before.

Peppering passwords is the same principle, except that we use a single shared secret to combine with the password. This means that we don’t need to store the value alongside the passwords and instead can keep it secured separately – e.g. only within the application source code.

Since we’d be combining every password with the same pepper value, the same passwords will still result in the same hashes. As such, it’s recommended to salt the passwords as well to help with that issue.

const pepper <- "someRandomValue"

algorithm HashPassword(password):
    salt <- generateRandomString()
    result <- hash(password + pepper + salt)
    return (result, salt)

Combining the password with the pepper value means that even if the attacker has the hash and the salt, they still won’t have enough information to be able to easily get the original password back out. As before, the larger the pepper value, the more values the attacker will need to attempt before they can successfully get the same hash, and even then, they’d need some way to determine which parts of the input were the password and which parts were the pepper.

For example, if the attacker knew the hash was “a7e27d2fc90a1ddd5a102f2bcb5b9d4a” and that the salt value was “123“, they may eventually determine that the value that was hashed was “Baeldung987123“. They can then remove the salt from the end, to be left with “Baeldung987“. But, at this point, they don’t know which portions of that string are the password and which are the pepper.

If the pepper used was large enough—e.g., a 64-bit value—then this would introduce enough complexity into the password that brute-forcing it would take an unreasonable amount of time. This, in turn, means that the attacker may never successfully determine the input that generated the hash that they’ve got.

However, because of how they work, pepper values are much more sensitive. It’s impossible to change them without forcing every user to reset their password, and if they’re compromised, then the attacker knows the value that was used for every password in the database.

Peppering passwords is also a less common practice, and so the hashing algorithm used might not support it. For example, Argon2 does have full support for both salting and peppering passwords, whereas algorithms such as PBKDF2, BCrypt and SCrypt don’t. This means that if we need to use one of those, then we can’t also use pepper in our hashing.

5. Conclusion

In this article, we’ve examined salting and peppering passwords, how they work, and the benefits of using them. The next time you need to securely store passwords, why not investigate whether these will work for you?

Persistence

REST

Security