28

Is it more secure in practice to use the output of multiple hash algorithms instead of a single one (assuming that the output size is the same)?

By secure, in this context, I mean protecting against both collisions and function reversibility.

One particular example would be using 128 bits from SHA-512 plus 128 bits from MD5 instead of using 256 bits from SHA-512.

My intuition is that, theoretically, this would weaken the function, because there are known attacks against MD5 which, effectively, would reduce the output size of the combined function. However, in practice, is it more difficult to find a collision/inversion for both algorithms at the same time - because it'll involve additional theoretical work as opposed to using existing (or upcoming) published ones?

EDIT:

To clarify, I'm simply asking about the basic properties of irreversibility and collision resistance - not about any particular use of hash functions, such as key derivation from passwords (there's excellent coverage about that in other questions) or using the output as unique identifiers.

Another way to phrase this would be "is a hash function defined as the concatenation of two other hash functions better than those?". Also, I'm (somewhat) aware that, information theoretically, it can't possibly be.

watchowl
  • 5,458
  • 2
  • 23
  • 23
  • "Is using multiple hash algorithms more secure?" you ask. Use them for what purpose?, I ask. – user Mar 16 '15 at 14:03
  • 1
    @MichaelKjörling for the purposes of hash function reversibility and collision resistance only. See my edit. – watchowl Mar 16 '15 at 14:22
  • 2
    It may be noted that the term "concatenate", as applied to hash functions, has two meanings: the concatenation of H1(x) and H2(x) may be defined as either H1(x)+H2(x), or H2(H1(x)). The former is a bad idea, since it weakens what is typically the weakest aspect of a password hash to match the strength of the weaker constituent. The latter, however, may be good. – supercat Mar 16 '15 at 16:52
  • 6
    @supercat H2(H1(x)) is called composition, not concatenation. I've never heard it referred to as concatenation either. – Thomas Mar 17 '15 at 16:12
  • @Thomas: I may be misremembering, or incorrectly associating the term with use of functions in some other context (e.g. concatenating filters can refer to feeding something through one filter and then through another). Sorry if I was mistaken. In any case, composition of hash functions also has some pros and cons. – supercat Mar 17 '15 at 16:30
  • A related question (by me) on our sister site Cryptography Stack Exchange: Guarding against cryptanalytic breakthroughs: combining multiple hash functions – Paŭlo Ebermann Mar 17 '15 at 20:52
  • Rather than combining the two hashes in a manner that weakens either of them, how about using the MD5 hash as salt for the SHA-512 hash? That way the combination is at least as strong as the stronger of the two. –  Mar 17 '15 at 16:07

3 Answers3

27

You don't mention what you're using the hashes for, but it appears likely that your intent is to use if for password verification.

So, the first issue to clear up is the concern that a given algorithm might be found to be susceptible to collisions. Collisions are not a threat to password hashes, and even if an algorithm were susceptible to collisions, that doesn't weaken it for the purpose of password hashing. For password hashing, we're worried about pre-image attacks instead...Being able to find the input that was used to create a given hash.

Today, the primary mechanism to attack password hashes is a dictionary-based brute-force attack. You hashes as many possible inputs you can, as quickly as you can, to determine which input results in the hash you're trying to match. Thus, the most desirable feature of password hashing function is slowness. You want to ensure that it takes as long as possible for an attacker to test the inputs required to find the correct one.

So, now that we understand what we're looking for, let's take a look at your proposal.

If you hash the password twice, with SHA-512 and MD-5, and truncate to 128 bits of each, and concatenate them together, for all intents and purposes, you're giving the attacker two hashes to work with. In many cases, the security of a hash isn't terribly reduced by truncation, as long as the resulting length is reasonable, which 128 bits likely is.

This means that you're taking two fast hash functions (SHA-512 and MD-5 are both far too fast to be effective password hash algorithms) and offering the attacker the opportunity to choose which one they would like to use to crack your hash. In this case, they're certainly choose MD-5, and feed inputs into it, truncate the hash to 128 bits, and match it to the MD-5 portion of your hash, and as soon as they have the match, there's near 100% certainty that they also have the value they need to compute the SHA-512 portion of the hash on a single try.

So, not only does this specific construction not improve security, but it actually rather significantly cripples it instead. While there are other potential constructions that could prove to increase security, it's not a simple matter to get it right, so if you really want a reasonable assurance of security, use a proven construction like bcrypt with a reasonable work factor.

StackzOfZtuff
  • 18,093
  • 1
  • 52
  • 86
Xander
  • 35,796
  • 27
  • 116
  • 144
  • You may want to see the OP's edit to the question in response to my comment on it. – user Mar 16 '15 at 14:24
  • @MichaelKjörling indeed - I don't have any particular intent, and I was hoping to treat it as simply another hash function (I'm aware about proper use of KDFs, and slow hashes). I think you've offered a extremely good point concerning reversibility, though - it's as strong as the weakest of the two original hash functions, within the usual reasonable assumption that the algorithm is public – watchowl Mar 16 '15 at 14:32
  • The claim that a 256-bit hash is no stronger than its 128-bit truncation is interesting, and as it turns out, central to the rest of the answer. Because if finding a preimage of the 128-bit truncation doesn't also find the right preimage, the one that matches the 256-bit version, then finding a preimage of the 128-bit truncation also won't find the input that also satisfies the other hash. – Ben Voigt Mar 16 '15 at 14:57
  • Furthermore, I consider it highly likely that a "constructed" preimage is not going to be the original password... and thus there's no reason to believe that it will pass the other hash. – Ben Voigt Mar 16 '15 at 14:59
  • 1
    @BenVoigt Free free to read the literature on hash truncation. There's quite a bit of it out there, and it's generally accepted that reasonable truncation is safe in most cases. Here's a crypto.SE answer to get you started. – Xander Mar 16 '15 at 15:11
  • @Xander: That particular link only discusses collision resistance, with a passing mention of preimage, but no detail on how the preimage problem is affected by truncation. Anyway, the important question is whether preimage attacks on a hash tend to find the original input, or construct an input with a particular structure (and thus will never reveal an original input that didn't have that structure) – Ben Voigt Mar 16 '15 at 15:16
  • Basically, there's good reason to doubt your claim that "as soon as they have the match, there's near 100% certainty that they also have the value they need to compute the SHA-512 portion of the hash on a single try" – Ben Voigt Mar 16 '15 at 15:17
  • @BenVoigt I think Xander is assuming that it's a user-provided password, and it's a ascii attack - in that particular case, the first found pre-image will almost certainly be the password, because the password is, statistically, the (ascii) shortest. Assuming that in input is uniformly distributed, on the other hand, invalidates this result - is that what you meant? – watchowl Mar 16 '15 at 15:21
  • 1
    @BenVoigt Again, read the supporting materials, and perhaps a bit on how attacks against hash functions work. Specifically, you'll find out that attacks that cause collisions give the attacker the greatest amount of leeway. This is why hash functions that are broken in regards to collisions, are not necessarily broken for pre-image resistance. So, if truncation can safely protection against collisions, it cannot be less safe against pre-image attacks. – Xander Mar 16 '15 at 15:25
  • @goncalopp: Dictionary search is certainly an ASCII attack, and I agree the first found match is likely to be the original passphrase. But we are discussing attacks which avoid the brute-force dictionary approach, by using a computed preimage. This preimage is unlikely to be ASCII, and definitely has structure imposed by the preimage construction process. So it's unable to recover original passwords that lack that structure. Are there MD5 and/or SHA preimage attacks that explicitly return the simplest preimage (for example, the shortest composed of only ASCII characters)? – Ben Voigt Mar 16 '15 at 16:20
  • 1
    @Xander: Sure, but finding a collision on one hash in no way implies a collision on the other. – Ben Voigt Mar 16 '15 at 16:22
  • The danger with concatenating two 128-bit hashes is that while testing random 136-bit strings looking for one whose hash matched the given 128 bits of the first hash would likely yield one which would not match the given 128 bits of the second hash, a plausible password that was found to satisfy the first hash would have a significant likelihood of matching the second. The concatenation would not significantly increase exposure to pre-image attacks, but would reduce the strength against brute-force attacks to that of the weaker algorithm. – supercat Mar 16 '15 at 16:46
  • "I was hoping to treat it as simply another hash function" -- in which case the fact that it's too fast to use on its own for keyword hashing, just like MD5 or SHA-1 or SHA-256 all are, is already accounted for by general consideration of hash functions for that purpose. Once you use many repeats of your hash function I expect MD5's weaknesses become worthless to the attacker, although I might be wrong. – Steve Jessop Mar 17 '15 at 11:07
  • 1
    This answer is oriented toward password hashes, and does not apply well to the question as it stands after edit. – fgrieu Mar 17 '15 at 14:56
  • 2
    @fgrieu While this might not answer the question directly, I think it is a useful addition to your answer, since readers might want to use such a construction for password hashes, and now know not to do so. – Paŭlo Ebermann Mar 17 '15 at 21:00
  • 1
    I find your answer makes a lot of sense, except that MD5 only provides outputs of exactly 128 bits (as far as I'm aware?), so the talk of truncation is at least half wrong. I would normally just edit the answer, except that the rest of your displayed prowess has me worried I've missed something obvious. – Patrick M Mar 18 '15 at 07:55
21

From the standpoint of collision-resistance (finding two colliding messages) and second-preimage-resistance (finding a different message colliding with a given one), the concatenation of multiple hashes is at least as secure as the strongest of the hashes (Proof: for any of the two properties, any attack that breaks the concatenation can be turned into an attack that breaks each hash. For example, hypothetical colliding messages for the 640-bit MD5||SHA-512 also collide for SHA-512, thus if SHA-512 is collision-resistant, then MD5||SHA-512 is collision-resistant).

From the standpoint of finding a message hashing to a given value ([first-]preimage resistance), or finding an unknown (possibly short or low-entropy) message known to hash to a given value (the situation in password hashing), it is prudent to consider that the concatenation of multiple hashes is insecure (possibly much worse than any of the hashes). We can construct hashes so that each is individually resistant to finding an unknown message (chosen randomly in some set too large to explore) from its hash, but the concatenation is totally weak against that.

For many iterated hashes including all Merkle-Damgård hashes (thus MD5, SHA-1, the various SHA-2), using the concept of multicollisions, it can be shown that the concatenation of multiple hashes is not much more secure against collisions than the strongest of the hashes. See Antoine Joux: Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions, in proceeding of Crypto 2004.

The above article strongly suggests that among using 128 bits from SHA-512 plus 128 bits from MD5 (or) using 256 bits from SHA-512, the later is much safer. The only assurance we have about the former is that it is not worse than SHA-512 truncated to 128 bits, which can be attacked in about 264 hashes, when we expect the later to require about 2128 hashes to attack.

fgrieu
  • 1,231
  • 8
  • 19
  • 2
    "the concatenation of multiple hashes is at least as secure as the strongest of the hashes" --> Did you meant "is at the most"? Or you meant that "the concatenation of multiple hashes is at least as secure as the weakest of the hashes"? – Ismael Miguel Mar 17 '15 at 16:56
  • @Ismael Miguel: I did mean at least as secure in the first paragraph. As in: the concatenation of MD5 and (full) SHA-512 is at least as secure as (full) SHA-512, WRT collision-resistance (also second-preimage-resistance). Proof: hypothetical colliding messages for the 640-bit MD5||SHA-512 also collide for SHA-512, thus if SHA-512 is secure, then MD5||SHA-512 is secure. – fgrieu Mar 17 '15 at 17:18
  • A "Yes, I meant that" would be enough. But you are right about that. That should be added to the answer itself. – Ismael Miguel Mar 17 '15 at 20:15
  • @Ismael Miguel: good suggestion, done. – fgrieu Mar 17 '15 at 20:28
  • 1
    This may be technically correct, but for cracking passwords it's highly misleading. Concatenating hashes without any obfuscation is essentially giving an attacker more information and more options. – Patrick M Mar 18 '15 at 07:58
  • @Patrick M: indeed, and that's what I'm saying in the second paragraph. I'll make it clearer that it applies to password hasing (even though the question has been edited to make it clear password hashing is not the focus). – fgrieu Mar 18 '15 at 08:30
12

Referring to the first few paragraphs in this article: https://crackstation.net/hashing-security.htm

If you are thinking of writing your own password hashing code, please don't!

I understand that you don't consider writing your own hashing code, but rather use multiple of standardised ones to seamingly increase security. This will, however, give you no or little advantage.

Ratchet Freak (https://softwareengineering.stackexchange.com/questions/115406/is-it-more-secure-to-hash-a-password-multiple-times) points out that your hash function would only be as secure as your weakest function in the chain. Weaker algorithms are also known to collide more often, and thus potentially increasing the risk of collitions - making your overall security less.

Furthermore, this will as mentioned decrease entropy instead of increasing it.

There is also no reason why you would rather go ahead and implement two different methods and combine them - in practice there will be none or very little gain. 256 bit SHA-512 will be stronger than two merged 128 bit hashes - since the entropy is increased and overall stronger. MD5 will contain a lower entropy and therefore decreasing the security.

EDIT

To clarify the answer:

SHA-512 is stronger than MD5 (for the bit count that was mentioned, 256 and 128) for the mentioned case. MD5 is more likely to result in collisions. (https://stackoverflow.com/questions/2117732/reasons-why-sha512-is-superior-to-md5)

128 bit MD5 and 128 bit SHA512 vill only be as collision-resistant as the weaker of the two - the 128 bit MD5. (https://stackoverflow.com/questions/7407835/512-bit-hash-vs-4-128bit-hash)

EDIT

As pointed out by fgrieu down in the comments, this turns out to be not true.

As an illustration, we can find new MD5 collisions at high rate, but there is no known collision for SHA512 truncated to 128 bits, thus no known collision for the 256-bit concatenation of MD5 and SHA512 truncated to 128 bits. -fgrieu

The only way I could see this being beneficial is protection against Rainbow Tables, which probably wouldn't be prepared for your hash concentration. Instead of concentrating the hashes, salts should be used. But since you were not interested in the password security side of things ("I don't have any particular intent, and I was hoping to treat it as simply another hash function")

Points brought up in the comments.

  • To reduce collisions, concatenation is better (because you need collisions on both hashes simultaneously). To reduce preimage attacks, chaining is better (because you need to reverse both hashes in sequence). - Ben Voigt
Alex
  • 500
  • 3
  • 14
  • The answer you linked to concerns function chaining, not concatenation, and only addresses collision resistance. "256 bit SHA-512 will be stronger than two merged 128 bit hashes" was the answer I was hoping for, but do you have a reference you're drawing from? – watchowl Mar 16 '15 at 14:27
  • @BenVoigt: Even if one had a preimage attack that could, for any has value, quickly yield a 40-byte string whose hash would match it, I see no way one could use that to crack the described scheme. I see it as being excessively-susceptible to brute-force attacks (and would recommend against it for that reason), but I don't see how the ability to generate preimages for one of the hashes would buy an attacker anything. – supercat Mar 16 '15 at 17:04
  • @supercat: That's exactly the point I've been trying to make in comments to Xander's answer. – Ben Voigt Mar 16 '15 at 17:06
  • The answer has been updated to improve the main point and sources has been added. I also included @BenVoigt's first comment since it was something I did not think about the first time. – Alex Mar 16 '15 at 18:26
  • @supercat: a break of MD5 might still buy you a little bit, though. If you have an MD5 pre-image attack that generates lots of pre-images, then the questioner's scheme might compare poorly to SHA-512 alone, since he's replaced the problem of finding a pre-image to a 256-bit SHA-512 with the problem of finding a pre-image to a 128-bit partial SHA-512 from among the outputs of your MD5-cracker. This may or may not be easier, I don't think it's certain either way. If no attacks are known on SHA-512 then it reduces the brute force search, but 128 bits is still large :-) – Steve Jessop Mar 17 '15 at 11:11
  • In the article you link one of the answers says 'The only sure thing about the concatenation of four hash functions is that the result will be no less collision-resistant than the strongest of the four hash functions.' That seems intuitively obvious to me but unless I misunderstand something, you're claiming that isn't true. – richardb Mar 17 '15 at 13:59
  • @richardb I'm not at all arguing whether or not that is true, but op mentioned using two parts, one sha512 and one MD5 one, if I didn't misread the answer you're referring to it mentioned hashes with the same function, a concatenation of two sha512 hashes for example. The main point I wanted to establish however is that bringing another algo like MD5 could reduce entropy and therefore making it less secure for some purposes. The pre-image attacks mentioned above is out of my knowledge so I won't discuss them. – Alex Mar 17 '15 at 14:04
  • The "128 bit MD5 and 128 bit SHA512 will only be as collision-resistant as the weaker of the two" part of this answer is wrong. As an illustration, we can find new MD5 collisions at high rate, but there is no known collision for SHA512 truncated to 128 bits, thus no known collision for the 256-bit concatenation of MD5 and SHA512 truncated to 128 bits. – fgrieu Mar 17 '15 at 15:10
  • @fgrieu The answer has been updated to remove the false piece of information that I had found. – Alex Mar 17 '15 at 16:11