Dozens of amateur and professional cryptographers signed up last week for the United States' first open competition to create a secure algorithm for generating hashes - the digital fingerprints widely used in a variety of security functions. The contest, run by the National Institute of Standards and Technology (NIST), seeks to …
Is it really much of a problem?
It's been shown mathematically that it's possible to produce two documents with the same hash, but how much of a problem is this in practice? What are the chances of producing a collision document that is meaningful?
"an attacker could, for example, create a modified version of a contract that appears to match - according to the hash - the original digitally-signed document."
If one document says "The party of the first part agrees to ..." and the other document says "N"(*djWf)2C4fG4&*TR(^J)£MLlmPaf03eTwr8M497km" is this really a problem?
I have got one that doesn't crash
Just encrypt the file and use that as the key!
Come on, these crashes are just mathematically inevitable. If you read the link to these 'practical attacks' in the OP :) and read to the bottom, you will see the statement:
'If data can be added to a file (software update or email message) so that the modified message is intelligible and matches the hash of the original message then the impact would be devastating. Things are nowhere near as serious as that.'
The crash will happen by the sheer definition of hashing (it wasn't originally intended for security). So yeah whilst this is probably a good thing to draw out new ideas, it is not exactly needed.
My hash is secure...
it's in a packet of fishcakes at the back of the freezer!
Mine's the one with the King size Rizlas in the pocket.
@AC Re:I have got one that doesn't crash
Actually no, just using an encrypted file won't work: a hash function requires only the file/data to create the hash, whereas an encryption algorithm needs a key. In any case encryption is much, much slower than hashing, which is not really a good thing if speed is important.
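To make the "no key needed" point concrete, here's a rough sketch using Python's standard library. The message and the key are made up for illustration, and HMAC is standing in as an example of a keyed primitive (it is not encryption); the only point is that the plain hash depends on the data alone, while anything keyed also depends on a secret.

```python
import hashlib
import hmac

# Hypothetical message, borrowed from the contract example in this thread.
data = b"The party of the first part agrees to ..."

# Unkeyed hash: anyone holding the data can recompute the same digest.
digest = hashlib.sha256(data).hexdigest()

# Keyed construction (HMAC, used here as a stand-in for "needs a key"):
# the result depends on the secret as well as on the data.
key = b"hypothetical shared secret"
tag = hmac.new(key, data, hashlib.sha256).hexdigest()

print("sha256:", digest)
print("hmac  :", tag)
```

Swap the key and the tag changes while the plain digest stays put, which is why "encrypt it and use that as the hash" isn't a drop-in replacement.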
Re: is it really much of a problem?
yes, perhaps a human readable document is a bad example but executables aren't:
Re: is it really much of a problem?
Yes, if the two documents say
"I agree to <something>"
and it's possible to make another document which causes a hash collision that says
"I agree to <something else> <MB or few of generated stuff here>"
and you can hide that generated stuff from view somehow (hidden text in Word, metadata somewhere, etc etc).
Cryptographic hashes are only useful if it's thought to be infeasible to generate a collision. That it will be a decade or so for the recent announcement to bear fruit means that our current algorithms have to survive a fair few years of attempts to break them, so it's only prudent to start looking for SHA-3 or whatever it'll end up being called. As the article mentioned, some fairly widely used hashing algorithms are already showing signs of weakness, and when you get things like long-lived digital certs which rely on these hashes being unbreakable, well that's when it gets worrying.
@ Chris Richards
As I understand that link, it's saying that an attacker could produce two programs or PostScript files with arbitrarily different behaviour but the same hash. But from my reading of it they couldn't produce an arbitrary file with the same hash as somebody else's program or file. So presumably they still couldn't forge an existing document or program.
I had visions of small ziplock bags with a login attached. Guess that's not what they meant.
If that was some distribution of an open source project in binary format then it's perfectly simple to carry out their scheme (obviously the smaller the project the easier!) and distribute an 'evilised' binary with the same hash.
But obviously now the Republicans are (soon to be) out of the White House there is no evil left in the world, so we needn't worry :)
Ohhh come on
The MD5 attack shown in the link doesn't actually use a simple, natural hello.exe.
Both the hello.exe and the erase.exe are engineered together.
If only the erase.exe was engineered there would be a problem, but that is not the case.
So yeah there is a vector, but it would be a lot of odd things that would have to happen to create it.
And as to encryption :) Well, I didn't want to make it too obvious, but if the author signs with his private key and you use his public key to check, that is a better position to be in, as you are really trusting the author, not the code. Combining all these methods is a good idea: add in encryption via your public key and a quick check of the source code, and your security is on the up again.
As to documents saying one thing with a character change or three: no. It is part of a good hashing algorithm for security that changing only one character should make a huge change in the resulting hash, and they all currently exhibit that.
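For anyone who wants to see that one-character effect for themselves, here's a minimal sketch with SHA-256 from Python's standard library. The two messages and the helper name bit_difference are made up for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"I agree to pay 100 pounds"
altered = b"I agree to pay 900 pounds"  # exactly one character changed

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(altered).digest()

changed = bit_difference(h1, h2)
print(f"{changed} of {len(h1) * 8} digest bits differ")
```

For a well-behaved hash you'd expect roughly half of the output bits to flip, so the two digests look unrelated.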
Collisions of course occur because the hash is smaller than the data it is representing, much smaller :). So it is about permutations and the length of the hash more often than not.
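The length point is easy to try out. This is a toy sketch only: it chops SHA-256 down to 2 bytes (my assumption, purely so the search finishes instantly) and hashes made-up messages until two of them share a digest. The names tiny_hash and find_collision are hypothetical.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Toy hash: SHA-256 truncated to nbytes so collisions are cheap to find."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 2):
    """Hash successive messages until two share the same truncated digest."""
    seen = {}
    for i in count():
        msg = f"message {i}".encode()
        h = tiny_hash(msg, nbytes)
        if h in seen:
            # Two different messages, one digest: a collision.
            return seen[h], msg, i + 1
        seen[h] = msg

first, second, tries = find_collision()
print(f"collision after {tries} messages: {first!r} and {second!r}")
```

With only 65,536 possible 2-byte digests, a collision is guaranteed within 65,537 messages, and the birthday effect means it usually turns up after a few hundred. Every extra bit of digest length makes the same search that much more expensive.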
These competitions are a good thing, but they are more about bringing on the field of security as a whole, than creating the next hashing star.
Re: Is it really much of a problem?
I have two PostScript files here; both can be printed, both have the same MD5 checksum. They say something completely different.
The fact that it is possible to engineer two documents with the same checksum is bad news for cryptography (as that web page says). The ability to engineer executables with the same checksum (or Linux .rpm files or anything else that can be signed) is also bad news. If you can get inside then you can arrange to have something legitimate and all hunky-dory signed and distribute your malware through other channels. I'm sure you can remember the recent event where this possibility came uncomfortably close.
It's a serious problem.
There are four documents
A and B originally
A, B -> A`, B`
A` and B` after engineering.
The ` docs look engineered, and that's the telltale: once you start to distribute A`, people know you are up to no good.
Sure, there is a vector of attack, but you cannot really rely on hashing alone anyhow; it is just one of many checks to ensure validity. What would worry me about 'hunky-dory' is that no one looks at the source code or generates their own compiled code; at that point why bother with engineering the hash, just put the payload in unchecked code.
Now if you can do A, B -> A, B` there is a problem.
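The A, B -> A, B` case (matching a document somebody else already fixed) can be sketched with the same kind of toy. Again this assumes SHA-256 truncated to 2 bytes so it runs in a moment; the messages and the names tiny_hash and second_preimage are made up for illustration.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Toy hash: SHA-256 truncated to nbytes (illustration only)."""
    return hashlib.sha256(data).digest()[:nbytes]

def second_preimage(target: bytes, nbytes: int = 2):
    """Search for a different message whose toy hash matches target's."""
    wanted = tiny_hash(target, nbytes)
    for i in count():
        candidate = f"forged contract {i}".encode()
        if tiny_hash(candidate, nbytes) == wanted:
            return candidate, i + 1

original = b"I agree to pay 100 pounds"  # document A, fixed in advance
forged, tries = second_preimage(original)
print(f"matched A after {tries} candidates: {forged!r}")
```

On a 16-bit toy hash this takes on the order of 2^16 candidates, whereas finding any two colliding messages of your own choosing takes on the order of 2^8. For a full n-bit hash the gap is roughly 2^n against 2^(n/2), which is why attacks where both files are engineered together show up long before anyone can match an existing file.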