Wikipedia has this to say, which seems to solve that puzzle:
"Flame was signed with a fraudulent certificate purportedly from the Microsoft Enforced Licensing Intermediate PCA certificate authority. The malware authors identified a Microsoft Terminal Server Licensing Service certificate that inadvertently was enabled for code signing and that still used the weak MD5 hashing algorithm, then produced a counterfeit copy of the certificate that they used to sign some components of the malware to make them appear to have originated from Microsoft. A successful collision attack against a certificate was previously demonstrated in 2008, but Flame implemented a new variation of the chosen-prefix collision attack."
Here's the 'real' link: http://en.wikipedia.org/wiki/Flame_%28malware%29
In some systems I've built in the past, I used MD5 as a hashing mechanism to verify firmware integrity after flashing it to memory. I don't use MD5 for anything security-related (that is handled in other ways, depending on the system), just to check transmission and memory integrity.
Is MD5 still considered fine for this, or is there a real risk that random or systematic (but unintentional) noise could generate a collision between corrupted and original data? I do believe it should suffice, but hearing all the badmouthing makes me wonder...
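For reference, the kind of non-adversarial integrity check described here might look roughly like the sketch below. The names `md5_hex` and `verify_flash` and the dummy image are made up for illustration, not taken from any real system. If MD5 behaves like a random function on unintentionally corrupted data (an assumption, but a common one), the chance that a given corruption leaves the 128-bit digest unchanged is about 2^-128.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


def verify_flash(expected_hex: str, readback: bytes) -> bool:
    """Compare the digest of the read-back image against the expected one.

    Hypothetical helper: this only guards against accidental corruption,
    not against an attacker who can choose the image contents.
    """
    return md5_hex(readback) == expected_hex


# Dummy 4 KiB "firmware image" standing in for the real one (assumption).
image = bytes(range(256)) * 16
expected = md5_hex(image)

# Simulate a single flipped bit during transmission or flashing.
corrupted = bytearray(image)
corrupted[100] ^= 0x01

ok_intact = verify_flash(expected, image)
ok_corrupted = verify_flash(expected, bytes(corrupted))
```

Here `ok_intact` is true and `ok_corrupted` is false. A CRC would also catch this kind of error; the digest just makes an undetected accidental mismatch astronomically unlikely.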
Not saying that MD5 is a good choice in this case, just that we may be blaming the wrong thing.
The MD5 algorithm is known to lack collision resistance, but whether it has preimage resistance is less certain; mathematical advances have weakened its preimage resistance, but not yet to the point of demonstrating a practical preimage attack.
False negatives would be more of an issue if the anti-virus used whitelists and someone could craft malware whose MD5 sum matches that of, say, Microsoft Excel. But that's not what the article refers to.
MD5 is only broken if you want to use it as a non-reversible hashing algorithm or as an unforgeable signature. But it's perfectly fine for many other uses.
As you can see, binaries submitted for analysis are
identified by their MD5 sums and no sandboxed execution is
recorded if there is a duplicate (thus the shorter time
delay). This means that if I can create two files with the
same MD5 sum – one that behaves in a malicious way while the
other doesn’t – I can “poison” the database of the product
so that it won’t even try to analyze the malicious sample!
So it's a technique to get the scanner to ignore a malicious binary by constructing a non-malicious one with the same MD5 sum. This would be much harder if the scanner used a SHA-1 hash or similar.
0. http://en.wikipedia.org/wiki/Preimage_attack
1. 2^123.4 complexity is not practical
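The database-poisoning trick quoted above can be sketched roughly as follows. Everything here is hypothetical: `VerdictCache`, `analyze`, and `find_benign_twin` are illustrative names, and real MD5 collisions are not computed. Instead a deliberately weakened digest (the first byte of MD5) stands in for "a hash the attacker can collide", so the collision can be found by brute force in a few hundred tries.

```python
import hashlib
from typing import Callable, Dict


def weak_digest(data: bytes) -> str:
    # Stand-in for a collision-prone hash: only the first byte of MD5.
    return hashlib.md5(data).hexdigest()[:2]


def full_digest(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def analyze(sample: bytes) -> str:
    # Hypothetical "sandbox": flags anything containing the marker b"EVIL".
    return "malicious" if b"EVIL" in sample else "clean"


class VerdictCache:
    """Skips analysis when a sample's digest has been seen before."""

    def __init__(self, digest: Callable[[bytes], str]) -> None:
        self._digest = digest
        self._seen: Dict[str, str] = {}

    def submit(self, sample: bytes) -> str:
        key = self._digest(sample)
        if key not in self._seen:
            self._seen[key] = analyze(sample)
        return self._seen[key]


def find_benign_twin(target: bytes, digest: Callable[[bytes], str]) -> bytes:
    # Brute-force a harmless payload with the same (weak) digest as target.
    want = digest(target)
    counter = 0
    while True:
        candidate = b"harmless-%d" % counter
        if digest(candidate) == want:
            return candidate
        counter += 1


malicious = b"EVIL payload"
benign = find_benign_twin(malicious, weak_digest)

poisoned = VerdictCache(weak_digest)
first_verdict = poisoned.submit(benign)      # analyzed, cached as clean
second_verdict = poisoned.submit(malicious)  # cache hit, never analyzed

honest_verdict = VerdictCache(full_digest).submit(malicious)
```

Submitting the benign twin first caches a "clean" verdict under the shared digest, so the malicious sample is never analyzed. With a digest the attacker cannot collide, the same submission is analyzed and flagged.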