Is this an "always bad" thing, or does it depend on the structure of the messages and what you require to keep secret?
For example, I've seen a system that provides for storing secret strings in a database that works as follows:
Given plaintext string P, it forms an intermediate string of the form H:P, where H is a cryptographic hash of P. It then encrypts H:P using AES in CBC mode with a fixed IV and key. The resulting ciphertext is stored in a row in the database. The row ID is then used as a token to represent P.
The hash output is the same size as the AES block size, so H occupies exactly the first block.
It seems to me that this is functionally equivalent to this scheme, which does not use a fixed IV:
For plaintext string P, generate an IV by taking a hash of P, XORing it with a fixed constant, and running the result through AES in ECB mode using our key. The output is the IV. Then encrypt the string :P using AES in CBC mode with that IV and our key. Store the result in the database, and also store the IV.
I realize that two identical plaintext messages will result in identical ciphertext, but in this application that is OK. If the system is asked to store a string it has previously stored, we want to recognize that the string is already stored and use the same token to represent it as before.
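The equivalence claimed above can be checked with a small sketch. To keep it runnable with only the standard library, `toy_block_encrypt` is a hypothetical 4-round Feistel permutation standing in for AES single-block encryption (it is not secure), and `digest16` is an assumed 16-byte hash (SHA-256 truncated to one block). The names, the padding choice, and the key handling are all illustrative, not the real system's.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES on one block: a 4-round Feistel network. NOT secure."""
    assert len(block) == BLOCK
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right


def pad(data: bytes) -> bytes:
    # PKCS#7-style padding (an assumption; the original does not say).
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    padded = pad(plaintext)
    for i in range(0, len(padded), BLOCK):
        prev = toy_block_encrypt(key, xor(padded[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def digest16(p: bytes) -> bytes:
    # Assumed hash with a one-block output: SHA-256 truncated to 16 bytes.
    return hashlib.sha256(p).digest()[:BLOCK]


def scheme_fixed_iv(key: bytes, fixed_iv: bytes, p: bytes) -> bytes:
    # First scheme: CBC-encrypt H:P under a fixed IV.
    return cbc_encrypt(key, fixed_iv, digest16(p) + b":" + p)


def scheme_derived_iv(key: bytes, constant: bytes, p: bytes):
    # Second scheme: IV = E(hash(P) XOR constant), then CBC-encrypt :P.
    iv = toy_block_encrypt(key, xor(digest16(p), constant))
    return iv, cbc_encrypt(key, iv, b":" + p)
```

When the fixed constant equals the fixed IV, the first ciphertext block of the first scheme is exactly the derived IV of the second, so `scheme_fixed_iv(key, c, p)` should equal the IV concatenated with the ciphertext returned by `scheme_derived_iv(key, c, p)`.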
* You can design around needing a random IV by manipulating keys and message contents; as you note, the IV is a formalism. CBC doesn't simply explode if you repeat an IV.
* I am generally skeeved out by systems that use hashes of messages as parameters. Another common species of custom crypto scheme does a similar trick but uses the hash as a key.
* This also gets into the subtleties of IVs and nonces. You want an IV to be unpredictable, and you want a nonce to be unique. An IV that is the hash of a message is probably predictable (and perhaps testably so). The way CBC tends to fail with IV problems is that it turns back into ECB mode; the way ECB fails is by allowing attackers to carefully pick blocks based on combinations of known and unknown plaintext.
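A rough illustration of the "turns back into ECB" point, assuming a plain fixed IV with no hash prefix: two messages that share a first block produce the same first ciphertext block, just as they would under ECB. The block cipher here is a hypothetical Feistel stand-in for AES so the sketch runs with the standard library alone; it is not secure, and the messages are made up.

```python
import hashlib
import hmac

BLOCK = 16  # AES block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES on one block: a 4-round Feistel network. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right


def cbc_encrypt_blocks(key: bytes, iv: bytes, blocks):
    # CBC chaining over already block-aligned input; returns ciphertext blocks.
    prev, out = iv, []
    for b in blocks:
        prev = toy_block_encrypt(key, xor(b, prev))
        out.append(prev)
    return out


key = b"k" * 16
fixed_iv = bytes(BLOCK)  # the same IV reused for every message

# Two hypothetical messages with an identical first 16-byte block.
msg_a = [b"user=alice;role=", b"admin;padding..."]
msg_b = [b"user=alice;role=", b"guest;padding..."]

ct_a = cbc_encrypt_blocks(key, fixed_iv, msg_a)
ct_b = cbc_encrypt_blocks(key, fixed_iv, msg_b)

# The shared prefix is visible in the ciphertext; the blocks after it diverge.
print(ct_a[0] == ct_b[0], ct_a[1] == ct_b[1])
```

Prefixing each message with its own hash, as in the scheme under discussion, makes the first block differ whenever the messages differ, which is why it avoids this particular leak.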
It would depend a lot on the system. I'm also not sure what the win here is in deriving the IV from the message, as opposed to just making it random.
If a random IV were used, then there would have to be some other mechanism to detect attempts to store duplicate strings. Perhaps a table that maps hashes of already stored plaintext strings to the row ID of the encrypted string.
I don't like that, because if someone is able to steal a copy of the database, they have hashes of all the stored strings. I guess that could be dealt with by using HMAC for these hashes instead of just a plain hash.
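A minimal sketch of that HMAC idea, assuming an in-memory dict in place of the database tables and a caller-supplied `encrypt` function that would use a random IV. `TokenStore`, `index_key`, and `encrypt` are hypothetical names, and keeping the index key separate from the encryption key is an assumption on my part.

```python
import hashlib
import hmac


class TokenStore:
    """Maps HMAC(index_key, plaintext) to a row ID so duplicates are detected."""

    def __init__(self, index_key: bytes):
        self._index_key = index_key  # assumed to be kept outside the database
        self._index = {}             # HMAC tag -> row ID
        self._rows = {}              # row ID -> ciphertext
        self._next_id = 1

    def _tag(self, plaintext: bytes) -> bytes:
        # Keyed hash: without index_key, a stolen table cannot be checked
        # against guessed plaintexts the way a plain hash could be.
        return hmac.new(self._index_key, plaintext, hashlib.sha256).digest()

    def store(self, plaintext: bytes, encrypt) -> int:
        tag = self._tag(plaintext)
        if tag in self._index:
            # Already stored: hand back the same token as before.
            return self._index[tag]
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = encrypt(plaintext)
        self._index[tag] = row_id
        return row_id
```

Storing the same string twice returns the same row ID, while the ciphertext itself can still come from a random-IV encryption.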
There is a lot less discussion on the net (at least that I could find) of using IVs derived from the message than I would have expected, since it has to be about the first thing anyone thinks of when they want identical plaintext to produce identical ciphertext. The most direct discussion seems to be in Thomas Pornin's comments on his answer in this Stack Overflow discussion: http://stackoverflow.com/questions/4608489/how-to-pick-an-ap...