The bug is simple: like a lot of number-theoretic asymmetric cryptography, the core of ECDSA is algebra on large numbers modulo some prime. Algebra in this setting works for the most part like the algebra you learned in 9th grade; in particular, zero times any algebraic expression is zero. An ECDSA signature is a pair of large numbers (r, s) (r is the x-coordinate of a randomly selected curve point based on the infamous ECDSA nonce; s is the signature proof that combines x, the hash of the message, and the secret key). The bug is that Java 15+ ECDSA accepts (0, 0).
For the same bug in a simpler setting, just consider finite field Diffie Hellman, where we agree on a generator G and a prime P, Alice's secret key is `a mod P` and her public key is `G^a mod P`; I do the same with B. Our shared secret is `A^b mod P` or `B^a mod P`. If Alice (or a MITM) sends 0 (or 0 mod P) in place of A, then they know what the result is regardless of anything else: it's zero. The same bug recurs in SRP (which is sort of a flavor of DH) and protocols like it (but much worse, because Alice is proving that she knows a key and has an incentive to send zero).
The math in ECDSA is more convoluted but not much more; the kernel of ECDSA signature verification is extracting the `r` embedded into `s` and comparing it to the presented `r`; if `r` and `s` are both zero, that comparison will always pass.
It is much easier to mess up asymmetric cryptography than it is to mess up most conventional symmetric cryptography, which is a reason to avoid asymmetric cryptography when you don't absolutely need it. This is a devastating bug that probably affects a lot of different stuff. Thoughts and prayers to the Java ecosystem!
R = SB - Hash(R || A || M) A
Where R and S are the two halves of the signature, A is the public key, and M is the message (and B is the curve's base point). If the signature is zero, the equation reduces to Hash(R || A || M)A = 0, which is always false with a legitimate public key.
And indeed, TweetNaCl does not explicitly check that the signature is not zero. It doesn't need to.
However.
There are still ways to be clever and shoot ourselves in the foot. In particular, there's the temptation to convert the Edwards point to Montgomery, perform the scalar multiplication there, then convert back (doubles the code's speed compared to a naive ladder). Unfortunately, doing that introduces edge cases that weren't there before, that cause the point we get back to be invalid. So invalid in fact that adding it to another point gives us zero half the time or so, causing the verification to succeed even though it should have failed!
(Pro tip: don't bother with that conversion, variable time double scalarmult https://loup-vaillant.fr/tutorials/fast-scalarmult is even faster.)
A pretty subtle error, though with eerily similar consequences. It looked like a beginner-nuclear-boyscout error, but my only negligence there was messing with maths I only partially understood. (A pretty big no-no, but I have learned my lesson since.)
Now if someone could contact the Whycheproof team and get them to fix their front page so people know they have EdDSA test vectors, that would be great. https://github.com/google/wycheproof/pull/79 If I had known about those, the whole debacle could have been avoided. Heck, I bet my hat their ECDSA test vectors could have avoided the present Java vulnerability. They need to be advertised better.
some very popular PKI systems (many CA's) are powered by Java and BouncyCastle ...
Why "infamous"?
It makes ECDSA very brittle, and quite prone to side-channel attacks (since those can get attackers exactly such information.
Except that, of course, people don't actually do unit testing, they're too busy.
Somebody is probably going to mention fuzz testing. But, if you're "too busy" to even write the unit tests for the software you're about to replace, you aren't going to fuzz test it are you?
So, sure.
These aren't alternatives, they're complementary. I appreciate that fuzz testing makes sense over writing unit tests for weird edge cases, but "these parameters can't be zero" isn't an edge case, it's part of the basic design. Here's an example of what X9.62 says:
> If r’ is not an integer in the interval [1, n-1], then reject the signature.
Let's write a unit test to check say, zero here. Can we also use fuzz testing? Sure, why not. But lines like this ought to scream out for a unit test.
Another example is Poly1305. When you look at the test vectors from RFC 8439, you notice that some are specially crafted to trigger overflows that random tests wouldn't stumble upon.
Thus, I would argue that proper testing requires some domain knowledge. Naive fuzz testing is bloody effective but it's not enough.
Certainly, fuzz tests would help us test boundary conditions and more, but they are not a catalogue of known acceptance criteria.
For instance here the keys are going to be around 256 bits in a size, so if your fuzzer is just picking keys at random, your basically never likely to pick zero at random.
With cryptographic primitives you really should be testing all known invalid input parameters for the particular algorithm. A a random fuzzer is not going to know that. Additionally, you should be testing inputs that can cause overflows and are handled correctly ect...
But if you are in a time constrained environment where basic unit tests are skipped fuzz testing will be as well.
Sounds exactly like the kind of disconnected environment that would lead to such bugs.
(That is to say: a Critical Patch Update or a Patch Set Update. Did they really have to overload these TLAs?)
In terms of OpenJDK 17 (latest LTS), the issue is patched in 17.0.3, which was release ~12h ago. Note that official OpenJDK docker images are still on 17.0.2 as of time of writing.
https://github.com/openjdk/jdk/blob/e2f8ce9c3ff4518e070960ba...
>This is why the very first check in the ECDSA verification algorithm is to ensure that r and s are both >= 1. Guess which check Java forgot?
We're running some production services on OpenJDK and CentOS and until now there are only two options to be safe: shutdown the services or change the crypto provider to BouncyCastle or something else.
The official OpenJDK project lists the planned release date of 17.0.3 as April 19th, still the latest available GA release is 17.0.2 (https://wiki.openjdk.java.net/display/JDKUpdates/JDK+17u).
Adoptium have a large banner on their website and until now there is not a single patched release of OpenJDK available from them (https://github.com/adoptium/adoptium/issues/140).
There are no patched packages for CentOS, Debian or openSUSE.
The only available version of OpenJDK 17.0.3 I've seen until now seems to be the Archlinux package (https://archlinux.org/packages/extra/x86_64/jdk17-openjdk/). They obviously have their own build.
How can it be that this is not more of an issue? I honestly don't get how the release process of something as widely used as OpenJDK can take more than 2 days to provide binary packages for something already fixed in the code.
This shouldn't be much more effort than letting the CI do its job.
Edit: Typo.
Unfortunately, I assume that a very common case is just using the distribution provided openjdk-package and configuring the system for auto updates. So the main issue here is that a serious number of systems is relying on the patch process of the distribution to fix issues like this and they are still vulnerable at this moment.
> The official OpenJDK project lists the planned release date of 17.0.3 as April 19th, still the latest available GA release is 17.0.2
> (https://wiki.openjdk.java.net/display/JDKUpdates/JDK+17u).
I don't think there 17.0.3 ever will be available from openjdk.java.net; there's no LTS for upstream builds, and since Java 18 is out already, no further builds of 17 should be expected there. IMO, this warrants some clarification on that site though.
https://adoptopenjdk.net/upstream.html
These are the official upstream builds by the updates project built by Red Hat. Not to be confused by Red Hat Java, not to be confused by the AdoptOpenJDK/Adoptium builds. These can‘t be hosted on openjdk.java.net because they host only builds done by Oracle, not to be confused by Oracle JDK.
On the other hand, the problem that many popular server distributions like CentOS and Debian still haven't updated their Java 17 packages remains and I wonder if this is due to their own package build process or because they are waiting for an upstream process to complete.
If they actually rely on the upstream builds from openjdk.java.net that would mean that the fix will not make it to their repositories at all.
Is there any truth to this? Doesn't basically all Internet traffic rely on the security of (correctly implemented) asymmetric cryptography?
The TLS encryption is of course assumed here, but that is nothing most developers ever really touch in a way that could break it. And arguably this part falls under the "you absolutely need it" exception.
x509 certificates have several revocation mechanisms since having something being marked as "do not use" before the end of its lifetime is well understood. JWTs are not quite there.
Yes, symmetric cryptography is a lot more straightforward and should be preferred where it is easy to use a shared secret.
> Doesn't basically all Internet traffic rely on the security of (correctly implemented) asymmetric cryptography?
It does. This would come under the "unless you absolutely need it" exception.
It's a bad idea (and no one should be doing it) to continue using asymmetric crypto algorithms after that. If someone can get away with a pre-shared (symmetric) key, sometimes/usually even better, depending on the risk profiles.
AES-GCM, you mean. Let's not forget the authentication in "authenticated encryption". I'm nitpicking, but if a beginner comes here it's better to make it clear that in general, encryption alone is not enough. Ciphertext malleability and all that.
of course it would be more secure to have private physical key exchange, but that's not a practical option, so we rely on RSA or whatever
I wouldn't bet on the TLS session you're using to have that kind of half life.
The other side is the protocol itself. Protocols are delicate, and easy to mess up in catastrophic ways. On the other hand, they're also provable. We can devise security reductions that prove that the only way to break the protocol is to break one of its primitives. Such proofs are even mechanically verified with tools like ProVerif and Tamarin.
Maybe TLS is a tad too complex to have the same half life as AES. The Noise protocols however have much less room for simplification. That simplicity makes them rock solid.
"Immediately ditch RSA in favor of EC, for it is too hard to implement safely!"
This specific one was introduced with the rewriting of these parts of the code from C++ to Java, and that happened with Java 15.
Seems like someone likes to live dangerously: using libraries that haven't been updated since 2012 is a pretty risky move, especially given that if an RCE is discovered now, you'll find yourself without too many options to address it, short of migrating over to the new release (which will be worse than having to patch a single dependency in a backwards compatible manner): https://logging.apache.org/log4j/1.2/changes-report.html
Admittedly, i wrote a blog post called "Never update anything" a while back, even if in a slightly absurdist manner: https://blog.kronis.dev/articles/never-update-anything and personally think that frequent updates are a pain to deal with, but personally i'd only advocate for using stable/infrequently updated pieces of software if they're still supported in one way or another.
You do bring up a nice point about the recent influx of vulnerabilities and problems in the Java ecosystem, which i believe is created by the fact that they're moving ahead at a faster speed and are attempting to introduce new language features to stay relevant and make the language more inviting for more developers.
That said, with how many GitHub outages there have been in the past year and how many other pieces of software/services have broken in a variety of ways, i feel like chasing after a more rapid pace of changes and breaking things in the process is an industry wide problem.
I disagree. Some libraries are just rock solid, well tested and long life.
In the case of log4j 1.x vs 2.x, has there been any real motivator to upgrade? There are 2 well known documented vulnerabilities in 1.x that only apply if you use extensions.
Luckily, there's now an alternative: reload4j (https://reload4j.qos.ch/) is a maintained fork of log4j 1.x, so if you were one of the many who stayed on the older log4j 1.x (and there were enough of them that there was sufficient demand for that fork to be created), you can just migrate to that fork (which is AFAIK fully backward compatible).
(And if you do want to migrate away from log4j 1.x, you don't need to migrate to log4j 2.x; you could also migrate to something else like logback.)
This one, I gather, is actually Java's fault.
It sounds like three unrelated security bugs from totally different teams of developers.
Modules are also part of the reason why so many folks got "stuck" on java 8.
It is definitely an interesting study in the challenges of trying to make advances in a platform when a lot of the ecosystem is very much in maintenance mode and may not have a lot of eyes on the combination of existing libraries vs new versions of Java.
At some level, as long as releases add functionality, the basic rules of systemantics will guarantee unintended interactions.