This not only makes sure we don't miss expiration, but also ensures we don't forget to configure any part of the application.
We had a situation where the cert was replaced, but the file was placed in the wrong path and was not actually used by the app. Having the app report on what is actually in use is the best way to prevent this from ever happening.
After one scrambling emergency with a cert expiring in the middle of the day, a constant check, with warnings and alerts a couple of weeks before expiry, turned a matter of defensive organization into something trivial.
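For anyone wanting the same safety net: the check can be as small as a cron job around `openssl x509 -checkend`. This is a minimal sketch, assuming `openssl` is installed; the host name, threshold, and alert action are placeholders to adapt to your environment.

```shell
#!/bin/sh
# check_expiry HOST DAYS
# Fetches the leaf certificate HOST currently serves and warns if it
# expires within DAYS days. -checkend exits non-zero when the cert
# expires within the given number of seconds.
check_expiry() {
  echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null \
    | openssl x509 -noout -checkend $(( $2 * 86400 )) >/dev/null \
    || echo "WARNING: certificate for $1 expires within $2 days"
}

# Run daily from cron and wire the output into your alerting, e.g.:
# check_expiry example.com 14
```

Because it asks the live server, this also catches the "new cert deployed to the wrong path" failure mode: you alert on what is actually served, not on what is on disk.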
What "should" happen is that no certificate should be issued with an expiration date later than the issuing certificate. Then as the issuing certificate gets closer to expiration, a new one, with a new key pair, should be created and this new certificate should sign subordinate certificates.
It's likely because it was issued 20 years ago. People have been using it for 20 years and no-one realized it was about to stop working.
So, you're saying that "I'm not going to be working here anymore by then... hahahaha" isn't well-chosen?
Source: BTDT 3 times in 7 years, and it was all with "Big Enterprise" grade products.
If I could predict 20 years into the future, I wouldn't be in the SRE business.
But your statement is really pointing out that nobody should be making long-lived certificates.
In other scenarios where one would want to issue fresh certificates (receiving Ops control from other orgs, credentials refresh for whatever reason), one can still do so without waiting for the current certificates to expire.
Don't be part of the death of internet discourse.
The certificate reseller advised my customer that it was okay to include the cross-signing cert in the chain, because browsers will automatically ignore it once it expires, and use the Comodo CA root instead.
And that was true for browsers, I guess. But my customer also has about 100 machines in the field that use cURL to access their HTTPS API endpoint. cURL will throw an error if one of the certs in the chain has expired (it may depend on the order; I don't know).
Anyway, 100 machines went down and I had a stressed out customer on the phone.
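A post-mortem trick that would have caught this: look at the chain the server actually sends, not the one you think you configured. A rough sketch using `openssl s_client`; the host name and the `/tmp` paths are placeholders.

```shell
# Write every certificate the server presents to numbered files, then
# print each one's subject, issuer, and expiry, so an expired cross-sign
# (like AddTrust) stands out.
dump_chain() {
  rm -f /tmp/chain-*.pem
  echo | openssl s_client -showcerts -servername "$1" -connect "$1:443" 2>/dev/null \
    | awk '/-----BEGIN CERTIFICATE-----/{n++} n{print > ("/tmp/chain-" n ".pem")}'
}

print_chain() {
  for f in /tmp/chain-*.pem; do
    openssl x509 -in "$f" -noout -subject -issuer -enddate
    echo "---"
  done
}

# Usage: dump_chain api.example.com && print_chain
```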
Earlier this year I added SSL verification to a website uptime monitoring service I run (https://www.watchsumo.com/docs/ssl-tls-monitoring) and it wasn't anywhere near as simple as I thought it would be. There are so many edge cases regarding verification, and languages usually don't expose the full errors in exceptions, and then you have errors like this which only affect a subset of clients.
Let me know if I can help with more info.
1. The CA must misissue a cert.
2. The misissued cert is used by a malicious party to impersonate you.
3. Every user (your users) must prove their damages and claim individually.
4. There might have been a low maximum per-user claim, but I can't remember.
I'd be amazed if there's a single person on the internet who's been paid out by that warranty.
If certificate revocation doesn't work, then certs need to expire super frequently to limit potential damage if compromised.
A certificate that expires in 20 years does absolutely nothing for security compared to a certificate that never expires. Odds are that in 20 years the crypto will need to be updated anyways, effectively revoking the certificate.
This is especially true now that we have OCSP stapling. From a security perspective, a short-lived certificate is exactly equivalent to a long-lived certificate with mandatory OCSP stapling and a short-lived OCSP response, but the latter is much more complicated.
And in this case, since it's a root, it goes even further than that. Root CAs can't be revoked anyway, so if they're compromised, a software update to distrust them is required. There's really not a good reason for them to expire at all.
Expiration is not “just” about cryptographic risk either; there are plenty of operational risks. If you’re putting your server on the Internet, and exposing a service, you should be worried about key compromise, whether by hacker or by Heartbleed. Lifetimes are a way of expressing, and managing, that risk, especially in a world where revocation has a host of failure modes (operational, legal/political, interoperability) that may not be desirable.
As for Root expiration, it’s definitely more complicated than being black and white. It’s a question about whether software should fail-secure (fail-closed) or fail-insecure (fail-open). The decision to trust a CA, by a software vendor, is in theory backed by a variety of evidence, such as the CA’s policies and practices, as well as additional evidence such as audits. On expiration, under today’s model, all of those requirements largely disappear; the CA is free to do whatever they want with the key. Rejecting expired roots is, in part, a statement that what is secure now can’t be guaranteed as secure in 5 years, or 10 years, or 6 months, whatever the vendor decides. They can choose to let legacy software continue to work, but insecurely, potentially laying the accidental groundwork for the botnets of tomorrow, or they can choose to have legacy software stop working then, on the assumption that if they were receiving software updates, they would have received an update to keep things working / extend the timer.
Ultimately, this is what software engineering is: balancing these tradeoffs, both locally and in the broader ecosystem, to try and find the right balance.
Once the slowdown is too big, someone will notice and have a look.
Certificate expiration means I don't need to worry about that second case.
At the core, this is not a problem with the server, or the CA, but with the clients. However, servers have to deal with broken clients, so it’s easy to point at the server and say it was broken, or to point at the server and say it’s fixed, but that’s not quite the case.
I discussed this some in https://twitter.com/sleevi_/status/1266647545675210753 , as clients need to be prepared to discover and explore alternative certificate paths. Almost every major CA relies on cross-certificates, some even with circular loops (e.g. DigiCert), and clients need to be capable of exploring those certificates and finding what they like. There’s not a single canonical “correct” certificate chain, because of course different clients trust different CAs.
Regardless of your CA, you can still do things to reduce the risk. Using tools like mkbundle in CFSSL (with https://github.com/cloudflare/cfssl_trust ) or https://whatsmychaincert.com/ helps configure a chain that will maximize interoperability, even with dumb and old clients.
Of course, using shorter-lived certificates, and automating them, also helps prepare your servers, by removing the toil from configuring changes and making sure you pick up updates (to the certificate path) in a timely fashion.
Tools like Censys can be used to explore the certificate graph and visualize the nodes and edges. You’ll see plenty of sites rely on this, and that means clients need to not be lazy in how they verify certificates. Or, alternatively, that root stores should impose more rules on how CAs sign such cross-certificates, to reduce the risk posed to the ecosystem by these events.
Google has a healthy Patch Rewards program ( https://www.google.com/about/appsecurity/patch-rewards/ ) that rewards patches to a variety of Open Source Projects.
Google also funds a variety of projects through the Core Infrastructure Initiative ( https://www.coreinfrastructure.org/ ), which OpenSSL is part of https://www.coreinfrastructure.org/announcements/the-linux-f...
Top offender so far seems to be GnuTLS.
As a general rule of thumb:
1) You don't need to add root certificates to your certificate chain
2) You especially don't need to add expired root certificates to the chain
For additional context and the ability to check using `openssl` what certificates you should modify in your chain, I found this post useful: https://ohdear.app/blog/resolving-the-addtrust-external-ca-r...
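As a rough self-check along those lines (not a substitute for the linked post), you can split a configured chain file into its certificates and flag anything expired or self-signed — a self-signed cert is a root, which by rule 1 shouldn't be served at all. A sketch assuming `openssl`; `chain.pem` in the usage line is a placeholder for your bundle's path.

```shell
# audit_chain FILE
# Splits FILE into individual certificates and prints one line per cert,
# flagging expired entries and self-signed (root) entries.
audit_chain() {
  rm -f /tmp/audit-*.pem
  awk '/-----BEGIN CERTIFICATE-----/{n++} n{print > ("/tmp/audit-" n ".pem")}' "$1"
  for f in /tmp/audit-*.pem; do
    subj=$(openssl x509 -in "$f" -noout -subject)
    issuer=$(openssl x509 -in "$f" -noout -issuer)
    flags=""
    # -checkend 0 exits non-zero if the cert is already expired.
    if ! openssl x509 -in "$f" -noout -checkend 0 >/dev/null; then
      flags="EXPIRED"
    fi
    # subject == issuer means self-signed, i.e. a root.
    if [ "${subj#subject=}" = "${issuer#issuer=}" ]; then
      flags="$flags SELF-SIGNED(root)"
    fi
    echo "${flags:-ok} | $subj"
  done
}

# Usage: audit_chain /path/to/chain.pem
```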
If some of your clients don't have the UserTrust CA, but do have the AddTrust CA, up until today, you probably wanted to include the UserTrust CA cert signed by AddTrust. Clients with the UserTrust CA should see that the intermediate cert is signed by UserTrust and not even read that cross signed cert, but many do see the cross signed cert and then make the trust decision based on the AddTrust CA.
It's hard to identify clients in the TLS handshake to give them a cert chain tailored to their individual needs; there are some extensions for CA certs supported, but they're largely unused.
(At Cronitor, we saw about a 10% drop in traffic, presumably from those with outdated bundles)
Since you don't control the clients in any way, it might be that there are clients that haven't updated their local certificate stores in ages and don't yet trust the new root certificates.
https://status.heroku.com/incidents/2034
Anyone that was already connected was able to continue accessing the sites but new connections failed. This mostly affected web users.
Our main app server continued to crank along thankfully (also on Heroku) and that kept the mobile traffic going which is 90% of our users.
Edit: adding Heroku ticket link
TIL that I can buy a cert that expires in a year that is signed by a root certificate that expires sooner. Still not sure WHY this is the case, but this is definitely the case.
(The reason for the difference being that browsers stay up to date; many old client systems do not.)
We ended up getting a new cert from a different provider.
Found it ironic that the top of their page advertises "Security Monitoring now available".
Our org is currently divided over further commitment to the service, or leaving them entirely. They've made it very hard to argue in their favor.
Edit: for https://www.circuitlab.com/ we saw all Stripe webhooks failing from 4:08am through 12:04pm PDT today with "TLS error". Since 12:04pm (5 minutes ago), some webhooks are succeeding and others are still failing.
Edit 2: since 12:17pm all webhooks are succeeding again. Thanks Stripe!
$ lynx -dump https://wiki.factorio.com/Version_history
Looking up wiki.factorio.com
Making HTTPS connection to wiki.factorio.com
SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
Retrying connection without TLS.
Looking up wiki.factorio.com
Making HTTPS connection to wiki.factorio.com
SSL callback:certificate has expired, preverify_ok=0, ssl_okay=0
Alert!: Unable to make secure connection to remote host.
lynx: Can't access startfile https://wiki.factorio.com/Version_history

The first thing that'll happen is Let's Encrypt's systems will, by default, tell clients to present certificate chains which don't mention DST Root CA X3. Lots of systems will, as a result, automatically switch to such a chain when renewing, and you'll see a gentle trickle of weird bugs over ~90 days starting this summer unless Let's Encrypt moves the date.
Those bugs will be from clients that somehow in 2020 both didn't trust the ISRG root and couldn't imagine their way to using a different trust path not presented by the server. Somebody more expert in crap certificate verification software can probably tell you exactly which programs will fail and how.
Then there will be months of peace in which seemingly everything is now fine.
Then in September 2021 the other shoe drops. Clients that didn't trust ISRG but had managed to cobble together their own trust path to DST Root CA X3 now notice it has expired on services which present a modern chain or no chain at all.
Those sites which deliberately used the legacy DST Root CA X3 chain to buy a few more months of compatibility likewise see errors, but hopefully they at least knew this was coming and are expecting it.
But there are also sites using crappy ACME clients that didn't obey the spec. They've hard-coded DST Root CA X3 not because they wanted compatibility at all costs and are prepared for it to end in September, but because they just pasted together whatever seemed to work, without obeying the ACME spec, and so even though Let's Encrypt's servers have told them not to use that old certificate chain, they aren't listening. Those services now mysteriously break too, even in some relatively modern clients that would trust ISRG, because the service is presenting a chain that insists on DST Root CA X3 and they aren't smart enough to ignore that.
On the upside, lots of Let's Encrypt certs are just to make somebody's web site work, and an ordinary modern web browser has been battle-tested against this crap for years, so it will soldier on.
Caused us some connection issues that required a restart of both our clients and the rabbitmq cluster.
That's quite bad, as I tried to do a clean re-install of jitsi-meet, and now I have no installation at all any more.
While Android 2.3 Gingerbread does not have the modern roots installed and relies on AddTrust, it also does not support TLS 1.2 or 1.3, and is unsupported and labelled obsolete by the vendor.
If the platform doesn’t support modern algorithms (SHA-2, for example) then you will need to speak to that system vendor about updates.
I find things like that really, really irritating. Crypto is basically maths, and a very pure form at that, so it should be one of the most portable types of software in existence. Computers have been doing maths since before they were machines. Instead, the forced-obsolescence bandwagon has made companies take this very pure and portable technology and tie it to their platform's versions, using the "security" argument to bait and coerce users into taking other unwanted changes, and possibly replacing hardware that is otherwise functional (and, as mentioned earlier, perfectly capable of executing the relevant code), along with all the ecological impact that has. Adding new root certificates, at least for PCs, is rather easy due to their extreme portability, but I wish the same could be said of crypto algorithms/libraries.
I think Chrome for Android did include TLS 1.2 at least, when it was shipping for Gingerbread.
Perhaps a coincidence, but also likely that their cert expired.
https://www.reddit.com/r/linux/comments/gshh70/sectigo_root_...
I think a large portion of online communications have been affected today.
We need to do something.
This does not change the CA paradigm, but removes many operational issues.