> By the time the comparison was html5 vs. XHTML2 instead of HTML 4 vs XHTML1
The relevant comparison is html5 in its HTML serialization vs html5 in its XML serialization. The latter works in every single browser, and has since IE9 shipped in 2011. No one uses it.
> If you know of a single major site (not somebody's little side project) that uses the XHTML mime type
There aren't any, because I suspect people building such sites all discovered the same thing: ensuring well-formedness is _hard_ in practice, and if it's required for the page to be shown at all, then your page will fail to be shown every so often. And no one wants to deal with that.
Back when some people were in fact trying to use XHTML on the web, every so often you'd run into this on some site that sent XHTML based on "Accept" headers. You'd load the site in Mozilla (suite, then Firefox when it came into being) and get an XML parsing error.
There were two common sources of this problem. The first was someone editing a template and forgetting to update the closing tags to match the opening ones. Server-side enforcement of template well-formedness solves this, of course, but it means you can't put your start and end tags in different parts of the template or in different templates, which people wanted to do.
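To make that trade-off concrete, here is a rough sketch of what such a server-side check might look like, using Python's standard-library XML parser. The `is_well_formed` helper and the `header`/`footer` template strings are made up for illustration, not taken from any real CMS.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if markup parses as a single well-formed XML document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Hypothetical split templates: the start tags live in one file and the
# matching end tags in another, which is exactly what people wanted to do.
header = "<html xmlns='http://www.w3.org/1999/xhtml'><body><div>"
footer = "</div></body></html>"

print(is_well_formed(header))                        # header checked alone
print(is_well_formed(footer))                        # footer checked alone
print(is_well_formed(header + "<p>hi</p>" + footer)) # assembled page
```

Neither half can pass a per-template check, even though the assembled page is fine. So you either check only the final output (too late to blame a specific template) or you ban split templates.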
The second was insertion of content you don't control, whether it's user-contributed, or coming from some other team (e.g. a content-production team on a news site feeding their bits into the CMS templates), or coming via a content provider like the AP or whatnot. You can mitigate this by using a fully DOM-based workflow and serializing only when you put things on the wire, instead of pasting together strings. But now you have the problem of producing a DOM from whatever non-well-formed garbage you were handed. Yes, you can just reject non-well-formed input, but if you have no leverage over the producer of that input, that just means you can't do your job. OK, so maybe you have a more liberal parser on the input end and then ensure everything internally operates on trees, not text.
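A minimal sketch of that "liberal parser in, tree inside, XML out" pipeline, using only the Python standard library. The `LenientTreeBuilder` class, the `to_xml_fragment` function, and the wrapping `div` are my own names and choices for illustration. It also does nothing like the real HTML parsing algorithm's error recovery, and it does not validate tag or attribute names, so treat it as a toy.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never have an end tag in HTML (partial list, for the sketch).
VOID = {"br", "hr", "img", "input", "meta", "link"}

class LenientTreeBuilder(HTMLParser):
    """Build an ElementTree from tag soup, tolerating missing/stray end tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("div")   # assumed wrapper for the fragment
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        el = ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})
        if tag not in VOID:
            self.stack.append(el)

    def handle_startendtag(self, tag, attrs):
        ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open element;
        # a stray end tag with no match is silently ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                return

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):                 # text after a closed child goes in its tail
            parent[-1].tail = (parent[-1].tail or "") + data
        else:
            parent.text = (parent.text or "") + data

def to_xml_fragment(dirty: str) -> str:
    """Parse possibly broken HTML and serialize it as well-formed XML."""
    builder = LenientTreeBuilder()
    builder.feed(dirty)
    builder.close()
    return ET.tostring(builder.root, encoding="unicode")

print(to_xml_fragment("<p>Hello <b>world<br>more</p> & stuff"))
```

The point is only that once the input is a tree, the serializer guarantees matched tags and escaped text on the way out. The hard part, deciding what the garbage "meant", is still all in the parser.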
But the upshot is that you end up putting in a lot more effort, and the benefits are not entirely obvious (at least not to your management; there are certainly obvious anti-XSS benefits to having good control of what tokens end up in your output and where escaping happens, etc). So the path of least resistance is to just not go there when it comes to the XHTML serialization of HTML.
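For what the anti-XSS benefit looks like in practice, here is a small illustration, again with Python's standard library. `render_comment` is a hypothetical helper; the idea is just that untrusted strings go into the tree as text and attribute values, and escaping happens in exactly one place, the serializer.

```python
import xml.etree.ElementTree as ET

def render_comment(author: str, body: str) -> str:
    """Serialize an untrusted comment; the serializer does all the escaping."""
    li = ET.Element("li", {"class": "comment", "title": author})
    li.text = body  # stored as plain text, never interpreted as markup
    return ET.tostring(li, encoding="unicode")

print(render_comment('Mallory "M"', "<script>alert(1)</script>"))
```

With string pasting, every call site has to remember to escape; with a tree, there is no way to inject a tag short of explicitly creating an element.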
> The fact that the html5 spec does not permit self-closing CDATA elements
I'm not sure why "CDATA element" is important here. You'd want self-closing <style> and <script> but not self-closing anything else? The idea doesn't even make sense for <style>, so presumably you just want self-closing <script>?