The issue for xz was that the build system was neither hermetic nor sufficiently audited.
Hermetic build environments that can’t fetch arbitrary assets are a pain to maintain in this era, but they are pretty crucial in stopping an attack of this kind. The other approach is reproducible binaries, which is also very difficult.
EDIT: Well, either I responded to the wrong comment or this comment was entirely changed. I was replying to a comment that said “The issue was that people used pre-built binaries,” which is materially different from what the parent now says, though they rhyme.
(The ostensibly autotools-built files in the tarball did not correspond to the source repository, admittedly, but that’s another question, and I’m of two minds about that one. I know that’s not a popular take, but I believe Autotools has a point with its approach to source distributions.)
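One way to surface that mismatch, sketched here with synthetic files rather than a real release: diff the unpacked tarball tree against the tagged source tree, and treat anything that exists only in the tarball (generated `configure` scripts included) as code nobody reviewed in the repository. The file names below are made up for illustration.

```shell
#!/bin/sh
# Toy illustration with synthetic trees: compare a release "tarball"
# against the "repo" it supposedly came from.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/repo" "$tmp/tarball"
printf 'AC_INIT([demo],[1.0])\n' > "$tmp/repo/configure.ac"
cp "$tmp/repo/configure.ac" "$tmp/tarball/"
# The tarball additionally ships a generated configure script
printf '#!/bin/sh\n# thousands of generated lines...\n' > "$tmp/tarball/configure"
# Anything reported "Only in ...tarball" never appeared in the repository
diff -rq "$tmp/repo" "$tmp/tarball" || true
```

In the real xz case this is exactly where the malicious `build-to-host.m4` hid: present in the distributed tarball, absent from the git tree.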
The exploit used the only real solution to this problem: a binary test payload. There's no other way to do it.
Maybe you could include the source for those payloads and all the build machinery to create them programmatically, or even a second repo that generates signed payloads, but that's all overkill and would have slipped past human attention anyway, as the attack itself proved.
Ideally, a test environment and a build environment should be entirely isolated, in case the test code somehow modifies the source. Which, in this case, it did.
But should we trust it? No!! That's why we're here!
I'm not satisfied with the author's double-standard conclusion. "Trust, but verify" does not get some kind of hall pass for OSS "because open-source is clearly better."
Trust, but verify is independent of the license the coders choose.
And isn't source access a precondition of the "verify" step?
With closed-source software, you can (almost) _only_ trust.
The XZ backdoor was possible because people stick generated code (autoconf's output), which is totally impractical to audit, into the source tarballs.
In nixpkgs, all you have to do is add `autoreconfHook` to the `nativeBuildInputs` and all that stuff gets regenerated at build time. Sadly this is not the default behavior yet.
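For illustration, a minimal (hypothetical) derivation using that hook; the package name and source path are placeholders, not a real nixpkgs entry:

```nix
stdenv.mkDerivation {
  pname = "example";
  version = "1.0";
  src = ./example-1.0.tar.gz;  # placeholder source
  # Regenerate ./configure and friends from configure.ac at build time,
  # instead of trusting whatever generated output shipped in the tarball.
  nativeBuildInputs = [ autoreconfHook ];
}
```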
It was a pure fluke that it got discovered _this early_.
They weren't the first to find the cause (that was the person who took a deeper look because of the slowness), but the red flags had been raised.
This bends my brain a little. I get that they were written before git, but not before the advent of version control.
The packages uploaded in Debian are what matters and they are versioned.
The easiest way to verify that is by using a reproducible automated pipeline, as that moves the problem to "were the packaging files tampered with".
How do you verify the packaging files? By making them auditable by putting them in a git repository, and for example having the packager sign each commit. If a suspicious commit slips in, it'll be immediately obvious to anyone looking at the logs.
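As a toy sketch of that signing policy (synthetic repo, no real keys): `git verify-commit` exits non-zero for a commit that lacks a valid signature, so an automated check can refuse unsigned packaging commits outright.

```shell
#!/bin/sh
# Toy illustration: an unsigned commit fails verification, so a
# signed-commits-only policy would flag it. Repo and commit are synthetic.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git config user.email packager@example.org
git config user.name "Example Packager"
echo 'Source: demo' > control
git add control
git commit -q -m "packaging: initial upload"
# verify-commit exits non-zero when there is no valid signature
if git verify-commit HEAD 2>/dev/null; then
  echo "signature OK"
else
  echo "REJECT: commit is not signed"
fi
```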
git clone https://git.savannah.gnu.org/git/bash.git
git clone https://git.savannah.gnu.org/git/coreutils.git
Plug the repo name into https://savannah.gnu.org/git/?group=<REPO_NAME> to get a link to browse the repo. One entry from the bash log:

2025-07-03 | Bash-5.3 distribution sources and documentation | bash-5.3 | Chet Ramey | 896 files, -103357/+174007

You can of course come up with ways it could have been caught, but the code doesn't stand out as abnormal in context. That's all that really matters, unless your build system is already rigid enough to prevent it, and has no exploitable flaws you don't know about.
Finding a technical overview is annoyingly tricky, given all the non-technical blogspam that followed, but e.g. https://securelist.com/xz-backdoor-story-part-1/112354/ looks pretty good from a skim.
There's no good reason to have opaque, non-generated data in the repository, and it should certainly be a red flag going forward.
1. Build environments may not be adequately sandboxed. Some distributions are better than others here (Gentoo being an example of a better approach). The idea is that the package specification lists, up front, every file to be downloaded into a sandboxed build environment; scripts executed inside that environment then cannot reach any network interface, any filesystem location outside the sandbox, and so on. Within the build of a particular package, more advanced sandboxing may segregate test-suite resources from the code being built, so that a compromised test suite can't affect the built executables and compromised documentation resources can't be accessed during the build or the eventual execution of the software.
2. The open source community as a whole (and ultimately distribution package maintainers) is not alerted to, and does not apply caution to, unverified high entropy in source repositories. This is similar in concept to nothing-up-my-sleeve numbers.[1] Typical examples of unverified high entropy where a supply-chain attack can hide a payload: images, videos, archives, and PDF documents in test suites, or bundled with software as documentation or general resources (such as splash screens). It also includes IVs and example keys in code or comments, and s-boxes or similar high-entropy matrices where it may not be obvious to a human reviewer that the data is actually well-known and low-entropy (such as a standard AES s-box) rather than indistinguishable from attacker shellcode. Ideally, when a package maintainer commits a new package or an update, they would be alerted to unexplained high-entropy data entering the build sandbox and required to justify why it is OK.
[1] https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number
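One cheap heuristic for that kind of alert, assuming nothing beyond gzip (this is my own sketch, not an existing distro tool, and the file names are synthetic): flag files whose compressed size stays close to the original, since data that barely compresses is high-entropy.

```shell
#!/bin/sh
# Rough entropy heuristic: gzip each file and compare sizes.
# Incompressible data (random bytes, possible packed payloads) is flagged.
set -e
tmp=$(mktemp -d)
yes "all work and no play" | head -c 65536 > "$tmp/docs.txt"   # low entropy
head -c 65536 /dev/urandom > "$tmp/test-payload.bin"           # high entropy
for f in "$tmp"/*; do
  orig=$(wc -c < "$f")
  comp=$(gzip -c "$f" | wc -c)
  # flag files whose gzipped size is more than 90% of the original
  if [ $((comp * 100 / orig)) -gt 90 ]; then
    echo "SUSPICIOUS (high entropy): $(basename "$f")"
  else
    echo "ok: $(basename "$f")"
  fi
done
```

A real gate would whitelist known-good binary assets and explain-or-reject the rest; the 90% threshold here is arbitrary.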
A random person or group nobody has ever seen or knows submitted a backdoor.
2. Some people may want to remain pseudonymous for legitimate reasons.
The developers (at least the important ones) could register with the Debian project, just as they would with a company: submit identity and government documents, proof of physical address, bank account, credit card information, an IdP account, etc. It would operate like an organization.
The lead developers could meet and get to know each other through regular meetings: a kind of web of trust with in-person verification. Some projects already hold online meetings.
>we can only trust open source software. There is no way to audit closed source software
The ability to audit software is neither sufficient nor necessary for it to be trustworthy.
>systems of a closed source vendor was compromised, like Crowdstrike some weeks ago, we can’t audit anything
You can't audit open source vendors either.
Debian is the OS, and the OS vendor should decide on and modify the components it uses as a foundation to create the OS as it sees fit. That's what I'm choosing Debian for, and not some other OS.
> You can't audit open source vendors either.
What defines open source is that you can request the sources for audit and modification, so I think this statement is just untrue.
>you can request the sources
Organizations that open-source software can have closed-source infrastructure whose sources you can't request.
IIRC, this dependency isn't in upstream OpenSSH.
However, OpenSSH is open source with a non-restrictive license and as such, distributors (including Linux distributions) can modify it and distribute modified copies. Additionally, OpenSSH has a project goal that "Since telnet and rlogin are insecure, all operating systems should ship with support for the SSH protocol included." which encourages OS projects to include their software, with whatever modifications are (or are deemed) necessary.
Debian frequently modifies the software it packages, often for better overall integration, occasionally with negative security consequences. Adding something to OpenSSH to make it work better with systemd falls into both categories, I guess.
You can audit a lot of Debian's infrastructure - their build systems are a lot more transparent than the overwhelming majority of software vendors (which is not to say there isn't still room for improvement). You can also skip their prebuilt packages and build everything on your own systems, which of course you then have the ability to audit.
We have no clue who “Jia Tan” is, a name certain to be a pseudonym. Nobody has seen his face. He never provided ID to a HR department. He pays no taxes to a government that links these transactions to him. There is no way to hold his feet to the fire for misdeeds.
The open source ecosystem of tools and libraries is built by hundreds of thousands of contributors, most of whom are identified by nothing more than an email. Just a string of characters. For all we know, they’re hyper-intelligent aliens subtly corrupting our computer systems, preparing the planet for invasion! I mean… that’s facetious, but seriously… how would we know if it was or wasn’t the case!? We can’t!
We have a scenario where the only protection is peer review, but we’ve seen that fail systematically, over and over. Glaring errors get published in science journals all the time. Not just the XZ attack but also Heartbleed, an innocent error, happened because of a lack of adequate peer review.
I could waffle on about the psychology of “ownership” and how it mixes badly with anonymity and outside input, but I don’t want this to turn into War and Peace.
The point is that the fundamental issue from the “outside” looking in as a potential user is that things go wrong and then the perpetrators can’t be punished so there is virtually no disincentive to try again and again.
Jia Tan is almost certainly a state-sponsored attacker. A paid professional, whose job appears to be to infect open source with back doors. The XZ attack was very much a slow burn, a part-time effort. If he’s a full-time employee, how many more irons did he have in the fire? Dozens? Hundreds!?
What about his colleagues? Certainly he’s not the one and only such hacker! What about other countries doing the same with their own staff of hackers?
The popular thinking has been that “Microsoft bad, open source good”, but imagine Jia Tan trying to pull something like this off with the source of Windows Server! He’d have to get employed, work in a cubicle farm, and then if caught in the act, evade arrest!
That’s a scary difference.
You're making a distinction not between open source and proprietary software but rather between hobbyist and corporate software.
There are open source projects made by companies that allow no external contributions (SQLite, sort of; most of Google's and Amazon's OSS projects in practice; etc.).
There is proprietary software distributed with no name attached at all: practically every keygen and game crack, plus many indie games posted for free download on forums or 4chan.
> hobbyist and corporate software.
OpenSSL was maintained by like two guys in their spare time, and underpinned trillions of dollars worth of systems and secure transfers.
Would you categorise that as “hobbyist”?
The semantics matter, so I’m going to agree with you and clarify that my concern is with the risks associated with “effectively anonymous contributors allowed” software, where personal consequences for bad actors are near zero.
On the Venn diagram of software licenses and source accessibility, this “especially risky” category significantly overlaps FLOSS and has little overlap with most proprietary software products.
I personally had no bias or aversion to FLOSS software for either personal or professional use, but in all seriousness the XZ attack after the Heartbleed vulnerability made me reconsider my priors.
Versus… a random email offers to help, someone says “sure!”, and… that’s it. That’s the entire hurdle.
Google did discover a Chinese hacker on its payroll. That kind of thing does occur, but it’s rare.
It’s massively harder and more risky.
Frankly, tarballs are an embarrassing relic, and it's not the turbonormies that insist they're still fit for purpose. They don't know any better, they'll do what people like you tell them to do.