The issue for xz was that the build system was not hermetic (nor sufficiently audited).
Hermetic build environments that can’t fetch random assets are a pain to maintain in this era, but they are pretty crucial in stopping an attack of this kind. The other way is reproducible binaries, which are also very difficult.
EDIT: Well, either I responded to the wrong comment or this comment was entirely changed. I was replying to a comment that said “The issue was that people used pre-built binaries”, which is materially different from what the parent now says, though they rhyme.
However, for the sake of devil's advocacy, I do also want to point out that the first thing a lot of people used to do after downloading and extracting a source tarball was to run "./configure" without even looking at what it is they were executing - even people who (rightly) hate the "curl | bash" combo. You could be running anything.
Being able to verify what it is you're running is vitally important, but in the end it only makes a difference if people take the time to do so. (And running "./configure --help" doesn't count.)
Set up a mirror of all the repositories you care about, then configure the network so your build system can reach the mirrors but not the general Internet.
Of course, once you do this, you eventually create a cron job on mirrors to blindly update themselves...
This setup does at least prevent an old version of a dependency from silently changing, so projects that pin their dependencies can be confident in that. But even in those cases, you end up with a periodic "update all dependencies" ticket, that just blindly takes the new version.
I think your phrasing is a bit overbroad. There's nothing fundamentally broken with the build system fetching resources; what's broken is not verifying what it's fetching. Audit the package beforehand and have your build system verify its integrity after downloading, and you're fine.
(The ostensibly autotools-built files in the tarball did not correspond to the source repository, admittedly, but that’s another question, and I’m of two minds about that one. I know that’s not a popular take, but I believe Autotools has a point with its approach to source distributions.)
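The "verify its integrity after downloading" step can be sketched roughly like this. It is a minimal illustration, assuming the hash was recorded when the package was audited; `sha256_of` and `verify_artifact` are hypothetical helper names, not part of any particular build system.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, pinned_hex):
    """True only if the file matches the hash recorded at audit time (assumed input)."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())
```

A build step would call `verify_artifact` on every fetched file and abort on `False`. This only proves the bytes are the ones that were audited; it says nothing about whether the audit itself was any good.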
the exploit used the only solution for this problem: binary test payload. there's no other way to do it.
maybe including the source to those versions and all the build stuff to then create them programmatically... or maybe even a second repo that generates signed payloads etc... but it's all overkill and would have failed human attention as the attack proved to begin with.
Ideally a test env and a build env should be entirely isolated, in case the test code somehow modifies the source. Which in this case it did.
But should we trust it? No!! That's why we're here!
I'm not satisfied with the author's double-standard-conclusion. Trust, but verify does not have some kind of hall pass for OSS "because open-source is clearly better."
Trust, but verify is independent of the license the coders choose.
And certainly a condition of the "verify" step?
With closed-source software, you can (almost) _only_ trust.
The XZ backdoor was possible because people stick generated code (autoconf's output), which is totally impractical to audit, into the source tarballs.
In nixpkgs, all you have to do is add `autoreconfHook` to the `nativeBuildInputs` and all that stuff gets regenerated at build time. Sadly this is not the default behavior yet.
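One way to see which files exist only in a release tarball, or differ from the repository, is to hash both sides and compare. The following is a rough sketch under some assumptions: the tarball has a single top-level directory (hence `strip_components=1`), the checkout is at the matching tag, and the function names are made up for illustration.

```python
import hashlib
import os
import tarfile


def tree_hashes(root):
    """Map relative path -> SHA-256 for every file under root (skipping .git)."""
    hashes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                hashes[rel] = hashlib.sha256(fh.read()).hexdigest()
    return hashes


def tarball_hashes(tar_path, strip_components=1):
    """Map relative path -> SHA-256 for every regular file in the tarball."""
    hashes = {}
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Drop the leading "name-version/" directory (assumed layout).
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def diff_release(tar_path, checkout_dir):
    """Return (files only in the tarball, files whose contents differ)."""
    in_tar = tarball_hashes(tar_path)
    in_repo = tree_hashes(checkout_dir)
    extra = sorted(set(in_tar) - set(in_repo))
    changed = sorted(p for p in in_tar if p in in_repo and in_tar[p] != in_repo[p])
    return extra, changed
```

For an autotools project the "extra" list will be long even for an honest release (`configure`, `Makefile.in` and so on), which is exactly the auditing problem: the output is a list of things someone has to review or regenerate, not a verdict.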
It was a pure fluke that it got discovered _this early_.
They weren't the ones to find the cause first (that's the person who took a deeper look due to the slowness), but the red flags had been raised.
The error was related to the use of the frame pointer. Optimised code does not use RBP as the frame pointer, only using RSP for stack addresses. The XZ backdoor code assumed that the stack used this layout. The Red Hat regression tests use debug builds that do use the frame pointer. The result was the backdoor code writing below the bottom of the stack.
I suspect also that Valgrind is unique in finding issues like this. Other tools do not check all memory accesses before main. Valgrind loads and runs the test binary from the very beginning and thus it detected errors in the ifunc code used by XZ that executed very early on during ld.so loading and symbol resolution.
This bends my brain a little. I get that they were written before git, but not before the advent of version control.
The packages uploaded in Debian are what matters and they are versioned.
The easiest way to verify that is by using a reproducible automated pipeline, as that moves the problem to "were the packaging files tampered with".
How do you verify the packaging files? By making them auditable by putting them in a git repository, and for example having the packager sign each commit. If a suspicious commit slips in, it'll be immediately obvious to anyone looking at the logs.
Conversely, this is also an attack surface. It can be easy to just hit "accept" on automated pipeline updates.
New source for bash? Seems legit ... and the source built ... "yeah, ok."
Distros do not need to update packages on each and every upstream commit.
git clone https://git.savannah.gnu.org/git/bash.git
git clone https://git.savannah.gnu.org/git/coreutils.git
Plug the repo name into https://savannah.gnu.org/git/?group=<REPO_NAME> to get a link to browse the repo.
If that quote's about keeping Debian packaging in source control, I don't really see much benefit for packages like coreutils and bash that generally Just Work(TM) because they're high-quality and well-tested. Sign what you package up so you can detect tampering, but I don't see you really needing anything else.
2025-07-03 Bash-5.3 distribution sources and documentation bash-5.3 Chet Ramey 896 -103357/+174007
Here are the headlines for a couple of fix commits:
Bash-5.2 patch 12: fixes for compat mode leaving extglob enabled after command substitution
Bash-5.2 patch 1: fix crash with unset arrays in arithmetic contexts
It looks like discussion of the patches happens on the mailing list, which is easy to access from the page that brought you to the repo browser.
you can of course come up with ways it could have been caught, but the code doesn't stand out as abnormal in context. that's all that really matters, unless your build system is already rigid enough to prevent it, and has no exploitable flaws you don't know about.
finding a technical overview is annoyingly tricky, given all the non-technical blogspam after it, but e.g. https://securelist.com/xz-backdoor-story-part-1/112354/ looks pretty good from a skim.
There's no good reason to have opaque, non generated data in the repository and it should certainly be a red flag going forwards.
1. Build environments may not be adequately sandboxed. Some distributions are better than others (Gentoo being an example of a better approach). The idea is that the package specification specifies the full list of files to be downloaded initially into a sandboxed build environment, and scripts in that build environment when executed are not able to then access any network interfaces, filesystem locations outside the build environment, etc. Even within a build of a particular software package, more advanced sandboxing may segregate test suite resources from code that is built so that a compromise of the test suite can't impact built executables, or compromised documentation resources can't be accessed during build or eventual execution of the software.
2. The open source community as a whole (though ultimately this is in the hands of distribution package maintainers) is not being alerted to, and does not apply caution for, unverified high entropy in source repositories. This is similar in concept to nothing-up-my-sleeve numbers.[1] Typical examples of unverified high entropy where a supply chain attack can hide a payload: images, videos, archives, PDF documents etc. in test suites or bundled with software as documentation and/or general resources (such as splash screens in software). It may also include IVs/example keys in code or code comments, and s-boxes or similar matrices or arrays of high-entropy data where it may not be obvious to human reviewers that the effective entropy is low (as with the well-known AES s-box) rather than high and potentially indistinguishable from attacker shellcode. Ideally, when a package maintainer goes to commit a new package or package update, they would be alerted to unexplained high-entropy information that ends up in the build environment sandbox and be required to justify why it is OK.
[1] https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number
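The "alert on unexplained high entropy" idea could start from something as simple as a per-file Shannon entropy estimate. This is only a sketch: the 7.5 bits-per-byte threshold is an arbitrary assumption, and `flag_high_entropy` is a hypothetical helper, not an existing distro tool.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def flag_high_entropy(files, threshold=7.5):
    """Return names from a {name: bytes} mapping whose entropy exceeds threshold."""
    return sorted(name for name, blob in files.items() if shannon_entropy(blob) > threshold)
```

A byte-frequency measure like this cannot tell a compressed test fixture from an encrypted payload, and a table such as the AES s-box also scores at the maximum, so the output is a list of things a maintainer has to explain rather than a detector of malice.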
A random person or group nobody has ever seen or knows submitted a backdoor.
2. Some people may want to remain pseudonymous for legitimate reasons.
The developers (at least the important ones) could register with the Debian project, just like they would with a company: submit identity and government documents, proof of physical address, bank account, credit card information, IdP account, etc. It would operate like an organization.
The lead developers could meet and get to know each other through regular meetings. A kind of web of trust with in-person verification. There are already online meetings in some projects.
>we can only trust open source software. There is no way to audit closed source software
The ability to audit software is neither sufficient nor necessary for it to be trustworthy.
>systems of a closed source vendor was compromised, like Crowdstrike some weeks ago, we can’t audit anything
You can't audit open source vendors either.
Debian is the OS, and the OS vendor should decide and modify the components it uses as a foundation to create the OS as he desires. That's what I am choosing Debian for and not some other OS.
> You can't audit open source vendors either.
What defines open source, is that you can request the sources for audit and modification, so I think this statement is just untrue.
>you can request the sources
Organizations that open source software can have closed source infrastructure that you can't request.
> Organizations that open source software can have closed source infrastructure that you can't request.
Which can't be a source for the program binaries, so you can still audit them, you just can't rely on e.g. their proprietary test suite.
IIRC, this dependency isn't in upstream OpenSSH.
However, OpenSSH is open source with a non-restrictive license and as such, distributors (including Linux distributions) can modify it and distribute modified copies. Additionally, OpenSSH has a project goal that "Since telnet and rlogin are insecure, all operating systems should ship with support for the SSH protocol included." which encourages OS projects to include their software, with whatever modifications are (or are deemed) necessary.
Debian frequently modifies software it packages, often for better overall integration; occasionally with negative security consequences. Adding something to OpenSSH to work better with systemd is in both categories, I guess.
You can audit a lot of Debian's infrastructure - their build systems are a lot more transparent than the overwhelming majority of software vendors (which is not to say there isn't still room for improvement). You can also skip their prebuilt packages and build everything on your own systems, which of course you then have the ability to audit.
We have no clue who “Jia Tan” is, a name certain to be a pseudonym. Nobody has seen his face. He never provided ID to a HR department. He pays no taxes to a government that links these transactions to him. There is no way to hold his feet to the fire for misdeeds.
The open source ecosystem of tools and libraries is built by hundreds of thousands of contributors, most of whom are identified by nothing more than an email. Just a string of characters. For all we know, they’re hyper-intelligent aliens subtly corrupting our computer systems, preparing the planet for invasion! I mean… that’s facetious, but seriously… how would we know if it was or wasn’t the case!? We can’t!
We have a scenario where the only protection is peer review: but we’ve seen that fail over and over systematically. Glaring errors get published in science journals all of the time. Not just the XZ attack but also Heartbleed - an innocent error - occurred because of a lack of adequate peer review.
I could waffle on about the psychology of “ownership” and how it mixes badly with anonymity and outside input, but I don’t want this to turn into war and peace.
The point is that the fundamental issue from the “outside” looking in as a potential user is that things go wrong and then the perpetrators can’t be punished so there is virtually no disincentive to try again and again.
Jia Tan is almost certainly a state-sponsored attacker. A paid professional, whose job appears to be to infect open source with back doors. The XZ attack was very much a slow burn, a part time effort. If he’s a full time employee, how many more irons did he have in the fire! Dozens? Hundreds!?
What about his colleagues? Certainly he’s not the one and only such hacker! What about other countries doing the same with their own staff of hackers?
The popular thinking has been that “Microsoft bad, open source good”, but imagine Jia Tan trying to pull something like this off with the source of Windows Server! He’d have to get employed, work in a cubicle farm, and then if caught in the act, evade arrest!
That’s a scary difference.
You're making a distinction not between open source and proprietary software but rather between hobbyist and corporate software.
There are open source projects made by companies with no external contributions allowed (sqlite sorta, most of google and amazon's oss projects in practice etc)
There are proprietary software downloads with no name attached, like practically every keygen, game crack, many indie games posted for free download on forums or 4chan, etc etc.
> hobbyist and corporate software.
OpenSSL was maintained by like two guys in their spare time, and underpinned trillions of dollars worth of systems and secure transfers.
Would you categorise that as “hobbyist”?
The semantics matter, so I’m going to agree with you and clarify that my concern is with the risks associated with “effectively anonymous contributors allowed” software, where personal consequences for bad actors are near zero.
On the Venn diagram of software licenses and source accessibility, this “especially risky” category significantly overlaps FLOSS and has little overlap with most proprietary software products.
I personally had no bias or aversion to FLOSS software for either personal or professional use, but in all seriousness the XZ attack after the Heartbleed vulnerability made me reconsider my priors.
You pay for nginx plus? Oops, that uses openssl. F5 load balancers since you want to get even more proprietary and expensive? Some of those used OpenSSL too.
Microsoft IIS? Lemme tell you about the history of absolutely bafflingly bad vulnerabilities in that software, far worse than open source nginx ever had.
Effectively anonymous contributions are not what caused heartbleed, they're not what caused the vast majority of breaches and hacks into proprietary software companies nor the vast majority of vulnerabilities.
Bad code is what causes these bugs, and as far as I can tell, the easiest recipe to bad vulnerable code is to have a manager repeatedly tell an engineer "deliver this by friday or you're fired", which happens much less in free software projects.
I'm just trying to get a coherent idea of what you think the right thing to do here is.
How do I stay secure? What OS do I use that doesn't include a ton of open source components and reviews every line of code that goes into it? As far as I can tell, this has already excluded ChromeOS (based on open source packages, many imported without reading all the LoC), macOS (even worse, and an even greater history of vulnerabilities)... I guess windows is the best by this standard? But statistically it's also the most vulnerable, so it doesn't seem like this standard has gotten us to a logical conclusion, does it?
Also, Windows is just suspicious in general. It's slow, everything makes network requests. Finding malware in Windows is like finding a needle in a haystack. From some perspectives, it's all malware.
Versus… a random email offers to help, someone says “sure!”, and… that’s it. That’s the entire hurdle.
Google did discover a Chinese hacker working for them on the payroll. That kind of thing does occur, but it’s rare.
It’s massively harder and more risky.
There's no knowing how many backdoors were added by small network companies or contractors. But there's rarely accountability when it happens because the company would rather cover it up, or just not ask too many questions about that weird bug.
Possibly for any number of reasons. A sole maintainer with a bit too little capacity to keep up the development. A central role as a dependency for crucial packages in a couple of key distros.
What would be the connection between the backdoor (or indeed any supply chain security) and any design details of the xz file format? How would the backdoor have been avoided if the archive format were different?
Frankly, tarballs are an embarrassing relic, and it's not the turbonormies that insist they're still fit for purpose. They don't know any better, they'll do what people like you tell them to do.