Could the XZ backdoor have been detected with better Git/Deb packaging practices?

(optimizedbyotto.com)

120 points · ottoke · 7mo ago · 110 comments

acka · 7mo ago

I believe the XZ compromise partly stemmed from including binary files in what should have remained a source-only project. From what I remember, well-run projects such as those of the GNU project have always required that all binaries—whether executables or embedded data such as test files—be built directly from source, compiling a purpose-built DSL if necessary. This ensures transparency and reproducibility, both of which might have helped catch the issue earlier.

dijit · 7mo ago

That's not the issue; there will always be prebuilt binaries (hell, deb/rpm are prebuilt binaries).

The issue for xz was that the build system was not hermetic (nor sufficiently audited).

Hermetic build environments that can’t fetch random assets are a pain to maintain in this era, but are pretty crucial in stopping an attack of this kind. The other way is reproducible binaries, which is also very difficult.

EDIT: Well, either I responded to the wrong comment or this comment was entirely changed. I was replying to a comment that said “The issue was that people used pre-built binaries”, which is materially different from what the parent now says, though they rhyme.

jacquesm · 7mo ago

This is not going to be popular: I think the whole idea that a build system just fetches resources from outside of the build environment is fundamentally broken. It invites all kinds of trouble and makes it next to impossible to really achieve stability and to ensure that all code that is in the build has been verified. Because after you've done it four times, the fifth time you won't be looking closely. But if you don't do it automatically, but only when you actually need it, you will be looking a lot more sharpish at what has changed since you last pulled in the code. Especially for older and stable libraries, the consumers should dictate when they upgrade, not some automatic build process. But because we're all conditioned to download stuff because it may have solved some security issue, we stopped thinking about the security issues associated with just downloading stuff and dumping it into the build process.

Sophira · 7mo ago

I completely agree with you - I think that automatic downloading of dependencies when building is a bad idea.

However, for the sake of devil's advocacy, I do also want to point out that the first thing a lot of people used to do after downloading and extracting a source tarball was to run "./configure" without even looking at what it is they were executing - even people who (rightly) hate the "curl | bash" combo. You could be running anything.

Being able to verify what it is you're running is vitally important, but in the end it only makes a difference if people take the time to do so. (And running "./configure --help" doesn't count.)

gizmo686 · 7mo ago

The solution I've seen employed is to prevent the build environment from reaching outside.

Set up a mirror of all the repositories you care about, then configure the network so your build system can reach the mirrors but not the general Internet.

Of course, once you do this, you eventually create a cron job on mirrors to blindly update themselves...

This setup does at least prevent an old version of a dependency from silently changing, so projects that pin their dependencies can be confident in that. But even in those cases, you end up with a periodic "update all dependencies" ticket, that just blindly takes the new version.

kragen · 7mo ago

I am pretty sure Debian Policy agrees with you, although I can't cite chapter and verse. Certainly Nix and Guix agree with you. But that evidently wasn't the problem here.

dataflow · 7mo ago

> I think the whole idea that a build system just fetches resources from outside of the build environment is fundamentally broken

I think your phrasing is a bit overbroad. There's nothing fundamentally broken with the build system fetching resources; what's broken is not verifying what it's fetching. Audit the package beforehand and have your build system verify its integrity after downloading, and you're fine.
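
A minimal sketch of the "verify its integrity after downloading" step, assuming the expected SHA-256 was recorded when the release was audited. The names `sha256_of`, `verify_download` and `pinned_sha256` are illustrative and do not come from any real build tool:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, pinned_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest.

    pinned_sha256 is assumed to have been recorded at audit time.
    """
    actual = sha256_of(path)
    if actual != pinned_sha256.lower():
        raise ValueError(f"{path}: expected {pinned_sha256}, got {actual}")
```

A pinned hash only proves that the file is the one that was audited. It says nothing about whether the audit itself caught anything.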

mananaysiempre · 7mo ago

The XZ project’s build system is and was hermetic. The exploit was right there in the source tarball. It was just hidden away inside a checked-in binary file that masqueraded as a test for handling of invalid compressed files.

(The ostensibly autotools-built files in the tarball did not correspond to the source repository, admittedly, but that’s another question, and I’m of two minds about that one. I know that’s not a popular take, but I believe Autotools has a point with its approach to source distributions.)

dijit · 7mo ago

I thought that the exploit was not injected into the Git repository on GitHub at all, but only in the release tarballs. And that due to how Autoconf & co. work, it is common for tarballs of Autoconf projects to include extra files not in the Git repository (like the configure script). I thought the attacker exploited the fact that differences between the release tarball and the repository were not considered particularly suspicious by downstream redistributors in order to make the attack less discoverable.
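
The tarball-versus-repository gap described here can be made visible with a simple file-list comparison. The following is a rough sketch under stated assumptions: the tarball wraps everything in one top-level directory, and `repo_files` holds the output of something like `git ls-files` at the release tag. The function name is hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def files_only_in_tarball(tarball_path: str, repo_files: set[str]) -> list[str]:
    """List regular files present in the release tarball but not in the repo.

    Assumes the tarball has a single top-level directory (for example
    "project-1.0/"), which is stripped before comparing paths.
    """
    extras = []
    with tarfile.open(tarball_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            parts = PurePosixPath(member.name).parts[1:]  # drop top-level dir
            if not parts:
                continue
            rel = "/".join(parts)
            if rel not in repo_files:
                extras.append(rel)
    return sorted(extras)
```

For an Autoconf project this will also list legitimately generated files such as `configure`, so the output still needs human triage. It also does not catch files that exist in both places but differ in content.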

acka · 7mo ago

My apologies: yes, I edited my comment to try and clarify that I did not mean executable binaries, but rather binary data, such as the test files in the case of XZ.

dijit · 7mo ago

All good mate, your comment makes a better argument than the weaker one I interpreted it as prior to the edit.

1oooqooq · 7mo ago

how do you test your software can decompress files created with old/different implementations?

the exploit used the only solution for this problem: binary test payload. there's no other way to do it.

maybe including the source to those versions and all the build stuff to then create them programmatically... or maybe even a second repo that generates signed payloads etc... but it's all overkill and would have failed human attention, as the attack proved to begin with.

huflungdung · 7mo ago

This was a devops exploit because they used the same env for building the app as they did for the test code. Many miss this entirely and think it is because a binary was shipped.

Ideally a test env and a build env should be entirely isolated, in case the test code somehow modifies the source. Which in this case it did.
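
One way to picture that isolation is to build from a staged copy of the tree that simply does not contain the test data. This is only a sketch: `stage_build_tree` is a made-up helper, and the directory name `tests` is an assumption about the project layout.

```python
import shutil
from pathlib import Path


def stage_build_tree(source: Path, dest: Path, excluded=("tests",)) -> Path:
    """Copy the source tree to dest, leaving out test-only directories.

    Names in `excluded` are matched at any depth by shutil.ignore_patterns.
    dest must not exist yet, as required by shutil.copytree.
    """
    shutil.copytree(source, dest, ignore=shutil.ignore_patterns(*excluded))
    return dest
```

A build step that reaches for a test file would then either fail or find nothing, which at least surfaces the dependency. Tests would run afterwards against the built artifacts in a separate environment.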

1970-01-01 · 7mo ago

>Can we trust open source software? Yes — and I would argue that we can only trust open source software.

But should we trust it? No!! That's why we're here!

I'm not satisfied with the author's double-standard-conclusion. Trust, but verify does not have some kind of hall pass for OSS "because open-source is clearly better."

Trust, but verify is independent of the license the coders choose.

rcxdude · 7mo ago

Yes, I would say that being able to view the source code and build it yourself is a necessary but not sufficient condition of properly trusting the software. (which is not quite the same thing as it being open source, but it's relatively rare outside of being a very big customer that you can do this for non-open-source code).

1970-01-01 · 7mo ago

What does it matter if you are able to build it all by yourself if you still don't catch the compromised code? That's what is happening here in reality. OSS is now a layer of safety that is being leveraged into a layer of compromise. Caveat emptor!

normie3000 · 7mo ago

> I would say that being able to view the source code and build it yourself is a necessary but not sufficient condition of properly trusting the software.

And certainly a condition of the "verify" step?

With closed-source software, you can (almost) _only_ trust.

1718627440 · 7mo ago

When you get the source code as a big customer, that is open source. It might even be free software.

johnny22 · 7mo ago

many folks make a distinction between source available and open source.

octoberfranklin · 7mo ago

Yes of course, and nixpkgs (nixos) already does, although unfortunately not for this particular package.

The XZ backdoor was possible because people stick generated code (autoconf's output), which is totally impractical to audit, into the source tarballs.

In nixpkgs, all you have to do is add `autoreconfHook` to the `nativeBuildInputs` and all that stuff gets regenerated at build time. Sadly this is not the default behavior yet.

ape4 · 7mo ago

Wouldn't the next malware use a different way to embed itself

xmodem · 7mo ago

Why would they bother if we don't act on any of the learnings from this one?

citbl · 7mo ago

The next one probably won't be caught by running noticeably slower than usual.

It was a pure fluke that it got discovered _this early_.

ApolloFortyNine · 7mo ago

If I remember correctly, its days were numbered as soon as that Red Hat bug report on the Valgrind errors piling up was made.

They weren't the ones to find the cause first (that's the person who took a deeper look due to the slowness), but the red flags had been raised.

paulf38 · 7mo ago

Yes indeed. The backdoor author did try to claim that it was a false positive (and I’m sure that a very depressingly large number of people would happily go along with such a claim even without a scrap of evidence).

The error was related to the use of the frame pointer. Optimised code does not use RBP as the frame pointer, only using RSP for stack addresses. The XZ backdoor code assumed that the stack used this layout. The Red Hat regression tests use debug builds that do use the frame pointer. The result was the backdoor code writing below the bottom of the stack.

I suspect also that Valgrind is unique in finding issues like this. Other tools do not check all memory accesses before main. Valgrind loads and runs the test binary from the very beginning and thus it detected errors in the ifunc code used by XZ that executed very early on during ld.so loading and symbol resolution.

bluGill · 7mo ago

Maybe - but original ideas are hard, ideas without flaws are rare: there are reasonable odds someone will try this again.

flerchin · 7mo ago

> As of today only 93% of all Debian source packages are tracked in git on Debian’s GitLab instance at salsa.debian.org. Some key packages such as Coreutils and Bash are not using version control at all

This bends my brain a little. I get that they were written before git, but not before the advent of version control.

NewJazz · 7mo ago

Specifically the packaging is not in version control. The actual software is, but the Debian maintainer for whatever reason doesn't use source control for their packaging.

goodpoint · 7mo ago

The author is incorrect. Keeping the packaging files under git is done out of convenience but it does not help for security and reproducibility.

The packages uploaded in Debian are what matters and they are versioned.

crote · 7mo ago

And how are you supposed to verify that the right packages have been uploaded?

The easiest way to verify that is by using a reproducible automated pipeline, as that moves the problem to "were the packaging files tampered with".

How do you verify the packaging files? By making them auditable by putting them in a git repository, and for example having the packager sign each commit. If a suspicious commit slips in, it'll be immediately obvious to anyone looking at the logs.

imoverclocked · 7mo ago

> The easiest way to verify that is by using a reproducible automated pipeline

Conversely, this is also an attack surface. It can be easy to just hit "accept" on automated pipeline updates.

New source for bash? Seems legit ... and the source built ... "yeah, ok."

goodpoint · 7mo ago

Actually the uploads in Debian are signed and the build process is reproducible and audited.

Distros do not need to update packages on each and every upstream commit.

simoncion · 7mo ago

> I get that they were written before git, but not before the advent of version control.

  git clone https://git.savannah.gnu.org/git/bash.git
  git clone https://git.savannah.gnu.org/git/coreutils.git

Plug the repo name into https://savannah.gnu.org/git/?group=<REPO_NAME> to get a link to browse the repo.

ottoke (OP) · 7mo ago

This is the upstream source control. The article talks about the Debian packaging source not being in git (on e.g. salsa.debian.org).

simoncion · 7mo ago

Eh, I didn't bother to read TFA. So, it was ambiguous as to whether OP was talking about the projects or Debian's packages of the same. I figured it was more likely that OP was talking about the projects and proceeded accordingly.

If that quote's about keeping Debian packaging in source control, I don't really see much benefit for packages like coreutils and bash that generally Just Work(TM) because they're high-quality and well-tested. Sign what you package up so you can detect tampering, but I don't see you really needing anything else.

kryptiskt · 7mo ago

Look at the commit log in the bash repo. What good does it do if it notionally is version controlled if the commits look like this:

    2025-07-03 Bash-5.3 distribution sources and documentation bash-5.3 Chet Ramey 896 -103357/+174007

simoncion · 7mo ago

That looks to be the headline for the public release commit. If you'd bothered to look around for a full sixty seconds, you'd have found that the commits tagged with bash-5.3 and bash-5.2 follow that format.

Here are the headlines for a couple of fix commits:

  Bash-5.2 patch 12: fixes for compat mode leaving extglob enabled after command substitution
  Bash-5.2 patch 1: fix crash with unset arrays in arithmetic contexts

It looks like discussion of the patches happens on the mailing list, which is easy to access from the page that brought you to the repo browser.

oivey · 7mo ago

Ahh yes, if only the commit message was better. That would have stopped the xz attack.

typpilol · 7mo ago

Also why couldn't they start using it now?

ottoke (OP) · 7mo ago

How did the changes in the binary test files tests/files/bad-3-corrupt_lzma2.xz and tests/files/good-large_compressed.lzma (and the makefile change in m4/build-to-host.m4) manifest to the Debian maintainer? Was there a chance of noticing something odd?

Groxx · 7mo ago

mostly no, from my reading - it was a multi-stage chain of relatively normal looking things that added up to an exploit. helped by the tests involved using compressed data that wasn't human-readable.

you can of course come up with ways it could have been caught, but the code doesn't stand out as abnormal in context. that's all that really matters, unless your build system is already rigid enough to prevent it, and has no exploitable flaws you don't know about.

finding a technical overview is annoyingly tricky, given all the non-technical blogspam after it, but e.g. https://securelist.com/xz-backdoor-story-part-1/112354/ looks pretty good from a skim.

sanjams · 7mo ago

The article references a technical write-up: https://research.swtch.com/xz-script

Groxx · 7mo ago

ah, yes, this is one I remember seeing early on! thank you! I couldn't find much past the blogspam this time :/

XorNot · 7mo ago

Compression algorithms are deterministic over fixed data though (possibly with some effort).

There's no good reason to have opaque, non generated data in the repository and it should certainly be a red flag going forwards.
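
As a rough illustration of generating a "bad" test file from readable inputs at test time instead of committing an opaque blob, here is a sketch using Python's standard `lzma` module. `make_corrupt_xz` and its parameters are hypothetical, and this is not how the XZ test suite works:

```python
import lzma


def make_corrupt_xz(payload: bytes, flip_offset: int) -> bytes:
    """Compress payload to .xz, then invert every bit of one byte.

    Both inputs are human-readable, so a reviewer can see exactly how
    the "bad" file is derived.
    """
    good = lzma.compress(payload, format=lzma.FORMAT_XZ)
    if not 0 <= flip_offset < len(good):
        raise ValueError("flip_offset outside compressed stream")
    bad = bytearray(good)
    bad[flip_offset] ^= 0xFF
    return bytes(bad)
```

The compressed bytes may differ between liblzma versions, so byte-for-byte reproducibility across systems is not guaranteed (the "some effort" mentioned above). This also covers only simple corruption cases, not carefully hand-crafted streams.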

Groxx · 7mo ago

committed files with carefully crafted bad data is extremely common for testing how your code handles invalid data, especially with regression tests. and lzma absolutely needs to test itself against bad, possibly-malicious data.

secondcoming · 7mo ago

There are tons of reasons to have hand-crafted data in a repository.

dhx · 7mo ago

There are a few obvious gaps, seemingly still unsolved today:

1. Build environments may not be adequately sandboxed. Some distributions are better than others (Gentoo being an example of a better approach). The idea is that the package specification specifies the full list of files to be downloaded initially into a sandboxed build environment, and scripts in that build environment when executed are not able to then access any network interfaces, filesystem locations outside the build environment, etc. Even within a build of a particular software package, more advanced sandboxing may segregate test suite resources from code that is built so that a compromise of the test suite can't impact built executables, or compromised documentation resources can't be accessed during build or eventual execution of the software.

2. The open source community as a whole (though ultimately this is in the hands of distribution package maintainers) is not being alerted to, and is not applying caution toward, unverified high entropy in source repositories. This is similar in concept to nothing-up-my-sleeve numbers.[1] Typical examples of unverified high entropy where a supply chain attack can hide a payload: images, videos, archives, PDF documents etc. in test suites or bundled with software as documentation and/or general resources (such as splash screens in software). It may also include IVs/example keys in code or code comments, and s-boxes or similar matrices or arrays of high-entropy data, where it may not be obvious to human reviewers that the entropy is effectively low (such as a well-known AES s-box) rather than high and potentially indistinguishable from attacker shellcode. Ideally, when a package maintainer goes to commit a new package or package update, shouldn't they be alerted to unexplained high-entropy information that ends up in the build environment sandbox, and be required to justify why this is OK?

[1] https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number
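
The kind of alert described in point 2 could start from a plain byte-entropy measurement. The following is a minimal sketch, not an existing packaging tool; the 7.5 bits-per-byte threshold is an arbitrary illustrative cut-off.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 up to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_high_entropy(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose byte distribution is close to uniform.

    The default threshold is an assumption chosen for illustration.
    """
    return shannon_entropy(data) >= threshold
```

Compressed or encrypted test files will legitimately trip such a check, so a hit is a prompt for the maintainer to justify the file rather than a verdict. Short tables such as an s-box can also score high without being suspicious.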

aborsy · 7mo ago

Couldn’t submission to Debian be made possible only under real identities, so that people take responsibility for what they submit?

A random person or group nobody has ever seen or knows submitted a backdoor.

0rdinal · 7mo ago

1. How could Debian effectively verify an identity?

2. Some people may want to remain pseudonymous for legitimate reasons.

aborsy · 7mo ago

It’s not straightforward.

The developers (at least the important ones) could register with the Debian project, just like they would with a company: submit identity and government documents, proof of physical address, bank account, credit card information, IdP account, etc. It would operate like an organization.

The lead developers could meet and get to know each other through regular meetings. Kind of a web of trust with in-person verification. There are already online meetings in some projects.

sega_sai · 7mo ago

From reading this, it seems that one thing one can do is to force separation of the build from testing, so the build never has access to binary code that can be injected.

charcircuit · 7mo ago

It shouldn't have happened in the first place. OpenSSH should control their exact dependencies and Debian shouldn't be meddling with them and swapping them out, loading random code into OpenSSH's process.

>we can only trust open source software. There is no way to audit closed source software

The ability to audit software is neither sufficient nor necessary for it to be trustworthy.

>systems of a closed source vendor was compromised, like Crowdstrike some weeks ago, we can’t audit anything

You can't audit open source vendors either.

1718627440 · 7mo ago

> Debian shouldn't be meddling with them

Debian is the OS, and the OS vendor should decide and modify the components it uses as a foundation to create the OS as he desires. That's what I am choosing Debian for and not some other OS.

> You can't audit open source vendors either.

What defines open source, is that you can request the sources for audit and modification, so I think this statement is just untrue.

charcircuit · 7mo ago

If Debian wants to improve or modify OpenSSH and put their own code in, they should rename it and stop using the name of the project. Debian's actions created reputational damage by introducing a backdoor into someone else's product without clearly informing the consumer that they did so.

>you can request the sources

Organizations that open source software can have closed source infrastructure that you can't request.

1718627440 · 7mo ago

Debian is famous for modifying all programs it ships, it is more the rule than the exception. That's the deal I get when choosing Debian. SSH is more of a protocol, than a trademarked program.

> Organizations that open source software can have closed source infrastructure that you can't request.

Which can't be a source for the program binaries, so you can still audit them, you just can't rely on e.g. their proprietary test suite.

toast0 · 7mo ago

> It shouldn't have happened in the first place. OpenSSH should control their exact dependencies and Debian shouldn't be meddling with them and swapping them out, loading random code into OpenSSH's process.

IIRC, this dependency isn't in upstream OpenSSH.

However, OpenSSH is open source with a non-restrictive license and as such, distributors (including Linux distributions) can modify it and distribute modified copies. Additionally, OpenSSH has a project goal that "Since telnet and rlogin are insecure, all operating systems should ship with support for the SSH protocol included." which encourages OS projects to include their software, with whatever modifications are (or are deemed) necessary.

Debian frequently modifies software it packages, often for better overall integration; occasionally with negative security consequences. Adding something to OpenSSH to work better with systemd is in both categories, I guess.

lmm · 7mo ago

> You can't audit open source vendors either.

You can audit a lot of Debian's infrastructure - their build systems are a lot more transparent than the overwhelming majority of software vendors (which is not to say there isn't still room for improvement). You can also skip their prebuilt packages and build everything on your own systems, which of course you then have the ability to audit.

IshKebab · 7mo ago

That's really incidental. There are a gazillion vectors for exploitation once you control a package like xz. You can't fix this issue by plugging them one by one.

jiggawatts · 7mo ago

Something that the XZ back door made me realise is that the fundamental difference between proprietary and open source software is not the price or source availability for most of its users — no not developers! — it is the reputation and protected brand of the former and the anonymity of the latter.

We have no clue who “Jia Tan” is, a name certain to be a pseudonym. Nobody has seen his face. He never provided ID to a HR department. He pays no taxes to a government that links these transactions to him. There is no way to hold his feet to the fire for misdeeds.

The open source ecosystem of tools and libraries is built by hundreds of thousands of contributors, most of whom are identified by nothing more than an email. Just a string of characters. For all we know, they’re hyper-intelligent aliens subtly corrupting our computer systems, preparing the planet for invasion! I mean… that’s facetious, but seriously… how would we know if it was or wasn’t the case!? We can’t!

We have a scenario where the only protection is peer review: but we’ve seen that fail over and over systematically. Glaring errors get published in science journals all of the time. Not just the XZ attack but also Heartbleed - an innocent error - occurred because of a lack of adequate peer review.

I could waffle on about the psychology of “ownership” and how it mixes badly with anonymity and outside input, but I don’t want this to turn into war and peace.

The point is that the fundamental issue from the “outside” looking in as a potential user is that things go wrong and then the perpetrators can’t be punished so there is virtually no disincentive to try again and again.

Jia Tan is almost certainly a state-sponsored attacker. A paid professional, whose job appears to be to infect open source with back doors. The XZ attack was very much a slow burn, a part-time effort. If he’s a full-time employee, how many more irons did he have in the fire? Dozens? Hundreds!?

What about his colleagues? Certainly he’s not the one and only such hacker! What about other countries doing the same with their own staff of hackers?

The popular thinking has been that “Microsoft bad, open source good”, but imagine Jia Tan trying to pull something like this off with the source of Windows Server! He’d have to get employed, work in a cubicle farm, and then if caught in the act, evade arrest!

That’s a scary difference.

TheDong · 7mo ago

> Something that the XZ back door made me realise is that the fundamental difference between proprietary and open source software is not the price or source availability for most of its users — no not developers! - it is the reputation and protected brand of the former and the anonymity of the latter.

You're making a distinction not between open source and proprietary software but rather between hobbyist and corporate software.

There are open source projects made by companies with no external contributions allowed (sqlite sorta, most of google and amazon's oss projects in practice etc)

There are proprietary software downloads with no name attached, like practically every keygen, game crack, many indie games posted for free download on forums or 4chan, etc etc.

jiggawatts · 7mo ago

Some fair points, but:

> hobbyist and corporate software.

OpenSSL was maintained by like two guys in their spare time, and underpinned trillions of dollars worth of systems and secure transfers.

Would you categorise that as “hobbyist”?

The semantics matter, so I’m going to agree with you and clarify that my concern is with the risks associated with “effectively anonymous contributors allowed” software, where personal consequences for bad actors are near zero.

On the Venn diagram of software licenses and source accessibility, this “especially risky” category significantly overlaps FLOSS and has little overlap with most proprietary software products.

I personally had no bias or aversion to FLOSS software for either personal or professional use, but in all seriousness the XZ attack after the Heartbleed vulnerability made me reconsider my priors.

TheDong · 7mo ago

Okay, so you won't use OpenSSL because it's not proprietary enough. What do you use instead?

You pay for nginx plus? Oops, that uses openssl. F5 load balancers since you want to get even more proprietary and expensive? Some of those used OpenSSL too.

Microsoft IIS? Lemme tell you about the history of absolutely bafflingly bad vulnerabilities in that software, far worse than open source nginx ever had.

Effectively anonymous contributions are not what caused heartbleed, they're not what caused the vast majority of breaches and hacks into proprietary software companies nor the vast majority of vulnerabilities.

Bad code is what causes these bugs, and as far as I can tell, the easiest recipe to bad vulnerable code is to have a manager repeatedly tell an engineer "deliver this by friday or you're fired", which happens much less in free software projects.

I'm just trying to get a coherent idea of what you think the right thing to do here is.

How do I stay secure? What OS do I use that doesn't include a ton of open source components and reviews every line of code that goes into it? As far as I can tell, this has already excluded ChromeOS (based on open source packages, many imported without reading all the LoC), macOS (even worse, and an even greater history of vulnerabilities)... I guess windows is the best by this standard? But statistically it's also the most vulnerable, so it doesn't seem like this standard has gotten us to a logical conclusion, does it?

kragen · 7mo ago

I think the difference is that the undoubtedly numerous times that this has happened with Microsoft and other proprietary-software vendors, the users weren't in a position to find out.

tedunangst · 7mo ago

Why not? This wasn't found by source review. The computer was slow, somebody looked into why. The bug was discovered via analysis of binary artifacts, and only then traced back to the source. Bruce Dawson does this all the time on Windows.

https://randomascii.wordpress.com/category/uiforetw-2/

array_key_first · 7mo ago

Proprietary software typically does everything within its power to stop you introspecting it.

Also, Windows is just suspicious in general. It's slow, and everything makes network requests. Finding malware in Windows is a needle in a haystack. From some perspectives, it's all malware.

uecker · 7mo ago

It is difficult to find out why Windows is slow again. My colleagues using Windows complain about it regularly, but not even one has ever started an investigation into whether there might be a backdoor, because this would be hopeless. With open source it is feasible.

jiggawatts · 7mo ago

Okay, how could someone like Jia Tan sneak code into a codebase where commits can only be made by authenticated users with staff accounts on a private network?

Versus… a random email offers to help, someone says “sure!”, and… that’s it. That’s the entire hurdle.

Google did discover a Chinese hacker working for them on the payroll. That kind of thing does occur, but it’s rare.

It’s massively harder and more risky.

schuyler2d · 7mo ago

Well, xz is a rare event too.

There's no knowing how many backdoors were added by small network companies or contractors. But there's rarely accountability when it happens, because the company would rather cover it up, or just not ask too many questions about that weird bug.

dessimus · 7mo ago

Something like this has happened in the proprietary world: the SolarWinds supply chain attack. IIRC, they were releasing breached versions for about a year, and I think it became known only when the US Government came knocking on SolarWinds' door. SolarWinds potentially vetting every employee through HR had zero effect on preventing a supply chain attack.

Tuhbrook · 7mo ago

Dude, what did you smoke? Or are you a state actor yourself? 3 sentences in and there is zero logic, 100% fear mongering in your text.

jiggawatts · 7mo ago

Says the anonymous green name, missing the point entirely.

jart · 7mo ago

Folks have been ringing the alarm bell for a decade. https://www.nongnu.org/lzip/xz_inadequate.html xz is insane because it appears to be one of the most legitimately dangerous compression formats with the potential to gigafry your data but is exclusively used by literal turbonormies who unironically want to like "shave off a few kilobytes" and basically get oneshotted by it.

Delk · 7mo ago

The question of whether the xz format is a good choice for long-term archival is entirely unrelated to backdoors or open source supply chain security.

jart · 7mo ago

No they're the same. Why do you think xz was targeted? It's a giant slippery hairball.

Delk · 7mo ago

> Why do you think xz was targeted?

Possibly for any number of reasons. A sole maintainer with a bit too little capacity to keep up the development. A central role as a dependency for crucial packages in a couple of key distros.

What would be the connection between the backdoor (or indeed any supply chain security) and any design details of the xz file format? How would the backdoor have been avoided if the archive format were different?

tredre3 · 7mo ago

Turbonormies, as you say, tend to use gzip not xz. Which is sad because gzip is just as bad for archiving. A few bytes changed and your entire file is lost (in a .tar.gz it means everything is lost).

Frankly, tarballs are an embarrassing relic, and it's not the turbonormies that insist they're still fit for purpose. They don't know any better, they'll do what people like you tell them to do.

j / k navigate · click thread line to collapse

110 comments

jacquesm7mo ago

Sophira7mo ago

I completely agree with you - I think that automatic downloading of dependencies when building is a bad idea.

Being able to verify what it is you're running is vitally important, but in the end it only makes a difference if people take the time to do so. (And running "./configure --help" doesn't count.)

gizmo6867mo ago

The solution I've seen employed is to prevent the build environment from reaching outside.

Set up a mirror of all the repositories you care about, then configure the network so your build system can reach the mirrors but not the general Internet.

Of course, once you do this, you eventually create a cron job on mirrors to blindly update themselves...
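
The policy described above (mirrors reachable, everything else blocked) can be sketched as an allowlist check. This is only an illustration: the host names and the `fetch_allowed` helper are hypothetical, and a real setup would enforce the rule in the firewall or build sandbox rather than in application code.

```python
from urllib.parse import urlparse

# Hypothetical internal mirror hosts, used purely for illustration.
ALLOWED_HOSTS = {"deb-mirror.internal.example", "git-mirror.internal.example"}


def fetch_allowed(url: str) -> bool:
    """Return True if `url` points at one of the allowlisted mirror hosts."""
    host = urlparse(url).hostname  # None when the string has no network location
    return host is not None and host in ALLOWED_HOSTS


if __name__ == "__main__":
    print(fetch_allowed("https://deb-mirror.internal.example/debian/pool/x.deb"))
    print(fetch_allowed("https://example.org/release.tar.gz"))
```

The check compares the full host name, so a look-alike such as `deb-mirror.internal.example.attacker.example` is rejected. It does nothing about the cron job problem mentioned above: a mirror that blindly syncs still imports whatever upstream publishes.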

kragen7mo ago

I am pretty sure Debian Policy agrees with you, although I can't cite chapter and verse. Certainly Nix and Guix agree with you. But that evidently wasn't the problem here.

dataflow7mo ago

> I think the whole idea that a build system just fetches resources from outside of the build environment is fundamentally broken

mananaysiempre7mo ago

dijit7mo ago

acka7mo ago

My apologies: yes, I edited my comment to try and clarify that I did not mean executable binaries, but rather binary data, such as the test files in the case of XZ.

dijit7mo ago

All good mate, your comment makes a better argument than the weaker one I interpreted it as prior to the edit.

1oooqooq7mo ago

how do you test your software can decompress files created with old/different implementations?

the exploit used the only solution for this problem: binary test payload. there's no other way to do it.

huflungdung7mo ago

This was a devops exploit because they used the same env for building the app as they did for the test code. Many miss this entirely and think it is because a binary was shipped.

Ideally a test env and a build env should be entirely isolated, in case the test code somehow modifies the source. Which in this case it did.
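
One way to picture that isolation is to give the build step a copy of the tree that contains no test material at all. The sketch below assumes test fixtures live in a directory called `tests`; the function name and layout are hypothetical, and a real pipeline would more likely use separate containers or sandboxes.

```python
import shutil
import tempfile
from pathlib import Path


def stage_build_tree(source: Path, staging: Path, test_dirs=("tests",)) -> Path:
    """Copy `source` into `staging`, leaving out anything named in `test_dirs`."""
    target = staging / "build-input"
    # ignore_patterns skips matching names at every level of the tree
    shutil.copytree(source, target, ignore=shutil.ignore_patterns(*test_dirs))
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "project"
        (project / "src").mkdir(parents=True)
        (project / "tests").mkdir()
        (project / "src" / "main.c").write_text("int main(void) { return 0; }\n")
        (project / "tests" / "fixture.bin").write_bytes(b"\x00\x01")
        staged = stage_build_tree(project, Path(tmp) / "staging")
        print(sorted(p.name for p in staged.iterdir()))
```

The build then runs only against the staged copy, and the tests run later against the finished artifacts, so a build script has no fixture files to read from.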

1970-01-017mo ago

>Can we trust open source software? Yes — and I would argue that we can only trust open source software.

But should we trust it? No!! That's why we're here!

I'm not satisfied with the author's double-standard conclusion. "Trust, but verify" does not come with some kind of hall pass for OSS "because open-source is clearly better."

"Trust, but verify" is independent of the license the coders choose.

rcxdude7mo ago

1970-01-017mo ago

normie30007mo ago

> I would say that being able to view the source code and build it yourself is a necessary but not sufficient condition of properly trusting the software.

And certainly a condition of the "verify" step?

With closed-source software, you can (almost) _only_ trust.

17186274407mo ago

When you get the source code as a big customer, that is open source. It might even be free software.

johnny227mo ago

many folks make a distinction between source available and open source.

octoberfranklin7mo ago

Yes of course, and nixpkgs (nixos) already does, although unfortunately not for this particular package.

The XZ backdoor was possible because people stick generated code (autoconf's output), which is totally impractical to audit, into the source tarballs.

In nixpkgs, all you have to do is add `autoreconfHook` to the `nativeBuildInputs` and all that stuff gets regenerated at build time. Sadly this is not the default behavior yet.
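
A complementary check is to list what a release tarball contains beyond the version-controlled tree, since generated files such as autoconf output show up exactly there. The sketch below is an assumption-laden illustration: `tracked` stands in for the output of `git ls-files` at the release tag, `prefix` is the top-level directory of the tarball, and the function name is hypothetical.

```python
import io
import tarfile


def files_only_in_tarball(tarball: bytes, tracked: set[str], prefix: str) -> list[str]:
    """List regular files in the release tarball that are not in `tracked`."""
    extras = []
    with tarfile.open(fileobj=io.BytesIO(tarball), mode="r:*") as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, symlinks and other special entries
            name = member.name.removeprefix(prefix)
            if name not in tracked:
                extras.append(name)
    return sorted(extras)


if __name__ == "__main__":
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in ("pkg-1.0/configure.ac", "pkg-1.0/configure"):
            info = tarfile.TarInfo(name)
            info.size = 1
            tar.addfile(info, io.BytesIO(b"x"))
    print(files_only_in_tarball(buf.getvalue(), {"configure.ac"}, "pkg-1.0/"))
```

Every file this reports is something a reviewer of the repository never saw. Regenerating those files at build time shrinks that list, which is the point of the `autoreconfHook` approach above.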

ape47mo ago

Wouldn't the next malware use a different way to embed itself?

xmodem7mo ago

Why would they bother if we don't act on any of the learnings from this one?

citbl7mo ago

The next one probably won't be caught by running noticeably slower than usual.

It was a pure fluke that it got discovered _this early_.

ApolloFortyNine7mo ago

If I remember correctly, its days were numbered as soon as that Red Hat bug report on the valgrind errors piling up was made.

They weren't the ones to find the cause first (that's the person who took a deeper look due to the slowness), but the red flags had been raised.

paulf387mo ago

bluGill7mo ago

Maybe - but original ideas are hard, ideas without flaws are rare: there are reasonable odds someone will try this again.

flerchin7mo ago

This bends my brain a little. I get that they were written before git, but not before the advent of version control.

NewJazz7mo ago

Specifically the packaging is not in version control. The actual software is, but the Debian maintainer for whatever reason doesn't use source control for their packaging.

goodpoint7mo ago

The author is incorrect. Keeping the packaging files under git is done out of convenience but it does not help for security and reproducibility.

The packages uploaded in Debian are what matters and they are versioned.

crote7mo ago

And how are you supposed to verify that the right packages have been uploaded?

The easiest way to verify that is by using a reproducible automated pipeline, as that moves the problem to "were the packaging files tampered with".
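
With a reproducible build, "was the right package uploaded" reduces to comparing digests of the uploaded artifact and an independent rebuild. A minimal sketch, assuming both artifacts are available as bytes; the function names are hypothetical and this is not how any particular distro implements the check.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def matches_rebuild(uploaded: bytes, rebuilt: bytes) -> bool:
    """True if the uploaded artifact is bit-for-bit identical to the rebuild."""
    return sha256_hex(uploaded) == sha256_hex(rebuilt)


if __name__ == "__main__":
    print(matches_rebuild(b"package contents", b"package contents"))
    print(matches_rebuild(b"package contents", b"package contents + extra"))
```

The comparison only means something if the build is deterministic. Otherwise timestamps or build paths make honest rebuilds differ as well, and a mismatch stops being a useful signal.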

imoverclocked7mo ago

> The easiest way to verify that is by using a reproducible automated pipeline

Conversely, this is also an attack surface. It can be easy to just hit "accept" on automated pipeline updates.

New source for bash? Seems legit ... and the source built ... "yeah, ok."

goodpoint7mo ago

Actually the uploads in Debian are signed and the build process is reproducible and audited.

Distros do not need to update packages on each and every upstream commit.

simoncion7mo ago

> I get that they were written before git, but not before the advent of version control.

  git clone https://git.savannah.gnu.org/git/bash.git
  git clone https://git.savannah.gnu.org/git/coreutils.git

Plug the repo name into https://savannah.gnu.org/git/?group=<REPO_NAME> to get a link to browse the repo.

ottokeOP7mo ago

This is the upstream source control. The article talks about the Debian packaging source not being in git (on e.g. salsa.debian.org).

simoncion7mo ago

kryptiskt7mo ago

Look at the commit log in the bash repo. What good does it do if it notionally is version controlled if the commits look like this:

    2025-07-03 Bash-5.3 distribution sources and documentation bash-5.3 Chet Ramey 896 -103357/+174007

simoncion7mo ago

Here are the headlines for a couple of fix commits:

  Bash-5.2 patch 12: fixes for compat mode leaving extglob enabled after command substitution
  Bash-5.2 patch 1: fix crash with unset arrays in arithmetic contexts

It looks like discussion of the patches happens on the mailing list, which is easy to access from the page that brought you to the repo browser.

oivey7mo ago

Ahh yes, if only the commit message was better. That would have stopped the xz attack.

typpilol7mo ago

Also why couldn't they start using it now?

ottokeOP7mo ago

Groxx7mo ago

finding a technical overview is annoyingly tricky, given all the non-technical blogspam after it, but e.g. https://securelist.com/xz-backdoor-story-part-1/112354/ looks pretty good from a skim.

sanjams7mo ago

The article references a technical write-up: https://research.swtch.com/xz-script

Groxx7mo ago

ah, yes, this is one I remember seeing early on! thank you! I couldn't find much past the blogspam this time :/

XorNot7mo ago

Compression algorithms are deterministic over fixed data though (possibly with some effort).

There's no good reason to have opaque, non generated data in the repository and it should certainly be a red flag going forwards.
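
Generating fixtures instead of committing them could look roughly like the sketch below, which uses Python's standard `lzma` module. The plaintext, the corruption recipe (flip one bit in the middle) and all function names are hypothetical. Byte-for-byte identical output across different liblzma versions is an assumption, which is the "some effort" part.

```python
import lzma

# Readable, reviewable input that the fixtures are derived from.
PLAINTEXT = b"".join(b"line %d of the fixture\n" % i for i in range(2000))


def good_fixture() -> bytes:
    """A valid .xz stream built from PLAINTEXT with fixed settings."""
    return lzma.compress(PLAINTEXT, format=lzma.FORMAT_XZ, preset=6)


def corrupt_fixture() -> bytes:
    """The same stream with one documented bit flipped in the middle."""
    blob = bytearray(good_fixture())
    blob[len(blob) // 2] ^= 0x01
    return bytes(blob)


def is_rejected(blob: bytes) -> bool:
    """True if the decoder refuses the stream."""
    try:
        lzma.decompress(blob)
    except lzma.LZMAError:
        return True
    return False


if __name__ == "__main__":
    print(is_rejected(good_fixture()), is_rejected(corrupt_fixture()))
```

The repository then carries a short recipe that can be reviewed line by line rather than an opaque blob. It does not cover files produced by other or older implementations, which is the objection raised elsewhere in the thread.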

Groxx7mo ago

secondcoming7mo ago

There are tons of reasons to have hand-crafted data in a repository.

dhx7mo ago

There's a few obvious gaps, seemingly still unsolved today:

[1] https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number

aborsy7mo ago

Couldn’t submission to Debian be possible only under real identities, so that people take responsibility for what they submit?

A random person or group nobody has ever seen or knows submitted a backdoor.

0rdinal7mo ago

1. How could Debian effectively verify an identity?

2. Some people may want to remain pseudonymous for legitimate reasons.

aborsy7mo ago

It’s not straightforward.

The lead developers could meet and get to know each other through regular meetings. Kind of a web of trust with in-person verification. There are already online meetings in some projects.

sega_sai7mo ago

From reading this, it seems that one thing one can do is to force separation of the build from testing, so the build never has access to binary code that can be injected.

charcircuit7mo ago

>we can only trust open source software. There is no way to audit closed source software

The ability to audit software is neither sufficient nor necessary for it to be trustworthy.

>systems of a closed source vendor was compromised, like Crowdstrike some weeks ago, we can’t audit anything

You can't audit open source vendors either.

17186274407mo ago

> Debian shouldn't be meddling with them

Debian is the OS, and the OS vendor should decide on and modify the components it uses as a foundation to create the OS as it sees fit. That's what I am choosing Debian for and not some other OS.

> You can't audit open source vendors either.

What defines open source, is that you can request the sources for audit and modification, so I think this statement is just untrue.

charcircuit7mo ago

>you can request the sources

Organizations that open-source software can have closed-source infrastructure that you can't request.

17186274407mo ago

Debian is famous for modifying all programs it ships, it is more the rule than the exception. That's the deal I get when choosing Debian. SSH is more of a protocol, than a trademarked program.

> Organizations that open-source software can have closed-source infrastructure that you can't request.

Which can't be a source for the program binaries, so you can still audit them, you just can't rely on e.g. their proprietary test suite.

toast07mo ago

IIRC, this dependency isn't in upstream OpenSSH.

lmm7mo ago

> You can't audit open source vendors either.

IshKebab7mo ago

That's really incidental. There are a gazillion vectors for exploitation once you control a package like xz. You can't fix this issue by plugging them one by one.

jiggawatts7mo ago

I could waffle on about the psychology of “ownership” and how it mixes badly with anonymity and outside input, but I don’t want this to turn into war and peace.

What about his colleagues? Certainly he’s not the one and only such hacker! What about other countries doing the same with their own staff of hackers?

That’s a scary difference.

TheDong7mo ago

You're making a distinction not between open source and proprietary software but rather between hobbyist and corporate software.

There are open source projects made by companies with no external contributions allowed (sqlite sorta, most of google and amazon's oss projects in practice etc)

There are proprietary software downloads with no name attached, like practically every keygen, game crack, many indie games posted for free download on forums or 4chan, etc etc.

jiggawatts7mo ago

Some fair points, but:

> hobbyist and corporate software.

OpenSSL was maintained by like two guys in their spare time, and underpinned trillions of dollars worth of systems and secure transfers.

Would you categorise that as “hobbyist”?

On the Venn diagram of software licenses and source accessibility, this “especially risky” category significantly overlaps FLOSS and has little overlap with most proprietary software products.

I personally had no bias or aversion to FLOSS software for either personal or professional use, but in all seriousness the XZ attack after the Heartbleed vulnerability made me reconsider my priors.

TheDong7mo ago

Okay, so you won't use OpenSSL because it's not proprietary enough. What do you use instead?

You pay for nginx plus? Oops, that uses openssl. F5 load balancers since you want to get even more proprietary and expensive? Some of those used OpenSSL too.

Microsoft IIS? Lemme tell you about the history of absolutely bafflingly bad vulnerabilities in that software, far worse than open source nginx ever had.

I'm just trying to get a coherent idea of what you think the right thing to do here is.

kragen7mo ago

I think the difference is that, the undoubtedly numerous times this has happened with Microsoft and other proprietary-software vendors, the users weren't in a position to find out.

tedunangst7mo ago

https://randomascii.wordpress.com/category/uiforetw-2/

array_key_first7mo ago

Proprietary software typically does everything within its power to stop you introspecting it.

Also, Windows is just suspicious in general. It's slow, everything makes network requests. Finding malware in Windows is a needle in a haystack. From some perspectives, it's all malware.

uecker7mo ago

jiggawatts7mo ago

Okay, how could someone like Jia Tan sneak code into a codebase where commits can only be made by authenticated users with staff accounts on a private network?

Versus… a random email offers to help, someone says “sure!”, and… that’s it. That’s the entire hurdle.

Google did discover a Chinese hacker working for them on the payroll. That kind of thing does occur, but it’s rare.

It’s massively harder and more risky.

schuyler2d7mo ago

Well, xz is a rare event too.

dessimus7mo ago
