It took a while, though, until this was understood. In 2007, when I pointed out on debian-devel that this was needed, I was still told what a huge waste of time it would be. And indeed it took a huge amount of work by many people to get there, but it is well worth it.
"Well worth it" is not correct. And it just ups the contribution barrier to Debian. I have already heard a lot of people complain that contributing to Debian is hard, and while in the past I defended that with "they need all the checks and balances to make sure packages play nicely with each other", this is just a step that makes it hard for no reason and little benefit.
https://reproducible-builds.org/
Could you perhaps respond to the argumentation here?
Forcing devs to use hardware keys to sign commits/CI requests would be an actual security improvement, thwarting many supply chain attacks that only worked because the attacker got hold of developer credentials. Hardware keys at least have the option to require physically pressing the key for some operations, so there is a chance the developer will notice.
But thanks to reproducible builds, at least someone can... validate that the binary code of vulnerable package can be reproduced. Very fucking useful.
I am not saying it is useless. I am saying it is one of the highest-hanging fruits on the security tree.
Have many organizations produce the binaries independently and post the artifacts.
Once n of m parties agree on the artifact hash, take that as the trusted build.
If every party reaches a different hash then we cannot build consensus.
Obviously, it would be a ton of work to make such a system resistant to gaming by malicious actors (see GNU Guix for useful efforts), but it would provide valuable diversity in architecture and (political or other) control.
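The n-of-m idea above can be sketched roughly as follows. This is only an illustration: the function name `trusted_hash` and the builder names are made up, and it assumes each party simply publishes one hash string per artifact. It does nothing about the gaming problem mentioned above.

```python
from collections import Counter
from typing import Mapping, Optional


def trusted_hash(reports: Mapping[str, str], n: int) -> Optional[str]:
    """Return the hash that at least n of the m reporting builders agree on, else None.

    `reports` maps a (hypothetical) builder name to the artifact hash it published.
    """
    m = len(reports)
    if not 0 < n <= m or 2 * n <= m:
        # Require a strict majority so two different hashes cannot both reach n.
        raise ValueError("need m/2 < n <= m")
    digest, votes = Counter(reports.values()).most_common(1)[0]
    return digest if votes >= n else None


reports = {"builder-a": "abc123", "builder-b": "abc123", "builder-c": "ffff00"}
print(trusted_hash(reports, n=2))  # abc123
```

If every builder reports a different hash, the function returns `None`, which is the "cannot build consensus" case.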
It would be even cooler if we had independent projects that could run on various distros and OSes and build packages for any of them. Having packages for BSD verified on Linux and vice versa, with statistical logging (this code has been verified x times on y OSes), would be reassuring.
Anyone having to maintain a code base or a distributed fleet of devices will gain from this decision, immensely, as their operational periods come and go.
Reproducible builds are about longevity as much as they are about security.
Please don’t make bold claims about ‘no reason and little benefit’ while demonstrating ignorance of this hard fact: reproducible builds should have been the norm, in computing, from the get-go.
Just baking in the build ID and commit is enough. What do you think reproducible builds add?
> Please don’t make bold claims about ‘no reason and little benefit’ while demonstrating ignorance of this hard fact: reproducible builds should have been the norm, in computing, from the get-go.
So far not a single person in the thread has given me a concrete example (as in: existing project, existing problem, no other solution can solve it). They just claim it's better based on their feelings. Come on, be the first one.
(It was caught before being promoted into a stable Debian release, yes, but this sort of relied on a happy accident, too close for comfort)
Still, lots of good non-security benefits to reproducible builds too.
The backdoor relied first on a difference between building a package in a packaging environment versus building the package on your own. And also, it relied on the very common practice of checking in unreviewable artifacts into the source tree (e.g., the configure script, malicious binary test artifacts).
Reproducible builds guarantee that two people can follow the same instructions and get the same, bit-identical outcome. It does nothing to guarantee that those instructions have not been compromised, and all of the great packaging security failures of my lifetime that I can think of have relied on those instructions being compromised (e.g., xz utils, Debian OpenSSL keygen issues).
Defense in depth obviously is a good thing
If anything, it will make the attacker's job easier, as the Ubuntu package will have the same files structured exactly the same way as the Debian one.
Like what exactly?
Those people do not care about quality in open source at all. For long-lived software this is very important.
Of course, all those JavaScript and Kubernetes packages, which will be irrelevant again in a few years, might complain, but let them complain.
I'm reading this as a suggestion that the reproducible builds effort was an ineffective deterrent.
However, note that your observation could also be explained by the opposite: the reproducible builds effort was an effective deterrent, so nobody bothered with attempts.
> And it just ups the contribution barrier to Debian
Until yesterday, the package just got flagged in the tracker, and you could either ignore it, or fix it yourself, or the kind people behind the reproducible builds effort supplied a patch themselves.
Now, you can no longer ignore it. But fixes are often trivial. Use a (stable) timestamp provided by the build, seed RNGs with some constant (instead of, e.g., the time), etc. These are best practices anyway.
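For illustration, a minimal sketch of both fixes. It assumes the build environment exports `SOURCE_DATE_EPOCH` (the convention documented by the reproducible-builds.org project); the helper names `build_timestamp` and `build_rng` are made up for this example.

```python
import os
import random
import time


def build_timestamp() -> int:
    """Use SOURCE_DATE_EPOCH when the build environment sets it, else the current time."""
    value = os.environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())


def build_rng(seed: int = 0) -> random.Random:
    """Return an RNG seeded with a constant rather than the clock."""
    return random.Random(seed)
```

With the variable set, every rebuild embeds the same timestamp, and anything derived from `build_rng()` comes out the same on every run.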
There was no attack that reproducible builds would help protect from before 2007 either.
> Until yesterday, the package just got flagged in the tracker, and you could either ignore it, or fix it yourself, or the kind people behind the reproducible builds effort supplied a patch themselves.
> Now, you can no longer ignore it. But fixes are often trivial. Use a (stable) timestamp provided by the build, seed RNGs with some constant (instead of, e.g., the time), etc.
That's the entirety of the problem. App developers don't want to be package experts or build experts.
> These are best practices anyway.
They are not. They are best practices if you want reproducible builds. They are an entirely useless waste of time if you don't care.
Most of those that failed to reproduce differ in NT_GNU_BUILD_ID. The others differ in some other bits, mostly timestamps or hashes, I assume.
(Orange = FTBR = "failed to build reproducibly")
I'm not good at reading numbers from charts, but I'd guess it's a few percent (4-5ish?).
> Forbidden
> <p>You are not allowed to access this!</p>
(yes, with HTML tags on display) :)
EDIT: I also found an "I Challenge Thee" page in my history. Did I just get blocked by anti-bot measures? Why???
BTW, most Debian packages have reproducible builds. Those that do not (I'd say 5%) are shown in orange in the graph there: https://wiki.debian.org/ReproducibleBuilds
I think Magnus Ihse Bursie said it best while working on reproducible builds of OpenJDK: "If you were to ask me, the fact that compilers and build tools ever started to produce non-deterministic output has been a bug from day one." [2]
[1] https://www.linux.com/news/preventing-supply-chain-attacks-l...
[2] https://github.com/openjdk/jdk/pull/9152#issue-1270543997
Reproducible builds are an essential method in industrial computing. Debian isn’t at the forefront of this; it is merely adopting industry-wide techniques also applied to other operating systems used in long-term and safety-related applications.
Certainly, a lot of the hard work of the Yocto and Debian developers is already in your hands.
What is interesting is that this is now being applied in a more forward-focused policy by the Debian developers, that it will now be the norm rather than an option…
reproduced: 97.02% good: 17586 bad: 511 fail: 30 unknown: 0
This, statistics for other architectures, and the reasons for unreproducibility can be found at https://reproduce.debian.net.
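As a quick sanity check on those figures, assuming the percentage is simply `good` divided by the sum of all four counters (the page itself may define it differently):

```python
good, bad, fail, unknown = 17586, 511, 30, 0

total = good + bad + fail + unknown
reproduced_pct = round(100 * good / total, 2)

print(total, reproduced_pct)  # 18127 97.02
```

Under that assumption the numbers are consistent with the reported 97.02%.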
You don't have permission to access this resource. Apache Server at lists.debian.org Port 443
:/
It does work with my privacy/scraping setup (residential proxy, spoofed fingerprints, Qubes and so on). Great job, Debian.
They're a guarantee that if there's a backdoor, it's reproducible 100% of the time.
This is a godsend for white hats fighting the good fight.
And, as a side note, it's strongarming vs the bad guys: "Would be too bad if we could reproduce your shiny exploit 100% of the time wouldn't it!?".
Note that we should go further (but it's a bit orthogonal to reproducible builds): builds of the final binary/package should happen only after entirely discarding all files not necessary for the final build (like all test cases and all test assets). The build should literally happen in an environment that has gotten rid of those (after, of course, verifying in another environment that all test cases succeed). If I'm not mistaken, getting rid of test assets would have stopped Jia Tan's XZ backdoor attempt dead in its tracks, for example, because IIRC part of the backdoor was binary data hidden in an asset used only by test cases.
P.S.: as a bonus, they also make it possible to detect bit flips (I'm not saying there aren't other ways to detect bit flips; what I'm saying is that if you have deterministic builds anyway and something doesn't reproduce correctly due to a flipped bit, it's going to be noticed).
It feels like AI and traditional software are converging in complexity.
The build timestamps in the PE header and export table are a problem as well.
giant leap for mankind.
What is a win is that two independent parties can run the same build, and get the same binaries.
This is important because it removes trust from builders: anyone can verify their output.
It just so happens that unimportant things like build versions impede that.
This has been the status quo in Debian for a while now. You can build, and use diffoscope to audit the differences.
It's a stronger security property to have bit-for-bit reproducibility, and it looks like Debian is ready to commit to it.
Given how much quick-and-dirty sed patching and how many exec commands I've seen in the few nix packages/modules I've read, I would not exactly bet my life on it being completely idempotent and reproducible.
It's not reproducible bit for bit, since it fetches the current version of everything, but it's still easy enough to reproduce, stable enough, and complete enough, whereas classic distros need a fresh install every major release, or else you face issues and keep a system in an unknown state for a long time until it explodes.
They're still a pragmatic choice for many use cases.
Maybe not by itself, but it does allow for the ecosystem to be audited, in a way that ultimately benefits the end-user. It really is an important part of a healthy supply chain.
Curious, which distros were affected by npm supply chain attacks?
Not being able to check whether the shipped source code is the same as what was used to create the binary is scary.
Reproducible builds do not solve all issues, as you rightly observed, but they can be a stepping stone (or even a precondition) for further measures.
The thing reproducible builds aim to prevent is Debian, or individual developers and system administrators with access rights to binary uploads and signing keys, being forced by attackers to sign and upload binary packages, be those attackers governments (with or without court orders) or criminal organizations.
As of now, say if I were an administrator of Debian's CI infrastructure, technically there would be nothing preventing me from running an "extra" job on the CI infrastructure building a package for openssh with a knock-knock backdoor, properly signing it and uploading it to the repository. For someone to spot the attack and differentiate it, they'd have to notice that there is a package in the repository that has no corresponding build logs or has issues otherwise.
But with reproducible builds, anyone can set up infrastructure to rebuild Debian packages from source automatically and if there is a mismatch with what is on Debian's repository, raise alarm bells.
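The comparison step of such a rebuilder could look something like this. It is a sketch only: it assumes you have already rebuilt the package locally and obtained the SHA-256 of the repository's copy by some trusted route, and the function names are invented for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large packages do not have to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(rebuilt: Path, published_sha256: str) -> bool:
    """True if the locally rebuilt file is bit-identical to the published one."""
    return sha256_of(rebuilt) == published_sha256.lower()
```

A `False` result is the "raise alarm bells" case: either the build is not reproducible yet, or the published binary did not come from that source.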
Indeed, this could mitigate an attacker replacing the binary with something that's not produced from the code, but it does not mitigate the tool chain or code itself containing the exploit, creating a malicious binary.