The notion of being able to fix an app by merely upgrading a library it depends on has not worked out in practice. More often than not, when I upgrade a library, I find myself having to upgrade my app’s code because so much has changed. The burden of having to constantly back up, upgrade, and manually tweak config files, over and over and over again for days/weeks/months was SO not worth the few hundred megabytes or whatever dynamic loading was supposed to have saved.
Given this scheme, how would you distribute a security patch? Is each user of the library supposed to re-compile against the patched library?
Also: program A depends on library B and on library C v1.1. Library B also depends on C, but on v1.2. Which version gets used?
> More often than not, when I upgrade a library, I find myself having to upgrade my app’s code because so much has changed.
To me, this is the point of major version numbers. If you break clients, you increment the major version number, resulting in libfoo.so.2 and libfoo.so.3. Then, the scheme becomes much like hashes, in that newer versions won't break older clients, except you get security patches and a single copy of the library. However, the responsibility of knowing when to increment the major is left to a human, and all the error that entails.
As a sibling notes, there are distros out there that do this. (They are not my preference, for the above reasons.)
Unfortunately you do bring up a good point that security patches generally won't work without a recompile of the parent binary. One possible way out of this is that the external interface/unit tests for a library could also have their own hash, so if a library fixes something like a buffer exploit without changing its interface, the parent binary could use the new drop-in replacement. In practice I’m skeptical that this would work reliably, though, because binaries may be relying on idiosyncrasies of the old implementation.
I'm thinking that a simpler method would be to have patches use the hash system. So say curl uses libssl and libssl releases a security update, then someone could drop the new libssl into the curl project and rebuild it without having to touch any code, giving curl a new hash that could be installed by other users. I think we are used to this almost never working so we are hesitant to upgrade. But the idea would be that we’d upgrade binaries that depend on libraries (rather than just libraries) and it would be a really cheap operation compared to today.
https://www.debian.org/doc/manuals/maint-guide/advanced.en.h...
No, this is the responsibility of the server which distributes binaries to users via the package manager.
This is how NixOS works: http://nixos.org/
It works in practice if developers and maintainers adhere to semantic versioning. Unfortunately, there are numerous packages that don't adhere to this standard and that's when widespread breakage occurs.
These are excellent tools to keep the size down when using large static binaries. By compressing the file on disk and decompressing in memory you often wind up with a smaller and sometimes even faster loading (depending on disk IO speed vs decompression speed) package. I got a static Qt binary from 4 MB down to 1.3 MB with upx --lzma. Very nice stuff.
If the OS loads executables by mmap, paging them in on demand, you can potentially save memory by never loading unused parts of an executable. A transform-on-load (decompress-on-load) scheme requires the entire program to be unpacked before execution begins.
The goal, though, should be generating less code: link in fewer dependencies, reduce features, DRY up duplicate logic, and cut LoC. Also compile with -DNDEBUG -O2 -g0 and whatever LTO switches are available for whole-program optimization if you're statically linking everything together. Be sure to include static dependencies of other static dependencies, like zlib (-lz), or you'll inevitably end up with missing-symbol errors when linking the final program. LTO cuts out all (most) of the shit that you don't need and attempts to optimize across translation units.
Furthermore, a consideration against static linking: on most platforms, if the same shared library is already loaded, it's reused by mmap'ing it into the new process. I'm not sure that duplicating that code is going to reduce memory usage or the IO it takes to load from disk. Giant runtimes like Go, Ruby, Python and fucking Java shouldn't be duplicated N times... that's just wasteful. (I hate any language with an epic runtime or VM that includes the world to do anything.) Libraries should be reserved for the few redundant things that take tons of code to implement and change very little.
If anyone wants to compile a Linux system from scratch, try LFS and hack around with static linking. It may take patches and extra flags to get what you want.
I hope Static Linux scales: it's easier to upgrade static programs without dependency hell, but the increased memory usage from duplicated code might not be so great a tradeoff.
Another hack would be to statically compile all the system programs in each directory (/sbin, /bin, parts of /usr/bin, etc.) together into a single executable per directory, then symlink each program name to that executable and select which "program" to run via argv[0]. It would be one giant exe per directory, but it would be cached basically all the time, and with LTO there wouldn't be as much duplication as with N programs compiled separately. This would take a main which dispatches to the renamed original mains, plus renaming all symbol conflicts across translation units.
/bin/static
/bin/[ -> /bin/static
/bin/false -> /bin/static
/bin/true -> /bin/static
(Probably want to use hard links also.)
[1] http://netbsd.gw.com/cgi-bin/man-cgi?crunchgen++NetBSD-curre...
Yes, UPX is cool, but I don't think it's 100% compatible.
suckless.org seems to focus on their web browser and their xterm clone these days, judging by the listserv traffic.
If you're interested in the idea, definitely check it out.
0: http://morpheus.2f30.org
1: http://morpheus.2f30.org/0.0/packages/x86_64/
> Because dwm is customized through editing its source code, it’s
> pointless to make binary packages of it. This keeps its userbase
> small and elitist. No novices asking stupid questions.
I never understood why the authors of dwm thought this was a "nice" feature of configuration via source code. The most common method of configuration on Linux is to include a parser for one of many shitty text or markup formats (whatever is currently "hip", so JSON at the moment), then carefully bind each variable you might want to modify to a key/value mapping extracted from the config file - and if you want to keep the sanity of your users, include verbose error messages or even a debugger so they can fix their inevitable typos.
The way configuration works on Windows and Mac is largely the same, except you wrap a GUI around the text file to handle the validation of inputs, which is a slight improvement over text input.
The problem with those input methods is they don't exactly allow you to configure much. You have to decide ahead of time all of the possible variables that one might want to change - and even then, you can't even compute new values to set the variables to, unless you embed an interpreter into your configuration format. As the program grows and gains more features, the configuration format needs amending, and grows uglier - which is what leads to Greenspun's tenth rule. Configuration files have their place - but most of the time, they're used where it'd be best to just have a programming language available.
I don't necessarily think dwm's approach of configuration via C is great though, since the config isn't interpreted, and recompiling the whole program to make and test changes is a headache. Configuration via source code is the way to go, except it should be interpreted while the program is running, so that you only need to recompile for major breaking changes. Xmonad is configured via source code, but your configuration lives in its own module: when you change it, the config is recompiled and the program relaunched without restarting the whole session. I'd personally opt to embed a Scheme in a WM, but that would probably go against suckless's minimalist philosophy.
Linking to a shared library brings in and initializes the whole library, even if you only need one function from it. So you tend to get stuff paged in during load that never gets used.
Isn't it? Usually distros target their packages to a single library version, and often people run suites (Gnome, KDE, etc) that use a similar set of libraries in their different processes.
High performance computing systems typically rely on dynamic linking extensively for that. One example: the hooks for profiling and tool support in the MPI standard for parallel programming pretty much depend on an LD_PRELOAD-type mechanism to be useful. Another: you can cope with the mess due to missing policy on BLAS libraries in Fedora/EPEL (unlike Debian) by using OpenBLAS replacements for the reference and ATLAS libraries; have them preferred by ld.so.conf and get a worthwhile system-wide speed increase on your Sandy Bridge nodes with EPEL packages.
Anyhow, rebuilding a static system to address a problem with a library ignores all its uses in user programs. The ability to adjust things via properly-engineered dynamic libraries really has a lot more pros than cons in my non-trivial experience. The use of rpath ("ones referencing specific paths"?) is mostly disallowed by packaging rules in the GNU/Linux distributions I know, so I'm not sure where that comment came from, and it tends to defeat the techniques above.
[1] https://www.netbsd.org/docs/guide/en/chap-build.html#chap-bu...
Can't have your cake and eat it too.