Better across the board. Currently we translate each translation unit to machine code plus side tables of extra data describing how to patch that code, store the result in an ELF/COFF/other object file, then read it all back in and use those tables to work out what to do with it.
If instead you keep the program in IR and combine it there, llvm-link style, then emit machine code from the merged IR, you don't need the tables of information for the linker. You can transform debug info before it has been compressed into DWARF, optionally optimise across translation units, and then finally emit the same format the loader expects.
That should mean simpler tooling (the compiler and linker don't need the relocation side channel, and the linker doesn't need to disassemble and recombine DWARF), a faster residual program, and faster build times.
It's not totally theoretical either: AMDGPU on LLVM only passes a single file to the linker at a time, but lld is multi-architecture, so it doesn't get to delete the code for combining files. And a machine learning toolchain gave up on calling the linker at one point in favour of doing the data layout from the compiler directly. The latter, iiuc, because lld is/was too annoying to use as a library.