Hello World (opens in new tab)

(thecoder08.github.io)

509 pointsfbrusch1y ago79 comments

79 comments

I got bored the other day and tried to achieve something similar on MacOS with Rust:

    #![no_std]
    #![no_main]
    
    use core::panic::PanicInfo;
    
    #[panic_handler]
    fn panic_handler(_panic: &PanicInfo<'_>) -> ! {
        // TODO: write panic message to stderr
        write(2, "Panic occured\n".as_bytes()); // TODO: panic location + message
        unsafe { sc::syscall!(EXIT, 255 as u32) };
        loop {}
    }
    
    fn write(fd: usize, buf: &[u8]) {
        unsafe {
            sc::syscall!(WRITE, fd, buf.as_ptr(), buf.len());
        }
    }
    
    #[no_mangle]
    pub extern "C" fn main() -> u32 {
        write(1, "Hello, world!\n".as_bytes());
        return 0;
    }

Then I inspected the ELF output in Ghidra. No matter what it was about ~16kb. I'm sure some code golf could be done to get it done (which has obviously been done + written about + documented before)

ChrisSD1y ago

A similar program on Windows is 3072 bytes. I compiled it using:

    rustc hello.rs -C panic=abort -C opt-level=3 -C link-arg=/entry:main

Here's the program:

    #![no_std]
    #![no_main]

    #[panic_handler]
    fn panic_handler(_panic: &core::panic::PanicInfo<'_>) -> ! {
        unsafe { ExitProcess(111) };
    }

    #[no_mangle]
    pub extern "C" fn main() -> u32 {
        let msg = b"Hello, world!\n";
        unsafe {
            let stdout = GetStdHandle(-11);
            let mut written = 0;
            WriteFile(
                stdout,
                msg.as_ptr(),
                msg.len() as u32,
                &mut written,
                core::ptr::null_mut(),
            );
        }
        0
    }

    #[link(name = "kernel32")]
    extern "system" {
        fn ExitProcess(uExitCode: u32) -> !;
        fn GetStdHandle(nStdHandle: i32) -> isize;
        fn WriteFile(
            hFile: isize,
            lpBuffer: *const u8,
            nNumberOfBytesToWrite: u32,
            lpNumberOfBytesWritten: *mut u32,
            lpOverlapped: *mut (),
        ) -> i32;
    }

I didn't bother much with the panic handler because there's no reason to in hello world. Though the binary contains a fair bit of padding still so it could have a few more things added to it without increasing the size. Alternatively I could shrink it a bit further by doing crimes but I'm not sure there's much point.

It may be worth noting that the associated pdb (aka debug database) is 208,896 bytes.

ChrisSD1y ago

Ok, I committed some mild linker crimes and got the same program down to 800 bytes.

   rustc hello.rs -C panic=abort -C opt-level=3 -C link-args="/ENTRY:main /DEBUG:NONE /EMITPOGOPHASEINFO /EMITTOOLVERSIONINFO:NO /ALIGN:16"

1 more reply

qweqwe141y ago

To achieve the smallest size, you have to ditch main entirely and use _start, along with passing certain linker flags not to align sections. See https://darkcoding.net/software/a-very-small-rust-binary-ind...

You can easily get around 500 bytes this way

0823498723498721y ago

it's fun to golf, but how big are pages these days, anyway?

(and if you're using a language with a stack, your executable probably ultimately loads as at least two pages: r/o and r/w)

anyfoo1y ago

16kB on a Mac, so that was probably what they were running into. You effectively can’t make a program whose code occupies less than 16kB.

clemiclemen1y ago

The min-sized-rust project [1], which you may already be aware of, has a lot of optimisations to decrease the size of a Rust binay. I think with all the optimisations, at the end, it was around 8kb for the hello world.

[1]: https://github.com/johnthagen/min-sized-rust

saagarjha1y ago

XNU will not load any Mach-Os that are smaller than a page, so unfortunately there is not much fun to be had on that platform.

speed_spread1y ago

Rust or not, 16kb is entirely satisfactory. I mean, it fits in a Commodore 64's memory!

_the_inflator1y ago

Compared to a Hello World program on C64 written in assembler probably not. ;)

I need roughly 24 bytes. 16kb means 16.384 bytes. I can think of better usage of $4000 space in memory, FLI for example.

1 more reply

praptak1y ago

There's another rabbit hole which Musl seems to have skipped. Using `syscall` directly is not all there is to calling system functions on Linux.

The "better behaved" way is to call vDSO. It's a magic mini-library which the kernel automatically maps into your address space. Thus the kernel is free to provide you with whatever code it deems optimal for doing a system call.

In particular some of the system calls might be optimized away and not require the `syscall` at all because they are executed in the userspace. Historically you could expect vDSO to choose between different mechanisms of calling the kernel (int 0x80, sysenter).

https://man7.org/linux/man-pages/man7/vdso.7.html

LegionMammal9781y ago

It's only on 32-bit x86 that the vDSO contains the generic fast syscall shim. On x86-64, the standard method for syscalls is the SYSCALL instruction, and the vDSO contains only a few time- and SGX-related functions.

qweqwe141y ago

Also see this blog post, which compares "Hello World" programs in different languages by overhead: https://drewdevault.com/2020/01/04/Slow.html

Follow-up: https://drewdevault.com/2020/01/08/Re-Slow.html

This legendary blogpost makes the smallest Linux program (the program simply exits with status 42): https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...

You can also find the smallest "Hello World" program on that website.

bkallus1y ago

This almost entirely skips the role of the dynamic linker, which is arguably the true entry point of the program.

If you are interested in that argument, see https://gist.github.com/kenballus/c7eff5db56aa8e4810d39021b2....

susam1y ago

In case there are any DOS enthusiasts out here, a "hello, world" program written in assembly/machine code in DOS used to be as small as 23 bytes: https://github.com/susam/hello

Out of these 23 bytes, 15 bytes are consumed by the dollar-terminated string itself. So really only eight bytes of machine code that consists of four x86 instructions.

cancerhacker1y ago

I liked and appreciated this, two points I'd like to make: you should disable optimizations and whatever inlining caused printf to become puts (or, alternatively, write the hello world to use puts directly), second would be to break your compile step into the 4 real parts: preprocess, compile, assemble, link. Or add --save-temps to the cc line and describe the various files created. There's a lot less magic involved if you can see the pipeline.

Syntaf1y ago

This reminds me of one of my favorite CS assignments in college for a systems programming class:

    > Given a hello world C++ snippet, submit the smallest possible compiled binary

I remember using tools like readelf and objdump to inspect the program and slowly rip away layers and compiler optimizations until I ended up with the smallest possible binary that still outputted "hello world". I googled around and of course found someone who did it likely much better than any of us students could have ever managed [1]

[1]: https://www.muppetlabs.com/%7Ebreadbox/software/tiny/teensy....

eddd-ddde1y ago

Does it even matter that the snippet was c++? Couldn't you just get the smallest binary that prints hello world and argue that they are semantically equivalent?

That should be like 10 x86 instructions tops, plus the string data.

LegionMammal9781y ago

I'd note that the smallest instruction sequence doesn't necessarily correspond to the smallest executable file, since you might be able to combine some instruction data with header data to make the file smaller.

For instance, when I tried to make the smallest x86-64 Hello World program (https://tmpout.sh/3/22.html), I ended up lengthening it from 11 to 15 instructions, while shortening the ultimate file from 86 to 77 bytes.

nsguy1y ago

IIRC there's some setup code that's injected before main is called and can be turned off in most compilers (_start?). Should be possible to get down to essentially the same thing you can do with an assembler. C (and so C++ for the most part) is essentially a portable assembler. You can probably also just inline those 10 x86 instructions directly.

1 more reply

greenbandit1y ago

I don't think the smallest binary was the goal but rather the smallest compiled binary. I assume they were restricted to a specific compiler as well.

userbinator1y ago

At least on DOS, it's more like 4 instructions:

    mov ah, 9
    mov dx, 108
    int 21
    ret
    db "hello world!$"

Syntaf1y ago

Our curriculum was c++, so it was familiar territory for the students in terms of both the language and the compiler but yeah you can likely do this exercise with any compiled language.

riazrizvi1y ago

Seems like it would miss the point of the exercise, which was to learn what a C++ compiler puts into an executable. Though I do realize that if there was a magic wand to get a grade without doing the work, some of us would take it.

1vuio0pswjnm71y ago

If it is a favourite task, then why do I not see more folks creating the smallest possible binaries, for programs other than "hello world". Is it truly enojyable. It is for me because I like saving space on the computers I own. But I am seeing many, many 10+, 20+, 50+ and 100+ MiB programs being written by others in recent times. Some are created in commercial environments, for commercial purposes, but others are allegedly written for the enjoyment. Is it not enjoyable to write small programs.

simonw1y ago

Because building small binaries takes more human effort than building large ones. Most developers would rather invest that effort in building other stuff (like user-facing features) than invest it in shrinking their binaries.

noobermin1y ago

That post is deserves its own post on HN, what a ride.

norir1y ago

> I’m sorry the ending maybe wasn’t as satisfying as you hoped. I’m happy someone found this interesting. I’m not quite sure why I wrote this, but it’s now after midnight so I should get some sleep.

This was actually a perfect ending to this piece.

delta_p_delta_x1y ago

Sadly, like most 'hello world' deep dives, the author stops at the `write` syscall and glosses over the rest. Everything before the syscall essentially boils down to `printf` calling `puts` calling `write`—it's one function call after another forwarding the `char const*` through, with some book-keeping. In my opinion, not the most interesting.

What comes after the syscall is where everything gets very interesting and very very complicated. Of course, it also becomes much harder to debug or reverse-engineer because things get very close to the hardware.

Here's a quick summary, roughly in order (I'm still glossing over; each of these steps probably has an entire battalion of software and hardware engineers from tens of different companies working on it, but I daresay it's still more detailed than other 'tours through 'hello world'):

  - The kernel performs some setup setup to pipe the `stdout` of the hello world process into some input (not necessarily `stdin`; could be a function call too) of the terminal emulator process. 
  - The terminal emulator calls into some typeface rendering library and the GPU driver to set up a framebuffer for the new output. 
  - The above-mentioned typeface rendering library also interfaces with the GPU driver to convert what was so far just a one-dimensional byte buffer into a full-fledged two-dimensional raster image:
    - the corresponding font outlines for each character byte is loaded from disk;
    - each outline is aligned into a viewport; 
    - these outlines are resized, kerning and font metrics applied from the font files set by the terminal emulator;
    - the GPU rasterises and anti-aliases the viewport (there are entire papers and textbooks written on these two topics alone). Rasterisation of font outlines may be done directly in hardware without shaders because nearly all outlines are quadratic Bézier splines.
  - This is a new framebuffer for the terminal emulator's window, a 2D grid containing (usually) RGB bytes.
  - The windowing manager takes this framebuffer result and *composits* with the window frame (minimise/maximise/close buttons, window title, etc) and the rest of the desktop—all this is done usually on the GPU as well.
    - If the terminal emulator window in question has fancy transparency or 'frosted glass' effects, this composition applies those effects with shaders here.
  - The resultant framebuffer is now at the full resolution and colour depth of the monitor, which is then packetised into an HDMI or DisplayPort signal by the GPU's display-out hardware, depending on which is connected.
  - This is converted into an electrical signal by a DAC, and the result piped into the cable connecting the monitor/internal display, at the frequency specified by the monitor refresh rate.
    - This is muddied by adaptive sync, which has to signal the monitor for a refresh instead of blindly pumping signals down the wire
  - The monitor's input hardware has an ADC which re-converts the electrical signal from the cable into RGB bytes (or maybe not, and directly unwraps the HDMI/DP packets for processing into the pixel-addressing signal, I'm not a monitor hardware engineer).
  - The electrical signal representing the framebuffer is converted into signals for the pixel-addressing hardware, which differs depending on the exact display type—whether LCD, OLED, plasma, or even CRT. OLED might be the most complicated since each *subpixel* needs to be *individually* addressed—for a 3840 × 2400 WRGB OLED as seen on LG OLED TVs, this is 3840 × 2400 × 4 = 36 864 000 subpixels, i.e. nearly 37 million pixels.
  - The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result.

Note that all this happens at most within the frame time of a monitor refresh, which is 16.67 ms for 60 Hz.

rossant1y ago

> The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result.

Nice explanation but you stopped at the human's visual system which is where everything gets very interesting and very very complicated. :)

[1] https://en.wikipedia.org/wiki/Visual_system

rpigab1y ago

Assuming there's a vaccuum between the screen and your eyes (perfectly spherical), of course.

Otherwise, its gets very interesting and very very complicated. :)

thedatamonger1y ago

thank you for that. I startled the dog with that laugh :)

Mateon11y ago

If you're interested in dives that deep, you might like Gynvael Coldwind's hello world in Python on Windows dive [1]. Goes through CPython internals, Windows conhost, font rasterization, and GPU rendering, among others.

[1]: https://gynvael.coldwind.pl/?id=754

wonnage1y ago

Most of this stuff is irrelevant to the program itself, e.g you could've piped the output to /dev/null and none of this would happen.

delta_p_delta_x1y ago

Fair enough.

However, the point of a hello world program is to introduce programming to beginners, and make a computer do something one can visibly see. I daresay this is made moot if you pipe it into /dev/null. I could then replace 'hello world' in the title with 'any program x that logs to `stdout`', and it wouldn't reach HN's front page.

In the same vein is this idea of 'a trip of abstractions'—I don't know about you, but I always found most of them very unsatisfying as they always stop at the system call, whether Linux or Windows. It really is afterwards where things get interesting, you can't deny that.

hifromwork1y ago

It also skips what happens before _start, for example how a process is born (execve on linux is pretty weird), how the program is loaded into memory (including binfmt_* and the allmighty binfmt_misc), relocations, exception handling frames, sections, and ELF loader in general, allocation of OS resources (including necessary malloc) and probably much more.

johnfn1y ago

I'm surprised this didn't get more votes. I thought it was fantastic!

stevefolta1y ago

> Unlike python, however, you can’t just call an interpreter to run this program.

Sure you can: "tcc -run hello.c". Okay, technically that's an in-memory compiler rather than an interpreter.

For extra geek points, have your program say "Hellorld" instead of "Hello world".

qznc1y ago

Another interesting fact: Write a Hello World program in C++, run the preprocessor on it (g++ -E), then count the lines of the output.

I just tried and it shows me 33738 lines (744 lines for C btw).

In a language like C++, even Hello World uses like half of all the language features.

nsguy1y ago

Don't include the standard headers and just put it the definition for the symbols you're using. With std::cout you might end up pulling your hair out to find everything but I can't imagine it'd be more than a few dozen lines... Not gonna try ;)

With C you just need a definition for printf instead of including stdio. In the old days you'd get by without even defining it at all.

qznc1y ago

In C, you can get away with just this (no includes or definitions):

    void main() { puts("Hello, World!"); }

You get a warning, but it is fine.

flohofwoe1y ago

> In the old days you'd get by without even defining it at all.

GCC and Clang have a `__builtin_printf()` instead, quite useful for adhoc printf-debugging without having to include stdio.h. Under the hood it just resolves to a regular printf() or puts() stdlib call though.

ColonelPhantom1y ago

Arguably, most of that is probably dead code included from <iostream>. Most of it is likely never called for a simple Hello World.

Similarly, in C, you can just give the definition of printf and omit the include of stdio.h, which also saves a ton of preprocessed lines.

bitwize1y ago

One of the things I really like about the "hello, world" example in K&R is it's used as a template to show the reader what a complete, working (if bare minimum) C program looks like. All the relevant parts are labeled including the #include preprocessor directive, function definition, function call (to printf), etc. There are a few paragraphs explaining these parts in greater detail. Finally some explanation is provided about how one might compile and run this program, using Unix as an example (the authors are pretty biased toward that system).

This article is very much in that same friendly, explanatory spirit, although obviously it goes into greater depth and uses a modern system.

king_geedorah1y ago

> the authors are pretty biased toward that system

Thanks for the chuckle.

sohzm1y ago

This reminded me of this video "Advanced Hello World in Zig - Loris Cro" https://youtu.be/iZFXAN8kpPo although not a equivalent comparison but still a intresting watch

baudaux1y ago

GNU hello program is quite complex as well

https://www.gnu.org/software/hello/

mseepgood1y ago

The source code: https://git.savannah.gnu.org/cgit/hello.git/tree/src/hello.c

Bengalilol1y ago

Just saying hi ! (and thanks for the read, got me back to the old days of ASM One and SEKA on Amiga, trying to clean up my memory dust pile)

snvzz1y ago

I'd look at asmtwo and asmpro today.

mseepgood1y ago

Could have continued with following the syscall code in the kernel.

deepsun1y ago

Would be cool to see the same for statically-linked HelloWorld.

ben_w1y ago

I've been thinking recently, we might have too many layers of abstraction in our systems.

And I didn't even know about most of the ones in this post.

nomel1y ago

> And I didn't even know about most of the ones in this post.

I think that's the answer to the question of "why do we have so many". It's a great thing you don't have to know about them. Go down a layer, and the people working there will think it's a great thing they don't have to worry about the abstraction below. Software development is, currently, a human task, so the human needs necessarily structure it.

I can't comment on web development...

mttpgn1y ago

Just because the number of abstraction layers can be reduced doesn't mean they need to be. You might gain back some CPU cycles, some milliseconds of execution time. But the tradeoffs of maintainability, legibility, and developer quality-of-life may, in the long run, reintroduce abstraction layers of some other type back into the overall SDLC.

ben_w1y ago

> But the tradeoffs of maintainability, legibility, and developer quality-of-life

Are in fact the things I think have become worse from the abstractions.

Well, the recent abstractions. I like the ones that were widespread until about 2018 or so.

2 more replies

freeone30001y ago

Do some embedded work, even at the c-for-esp32 layer, and you’ll start to see the benefits of having an OS.

xandrius1y ago

Yep, it seems common for people who haven't actually experienced something from the past to wish it was still like then, not understanding why people decided the trade was worth it.

You might save some bytes but boy how annoying and complicated it can get.

2 more replies

ctrw1y ago

I can see the benefits of an os. I fail to see the benefits of a browser pretending to be another os on top of the os.

kenneth1y ago

What I think is shocking is that 99% of the software engineering world these days would have zero ability to comprehend this explanation. In my opinion, you could not be a real programmer without the ability to understand what your program is under the hood. You might be able to get away with it, but you're just faking it.

And yet, 99% of the people I've ever seen in the industry have no idea how any of the code they write works.

I used to ask a simple interview question, I wanted to see if potential hires could explain what a pointer or memory was. Few ever could.

gglitch1y ago

Like the man said, “If you wish to make an apple pie from scratch, you must first invent the universe.”

bitwize1y ago

Space is filled with a network of wormholes... https://www.youtube.com/watch?v=zSgiXGELjbc

Retr0id1y ago

> All modern big and important programs that make a computer work are written this way [AoT-compiled to native code].

This is the conventional wisdom, but it's increasingly not true.

nomel1y ago

I think it’s definitely true if taken literally, “make a computer work”. From Python interpreter, to browsers, to OS, and most optimized libraries.

nektro1y ago

loved it! except the final conclusion

vivzkestrel1y ago

there was another post like this on HN that was actually illustrated in a better manner, anyone got links to it?

j / k navigate · click thread line to collapse

79 comments

MuffinFlavored1y ago

I got bored the other day and tried to achieve something similar on MacOS with Rust:

    #![no_std]
    #![no_main]
    
    use core::panic::PanicInfo;
    
    #[panic_handler]
    fn panic_handler(_panic: &PanicInfo<'_>) -> ! {
        // TODO: write panic message to stderr
        write(2, "Panic occured\n".as_bytes()); // TODO: panic location + message
        unsafe { sc::syscall!(EXIT, 255 as u32) };
        loop {}
    }
    
    fn write(fd: usize, buf: &[u8]) {
        unsafe {
            sc::syscall!(WRITE, fd, buf.as_ptr(), buf.len());
        }
    }
    
    #[no_mangle]
    pub extern "C" fn main() -> u32 {
        write(1, "Hello, world!\n".as_bytes());
        return 0;
    }

Then I inspected the ELF output in Ghidra. No matter what it was about ~16kb. I'm sure some code golf could be done to get it done (which has obviously been done + written about + documented before)

ChrisSD1y ago

A similar program on Windows is 3072 bytes. I compiled it using:

    rustc hello.rs -C panic=abort -C opt-level=3 -C link-arg=/entry:main

Here's the program:

    #![no_std]
    #![no_main]

    #[panic_handler]
    fn panic_handler(_panic: &core::panic::PanicInfo<'_>) -> ! {
        unsafe { ExitProcess(111) };
    }

    #[no_mangle]
    pub extern "C" fn main() -> u32 {
        let msg = b"Hello, world!\n";
        unsafe {
            let stdout = GetStdHandle(-11);
            let mut written = 0;
            WriteFile(
                stdout,
                msg.as_ptr(),
                msg.len() as u32,
                &mut written,
                core::ptr::null_mut(),
            );
        }
        0
    }

    #[link(name = "kernel32")]
    extern "system" {
        fn ExitProcess(uExitCode: u32) -> !;
        fn GetStdHandle(nStdHandle: i32) -> isize;
        fn WriteFile(
            hFile: isize,
            lpBuffer: *const u8,
            nNumberOfBytesToWrite: u32,
            lpNumberOfBytesWritten: *mut u32,
            lpOverlapped: *mut (),
        ) -> i32;
    }

It may be worth noting that the associated pdb (aka debug database) is 208,896 bytes.

ChrisSD1y ago

Ok, I committed some mild linker crimes and got the same program down to 800 bytes.

   rustc hello.rs -C panic=abort -C opt-level=3 -C link-args="/ENTRY:main /DEBUG:NONE /EMITPOGOPHASEINFO /EMITTOOLVERSIONINFO:NO /ALIGN:16"

1 more reply

qweqwe141y ago

You can easily get around 500 bytes this way

0823498723498721y ago

it's fun to golf, but how big are pages these days, anyway?

(and if you're using a language with a stack, your executable probably ultimately loads as at least two pages: r/o and r/w)

anyfoo1y ago

16kB on a Mac, so that was probably what they were running into. You effectively can’t make a program whose code occupies less than 16kB.

clemiclemen1y ago

[1]: https://github.com/johnthagen/min-sized-rust

saagarjha1y ago

XNU will not load any Mach-Os that are smaller than a page, so unfortunately there is not much fun to be had on that platform.

speed_spread1y ago

Rust or not, 16kb is entirely satisfactory. I mean, it fits in a Commodore 64's memory!

_the_inflator1y ago

Compared to a Hello World program on C64 written in assembler probably not. ;)

I need roughly 24 bytes. 16kb means 16.384 bytes. I can think of better usage of $4000 space in memory, FLI for example.

1 more reply

praptak1y ago

There's another rabbit hole which Musl seems to have skipped. Using `syscall` directly is not all there is to calling system functions on Linux.

https://man7.org/linux/man-pages/man7/vdso.7.html

LegionMammal9781y ago

qweqwe141y ago

Also see this blog post, which compares "Hello World" programs in different languages by overhead: https://drewdevault.com/2020/01/04/Slow.html

Follow-up: https://drewdevault.com/2020/01/08/Re-Slow.html

This legendary blogpost makes the smallest Linux program (the program simply exits with status 42): https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...

You can also find the smallest "Hello World" program on that website.

bkallus1y ago

This almost entirely skips the role of the dynamic linker, which is arguably the true entry point of the program.

If you are interested in that argument, see https://gist.github.com/kenballus/c7eff5db56aa8e4810d39021b2....

susam1y ago

In case there are any DOS enthusiasts out here, a "hello, world" program written in assembly/machine code in DOS used to be as small as 23 bytes: https://github.com/susam/hello

Out of these 23 bytes, 15 bytes are consumed by the dollar-terminated string itself. So really only eight bytes of machine code that consists of four x86 instructions.

cancerhacker1y ago

Syntaf1y ago

This reminds me of one of my favorite CS assignments in college for a systems programming class:

    > Given a hello world C++ snippet, submit the smallest possible compiled binary

[1]: https://www.muppetlabs.com/%7Ebreadbox/software/tiny/teensy....

eddd-ddde1y ago

Does it even matter that the snippet was c++? Couldn't you just get the smallest binary that prints hello world and argue that they are semantically equivalent?

That should be like 10 x86 instructions tops, plus the string data.

LegionMammal9781y ago

nsguy1y ago

1 more reply

greenbandit1y ago

I don't think the smallest binary was the goal but rather the smallest compiled binary. I assume they were restricted to a specific compiler as well.

userbinator1y ago

At least on DOS, it's more like 4 instructions:

    mov ah, 9
    mov dx, 108
    int 21
    ret
    db "hello world!$"

Syntaf1y ago

Our curriculum was c++, so it was familiar territory for the students in terms of both the language and the compiler but yeah you can likely do this exercise with any compiled language.

riazrizvi1y ago

1vuio0pswjnm71y ago

simonw1y ago

noobermin1y ago

That post is deserves its own post on HN, what a ride.

norir1y ago

This was actually a perfect ending to this piece.

delta_p_delta_x1y ago

  - The kernel performs some setup setup to pipe the `stdout` of the hello world process into some input (not necessarily `stdin`; could be a function call too) of the terminal emulator process. 
  - The terminal emulator calls into some typeface rendering library and the GPU driver to set up a framebuffer for the new output. 
  - The above-mentioned typeface rendering library also interfaces with the GPU driver to convert what was so far just a one-dimensional byte buffer into a full-fledged two-dimensional raster image:
    - the corresponding font outlines for each character byte is loaded from disk;
    - each outline is aligned into a viewport; 
    - these outlines are resized, kerning and font metrics applied from the font files set by the terminal emulator;
    - the GPU rasterises and anti-aliases the viewport (there are entire papers and textbooks written on these two topics alone). Rasterisation of font outlines may be done directly in hardware without shaders because nearly all outlines are quadratic Bézier splines.
  - This is a new framebuffer for the terminal emulator's window, a 2D grid containing (usually) RGB bytes.
  - The windowing manager takes this framebuffer result and *composits* with the window frame (minimise/maximise/close buttons, window title, etc) and the rest of the desktop—all this is done usually on the GPU as well.
    - If the terminal emulator window in question has fancy transparency or 'frosted glass' effects, this composition applies those effects with shaders here.
  - The resultant framebuffer is now at the full resolution and colour depth of the monitor, which is then packetised into an HDMI or DisplayPort signal by the GPU's display-out hardware, depending on which is connected.
  - This is converted into an electrical signal by a DAC, and the result piped into the cable connecting the monitor/internal display, at the frequency specified by the monitor refresh rate.
    - This is muddied by adaptive sync, which has to signal the monitor for a refresh instead of blindly pumping signals down the wire
  - The monitor's input hardware has an ADC which re-converts the electrical signal from the cable into RGB bytes (or maybe not, and directly unwraps the HDMI/DP packets for processing into the pixel-addressing signal, I'm not a monitor hardware engineer).
  - The electrical signal representing the framebuffer is converted into signals for the pixel-addressing hardware, which differs depending on the exact display type—whether LCD, OLED, plasma, or even CRT. OLED might be the most complicated since each *subpixel* needs to be *individually* addressed—for a 3840 × 2400 WRGB OLED as seen on LG OLED TVs, this is 3840 × 2400 × 4 = 36 864 000 subpixels, i.e. nearly 37 million pixels.
  - The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result.

Note that all this happens at most within the frame time of a monitor refresh, which is 16.67 ms for 60 Hz.

rossant1y ago

> The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result.

Nice explanation but you stopped at the human's visual system which is where everything gets very interesting and very very complicated. :)

[1] https://en.wikipedia.org/wiki/Visual_system

rpigab1y ago

Assuming there's a vaccuum between the screen and your eyes (perfectly spherical), of course.

Otherwise, its gets very interesting and very very complicated. :)

thedatamonger1y ago

thank you for that. I startled the dog with that laugh :)

Mateon11y ago

[1]: https://gynvael.coldwind.pl/?id=754

wonnage1y ago

Most of this stuff is irrelevant to the program itself, e.g you could've piped the output to /dev/null and none of this would happen.

delta_p_delta_x1y ago

Fair enough.

hifromwork1y ago

johnfn1y ago

I'm surprised this didn't get more votes. I thought it was fantastic!

stevefolta1y ago

> Unlike python, however, you can’t just call an interpreter to run this program.

Sure you can: "tcc -run hello.c". Okay, technically that's an in-memory compiler rather than an interpreter.

For extra geek points, have your program say "Hellorld" instead of "Hello world".

qznc1y ago

Another interesting fact: Write a Hello World program in C++, run the preprocessor on it (g++ -E), then count the lines of the output.

I just tried and it shows me 33738 lines (744 lines for C btw).

In a language like C++, even Hello World uses like half of all the language features.

nsguy1y ago

With C you just need a definition for printf instead of including stdio. In the old days you'd get by without even defining it at all.

qznc1y ago

In C, you can get away with just this (no includes or definitions):

    void main() { puts("Hello, World!"); }

You get a warning, but it is fine.

flohofwoe1y ago

> In the old days you'd get by without even defining it at all.

ColonelPhantom1y ago

Arguably, most of that is probably dead code included from <iostream>. Most of it is likely never called for a simple Hello World.

Similarly, in C, you can just give the definition of printf and omit the include of stdio.h, which also saves a ton of preprocessed lines.

bitwize1y ago

This article is very much in that same friendly, explanatory spirit, although obviously it goes into greater depth and uses a modern system.

king_geedorah1y ago

> the authors are pretty biased toward that system

Thanks for the chuckle.

sohzm1y ago

This reminded me of this video "Advanced Hello World in Zig - Loris Cro" https://youtu.be/iZFXAN8kpPo although not a equivalent comparison but still a intresting watch

baudaux1y ago

GNU hello program is quite complex as well

https://www.gnu.org/software/hello/

mseepgood1y ago

The source code: https://git.savannah.gnu.org/cgit/hello.git/tree/src/hello.c

Bengalilol1y ago

Just saying hi ! (and thanks for the read, got me back to the old days of ASM One and SEKA on Amiga, trying to clean up my memory dust pile)

snvzz1y ago

I'd look at asmtwo and asmpro today.

mseepgood1y ago

Could have continued with following the syscall code in the kernel.

deepsun1y ago

Would be cool to see the same for statically-linked HelloWorld.

ben_w1y ago

I've been thinking recently, we might have too many layers of abstraction in our systems.

And I didn't even know about most of the ones in this post.

nomel1y ago

> And I didn't even know about most of the ones in this post.

I can't comment on web development...

mttpgn1y ago

ben_w1y ago

> But the tradeoffs of maintainability, legibility, and developer quality-of-life

Are in fact the things I think have become worse from the abstractions.

Well, the recent abstractions. I like the ones that were widespread until about 2018 or so.

2 more replies

freeone30001y ago

Do some embedded work, even at the c-for-esp32 layer, and you’ll start to see the benefits of having an OS.

xandrius1y ago

Yep, it seems common for people who haven't actually experienced something from the past to wish it was still like then, not understanding why people decided the trade was worth it.

You might save some bytes but boy how annoying and complicated it can get.

2 more replies

ctrw1y ago

I can see the benefits of an os. I fail to see the benefits of a browser pretending to be another os on top of the os.

kenneth1y ago

And yet, 99% of the people I've ever seen in the industry have no idea how any of the code they write works.

I used to ask a simple interview question, I wanted to see if potential hires could explain what a pointer or memory was. Few ever could.

gglitch1y ago

Like the man said, “If you wish to make an apple pie from scratch, you must first invent the universe.”

bitwize1y ago

Space is filled with a network of wormholes... https://www.youtube.com/watch?v=zSgiXGELjbc

Retr0id1y ago

> All modern big and important programs that make a computer work are written this way [AoT-compiled to native code].

This is the conventional wisdom, but it's increasingly not true.

nomel1y ago

I think it’s definitely true if taken literally, “make a computer work”. From Python interpreter, to browsers, to OS, and most optimized libraries.

nektro1y ago

loved it! except the final conclusion

vivzkestrel1y ago

there was another post like this on HN that was actually illustrated in a better manner, anyone got links to it?

j / k navigate · click thread line to collapse