I use a bash script as my BROWSER which calls another bash script to launch or communicate with my browser that I run inside a container. The script that my BROWSER script calls has some debug output that it prints to stderr.
I use mutt as my email client and urlscan [0] to open URLs inside emails. Urlscan looks at my BROWSER environment variable and thus calls my script to open whatever URL I target. Some time recently, the urlscan author decided to improve the UX by hiding stderr so that it wouldn’t pollute the view, and so attempted to pipe it to `/dev/null`. I guess their original code to do this wasn’t quite correct and it ended up closing the child processes’ stderr.*
I generally use `set -e` (errexit) because I want my scripts to fail if any command fails (I consider that after an unhandled failure the script’s behavior is undefined, some other people disagree and say you should never use `set -e` outside of development, but I digress). My BROWSER scripts are no exception.
While my scripts handle non-zero returns for most things that can go wrong, I never considered that writing log messages to stdout or stderr might fail. But it did, which caused the script to die before it was able to launch my browser. For a few weeks I wasn’t able to use urlscan to open links. I was too lazy to figure out what was wrong, and when I did it took me a while because I looked into every possibility except this one.
Luckily this wasn’t a production app. But I know now it could just as feasibly happen in production, too.
I opened an issue[1] and it was fixed very quickly. I love open source!
*No disrespect to urlscan, it’s an awesome tool and bugs happen to all of us!
I'm not sure return codes are the source of your troubles...
It sounds like our sensibilities are similar regarding CLI and tool usage. This is a side note, but as someone who used to use "Bash strict mode" in all my scripts, I'm now a bit bearish on `set -e`, mainly due to the subtle caveats. If you're interested, the link below has a nice (and long) list of potentially surprising errexit gotchas:
https://mywiki.wooledge.org/BashFAQ/105
(The list begins below the anecdote.)
I'm really interested. What are their arguments? And how do they handle errors?
I think the idea is you use set -e during development to find where you should catch errors, but in production you may want it off to reduce strange side-effects (or explicitly check for success in the way you expect; so not that the command returned 0 but that the file it made exists and is the right length, etc).
The “Hello world” program simply calls an API to a text interface: a simple call to a simple interface that is expected to be ever-present. I don’t see any bug there. It won’t work if such an interface isn’t available, is blocked, or doesn’t exist. It won’t work on my coffee grinder or on my screwdriver. It won’t work on my Arduino either, because there is no text interface there.
Of course, one could argue that a user might expect you to handle that error. That’s all about contracts and expectations. How should I deal with that? Is the “Hello world” message so important that the highest escalated scenario should be painted across the sky? I can imagine an awkward social game where we throw each other obscure challenges and call them bugs.
It’s nitpicking that even such simple code might fail, and I get it. It will also fail on OOM, on faulty hardware, or if the number of processes on the machine hits the limit. Maybe some joker replaced the bindings and it went straight to a 3D printer which is out of material? _My expectations_ were higher based on the title.
Now allow me to excuse myself, I need to write an e-mail to my keyboard manufacturer because it seems like it has a bug which prevents it from working when slightly covered in liquid coffee.
[1]: http://jroweboy.github.io/c/asm/2015/01/26/when-is-main-not-...
I still agree with the author though. This is a serious matter and it seems most of the time the vast amount of complexity that exists in seemingly simple functionality is ignored.
Hello world is not "simply" calling a text interface API. It is asking the operating system to write data somewhere. I/O is exactly where "simple" programs meet the real world where useful things happen and it's also where things often get ugly.
Here's all the stuff people need to think about in order to handle the many possible results of a single write system call on Linux:
/* Note: the libc write() wrapper returns -1 and sets errno; the
   negative-errno convention only applies to the raw syscall below
   libc. A partial write (0 <= result < count) also needs handling. */
long result = write(1, "Hello", sizeof("Hello") - 1);
if (result >= 0 && result < (long)(sizeof("Hello") - 1)) {
    /* Partial write: only some bytes were written; the rest must be retried. */
} else if (result == -1) {
    switch (errno) {
    case EAGAIN: /* == EWOULDBLOCK on Linux */
        /* Occurs only if opened with O_NONBLOCK. */
        break;
    case EBADF:
        /* File descriptor is invalid or wasn't opened for writing. */
        break;
    case EDQUOT:
        /* User's disk quota reached. */
        break;
    case EFAULT:
        /* Buffer points outside accessible address space. */
        break;
    case EFBIG:
        /* Maximum file size reached. */
        break;
    case EINTR:
        /* Write interrupted by a signal before any bytes were written. */
        break;
    case EINVAL:
        /* File descriptor unsuitable for writing. */
        break;
    case EIO:
        /* General output error. */
        break;
    case ENOSPC:
        /* No space available on device. */
        break;
    case EPERM:
        /* A file seal prevented the file from being written. */
        break;
    case EPIPE:
        /* The pipe or socket being written to was closed;
           this also raises SIGPIPE unless it is blocked or ignored. */
        break;
    }
}
Some of these are unlikely. Some of these are irrelevant. Some of these are very important. Virtually all of them seem to be routinely ignored, especially in text APIs.

Also I always start to mildly panic in such cases, as lots of software corrupts its on-disk state more when the hard drive is full than any segfault, OOM-kill or hard shutdown is able to. I can understand and empathize on how this happens from a software development perspective, but objectively speaking "our entire field is bad at what we do, and if you rely on us, everybody will die". ( https://xkcd.com/2030/ )
Both the Linux man pages and SUS specify some set of possible error situations, but not all of them. In the man pages' case the set is not at all fixed, is subject to change, and often does not contain some of the more obscure error states. The SUS "Errors" sections are explicitly not meant to be complete, and the OS can return additional errno values; the OS can even treat some of the error cases as undefined behavior and not return any error code at all (notable example: doing anything to an already-joined pthread_t on Linux, which is undefined and does not return ESRCH).
If the requirements of a hello world program include accounting for all error boundaries of the host system, then I have yet to see them written down, but I would invite anyone to provide them.
The parent comment has made a start in this regard.
> [Hello, world] is the big hurdle. To leap over it you have to be able to
> create the program text somewhere, compile it successfully, load it, run
> it, and find out where your output went.
Those are the goals of "Hello, world!". Create the program, compile it, load it, run it, and find the output. Things that are not goals of "Hello, world!" are handling user input, reusable components (functions), network access, error handling, etc.

It's fine that the error is not handled, just as it is fine that the output went to stdout. Error handling was not a goal of the program.
printf("Hello, World!\n")
Is me saying: "Do a write syscall to stdout. I don't care what the return value is; I don't care if the flush is successful if stdout happens to be buffered." If that is what I want to do, i.e. what the program is specified to do, then it didn't fail.

Modern languages do this by default, using exceptions, or force you to check return values using Result<> or the like.
Even in C, when compiled through some more strict linter, this would fail because ignored return value should be prefixed with (void).
In either case I think the main takeaway from the article is that a language where even hello world has such pitfalls, isn’t suitable, given the many other better options today.
IMHO, it doesn't.
hello.c is written in a way that makes it very clear that the program doesn't care about error conditions in any way, shape, or form: the return value of printf is ignored, the output isn't flushed, the return of flushing isn't checked, no one looks at errno; ...so anything that could go wrong will go unreported, unless it's something the OS can see (segfault, permission, etc.)
If I expect a program to do something (eg. handle IO errors) that its code says very clearly that it doesn't, that's not the programs fault.
Is there no such thing as a bug then? The program does what the code says so every "misbehavior" and crash is expected behavior.
AFAIK, hello.c doesn't have a spec, so the code is the spec. If I am using it, I have to read the code to know what it does.
If there isn't one however, then the code is all there is.
Sure, we could update it:
// hello_v2.0.c
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(void) {
    /* errno is only meaningful after a call has reported failure,
       so check the return values of printf and fflush directly. */
    if (printf("Hello, World!\n") < 0 || fflush(stdout) != 0) {
        fprintf(stderr, "error: %s\n", strerror(errno));
        return 1;
    }
    return 0;
}
But now we have different libraries, a multitude of external identifiers, control structures, blocks, return values, the concept of buffered streams, the concept of file descriptors, the printf formatting language, program return values, boolean logic & conditionals, ...

To someone who is already experienced in another language, that may not seem like a big deal, and it isn't, but to someone who encounters the language for the first time, this is heavy stuff.
> Unlike other output streams, a PrintStream never throws an IOException; instead, exceptional situations merely set an internal flag that can be tested via the checkError method.
So the correct Hello World would be:
System.out.println("Hello World!");
if (System.out.checkError()) throw new IOException();
While the behaviour of PrintStream cannot be changed (it goes back to Java 1.0, and I'm guessing that the intention was not to require handling exceptions when writing messages to the standard output), adding a method to obtain the underlying, unwrapped OutputStream might be an idea worth considering, as it would allow writing to the standard output just like to any file.

[1]: https://docs.oracle.com/en/java/javase/17/docs/api/java.base...
System.out.withErrorChecks().println("Hello World!");
But we can't change the behaviour of the existing PrintStream.

If my program writes to the standard output, but you choose to redirect the pipe to a different location, is it my program’s responsibility to check what happens to the bytes AFTER the pipe?
After all: my program did output everything as expected. The part which fucked up was not part of my program.
I can see why some projects decide to not handle this bug.
The output doesn't go into a pipe however, the output goes to /dev/full. Redirection happens before the process is started, so the program is, in fact, writing directly to a file descriptor that returns an error upon write.
I think this is pretty cut and dried - the failure is inside your process’s address space and the programmer error is that you haven’t handled a reported error.
>> what happens to the bytes AFTER the pipe?
There isn’t a pipe involved here; when your process was created, its stdout was connected to /dev/full, then your program began executing.
Problem is, the error condition is not even that obvious. I tried it, and printf() will happily return the number of bytes written, even when redirecting stdout to /dev/full.
I am not 100% sure, but I think this has to do with the fact that printf uses buffered io, and writing the bytes to the buffer will work. It's only when the buffer is flushed that this will become a problem, but this would need to be handled in the code to show an error message.
Plus the whole point of STDOUT is that it is a file. So it shouldn’t change the developers mental model if that file happens to be a pseudo TTY, a pipe or a traditional persistent file system object. This flexibility is one of the core principles of UNIX and it’s what makes the POSIX command line as flexible as it is.
The fact that there's redirection is a ... misdirection. The redirection is only used to proxy a real-life case that can happen even when no redirection is taking place.
You could do all kinds of things that would cause hello world to "fail". A broken monitor (or even one unplugged) wouldn't show "hello world" or give any indication of an error too, but it's hardly the codes fault. The code does what it's supposed to and ignores all kinds of other things that could go horribly wrong. That's not really a bug, just a known and expected limitation of the program's scope.
No, but it's not "after". Rather, it's your responsibility to handle backpressure by ensuring the bytes were written to the pipe successfully in the first place.
This isn't just about the filesystem being full btw. If you imagine a command like ./foo.py | head -n 10, it only makes sense for the 'head' command to close the pipe when it's done, and foo.py should be able to detect this and stop printing any more output. (This is especially important if you consider that foo.py might produce infinite lines of output, like the 'yes' program.)
I would argue this is not necessarily even an error from a user standpoint, so the return code from foo.py should still be zero in many cases: a pipe-is-closed error just means the consumer simply didn't want the rest of the output, which is fine [1], whereas an out-of-disk-space error is probably really an error. Handling these robustly is actually difficult though, because (a) you'd need to figure out why printf() failed (so that you can treat different failures differently, which is painful), and (b) you need to make sure any side effects in the program flow up to the printf() are semantically correct "prefixes" of the overall side effect, meaning that you'd need to pay careful attention to where you printf(). (Practically speaking, this makes it difficult to even have side effects that respect this, but that's an inherent problem/limitation of the pipeline model...)
FWIW, I would be very curious if anyone has formalized all of these nuances of the pipeline model and come up with a robust & systematic way to handle them. It seems like a complicated problem to me. To give just one example of a problem that I'm thinking of: should stderr and stdout behave the same way with respect to "pipe is closed"? e.g. should the program terminate if either is closed, or if both are closed? The answer is probably "it depends", but on what exactly? What if they're redirected externally? What if they're redirected internally? Is there a pattern you can follow to get it right most of the time? There's a lot of room for analysis of the issues that can come up, especially when you throw buffering/threading/etc. into the mix...
[1] Or maybe it isn't. Maybe the output (say, some archive format like ZIP) has a footer that needs to be read first, and it would be corrupt otherwise. Or maybe that's fine anyway, because the consumer should already understand you're outputting a ZIP, and it's on them if they want partial output. As always, "it depends". But I think a premature stdout closure is usually best treated as not-an-error.
The usual way of handling this is by not (explicitly) handling it. Writes to a closed pipe are special, they do not normally fail with a status that the program then all too often ignores, they result in a SIGPIPE signal that defaults to killing the process. Extra steps are needed to not kill the process. No other kind of write error gets this special treatment that I am aware of.
The pipe is your standard output. Your very program is created with the pipe as its stdout.
> After all: my program did output everything as expected. The part which fucked up was not part of my program.
But you are wrong, your program did not output everything as expected, and it failed to report that information.
I find more modern languages so much less exhausting to use to write correct code.
Modern languages do catch more programmer errors than C/C++, but the more general point is that there are "edge cases" (redirecting to a file isn't an edge case) that developers need to consider that aren't magically caught, and understanding the language you use well enough so as not to write those bugs is important.
The more experience I get as a dev the more I've come to understand that building the functionality required in a feature is actually a very small part of the job. The "happy path" where things go right is often trivial to code. The complexity and effort lies in making sure things don't break when the code is used in a way I didn't anticipate. Essentially experience means anticipating more ways things can go wrong. This article is a good example of that.
But GP’s point is that modern languages can surface those issues and edge cases, and try to behave somewhat sensibly, but even sometimes “magically” report the edge cases in question.
That’s one of the things which is very enjoyable (though sometimes frustrating) in Rust, the APIs were (mostly) designed such that you must acknowledge all possible errors somehow, either handling it or explicitly suppressing it.
The hidden costs are enormous and to this day still not very well accounted for.
There's no garbage collection/reference counting/etc. going on in the background. Objects aren't going to be moved around unless you explicitly move them around (Enjoy your heap fragmentation!). In C, you don't even get exceptions.
Of course, this creates TONS of foot-guns. Buffer overflows, unchecked errors, memory leaks, etc. A modern language won't have these, except for memory leaks, but they're much less likely to happen in trivial to moderate complexity apps.
A modern language could automatically throw an exception if the string cannot be completely written to standard output.
But that has not necessarily helped. The program now has a surprising hidden behavior; it has a way of terminating with a failed status that is not immediately obvious.
If it is used in a script, that could bite someone.
In Unix, there is such an exception mechanism for disconnected pipes: the SIGPIPE error. That can be a nuisance and gets disabled in some programs.
Doing something similar would be a good addition to any non-trivial C program that emits output on stdout and stderr.
In practice I haven't really seen a reason to exhaustively check every write to stdout/stderr as long as standard IO is used, and fflush() etc. is checked.
A much more common pitfall is when dealing with file I/O and forgetting to check the return value of close(). In my experience it's the most common case where code that tries to get it right actually gets it wrong; I've even seen code that checked the return value of open(), write() and fsync(), but forgot about the return value of close() before that fsync(). A close() will fail e.g. if the disk is full.
A while ago, I started learning C in my personal time and am curious about this issue. If `close()` fails, I’m guessing there’s not much else the program can do – other than print a message to inform the user (as in the highlighted git code). Also, I would have thought that calling `fsync()` on a file descriptor would also return an error status if the filesystem/block device is full.
This is really more about POSIX and FS semantics than C (although ultimately you end up using the C ABI or kernel system calls, which are closer to C than e.g. Python).
POSIX gives implementations enough leeway to have close() and fsync() do pretty much whatever they want as far as who returns what error goes, as long as not returning an error means your data made it to storage.
But in practice close() is typically mapped 1:1 to the file itself, while fsync() is many:1 (even though both take an "fd"). I.e. many implementations (including the common consumer OSs like Windows, OSX & Linux) have some notion of unrelated outstanding I/O calls being "flushed" by the first process to call fsync().
IIRC on ext3 fsync() was pretty much equivalent to sync(), i.e. it would sync all outstanding I/O writes. I believe that at least Windows and OSX have a notion of doing something similar, but for all outstanding writes to a "leaf" on the filesystem, i.e. an fsync() to a file in a directory will sync all outstanding I/O in that directory implicitly.
Of course none of that is anything you can rely on under POSIX, where you not only have to fsync() each and every file you write, but must not forget to also flush the relevant directory metadata too.
All of which is to say that you might be out of space when close() happens, but by the time you'd fsync() you may no longer be out of space, consider a write filling up the disk and something that frees up data on disk happening concurrently.
If you know your OS and FS semantics you can often get huge speedups by leaning into more lazily syncing data to disk, which depending on your program may be safe, e.g. you write 100 files, fsync() the last one, and know the OS/FS syncs the other 99 implicitly.
But none of that is portable, and you might start losing data on another OS or FS. The only thing that's portable is exhaustively checking errors after every system call, and acting appropriately.
The GNU Hello program produces a familiar, friendly greeting. Yes, this is another implementation of the classic program that prints “Hello, world!” when you run it.
However, unlike the minimal version often seen, GNU Hello processes its argument list to modify its behavior, supports greetings in many languages, and so on. The primary purpose of GNU Hello is to demonstrate how to write other programs that do these things; it serves as a model for GNU coding standards and GNU maintainer practices.
https://git.savannah.gnu.org/cgit/hello.git/tree/src/hello.c
Here's the comment:
/* Even exiting has subtleties. On exit, if any writes failed, change
the exit status. The /dev/full device on GNU/Linux can be used for
testing; for instance, hello >/dev/full should exit unsuccessfully.
This is implemented in the Gnulib module "closeout". */

https://www.gnu.org/ghm/2011/paris/slides/jim-meyering-goodb...
It does show that we take such examples a bit too literally: our feeble minds don't consider what's missing, until it's too late. That's a didactic problem. It only matters to certain kinds of software, and when we teach many people to program, most of them won't go beyond a few small programs. But perhaps the "second programming course" should focus a bit less on OOP and introduce error handling.
I’d argue there is little benefit in the latter. Particularly these days where the Hello World of most imperative languages look vaguely similar. Maybe back when LISP, FORTRAN and ALGOL were common it was more useful showing a representation of the kind of syntax one should expect. But that isn’t the case any more.
Plus given the risk of bugs becoming production issues or, worse, security vulnerabilities and the ease and prevalence of which developers now copy and paste code, I think there is now a greater responsibility for examples to make fewer assumptions. Even if that example is just Hello World.
There's a huge benefit in having a program that verifies you have set up the programming environment successfully and can build and execute your programs. Far more than the didactic benefit of any "Hello World" program.
Handling terminal output is just an extra nice-to-have at that point, and one convenient way to verify your tools are working. Correct error handling is definitely out of scope.
It's not, though.
This helloworld is not safe to use as part of something bigger. Like:
echo header > gen.txt && ./helloworld >> gen.txt && ./upload_to_prod gen.txt
That will upload a partial file to prod, if there's any write error.

> It's not meant to be part of a shell script
You don't know that. And brittle pieces like this is absolutely not an uncommon source of bugs.
The first piece of C code in the introduction section was meant as production software? I've checked it: that section mentions typing "a.out" in the UNIX shell to see what happens.
Also, what makes the status code handling special compared to, say:
- assuming the English language is the preferred language instead of asking the OS for the user's preference
- assuming that the output has to be ASCII (or compatible) instead of something like UTF-16
There seems to be a weird obsession with the program's status code over anything else in this whole comment section, and it seems to me that the only reason for that is that back in the stone age of computing, the status code was the only thing that got standardized, while locale and encoding didn't, so properly supporting the latter is hard and therefore assumed to be less important.
So to really do hello world in C right, in addition to fflush, you also need to check the return value from puts. I've never seen any C tutorial do that though.
To me this is an indication that you need to know the context in which the program gets run, and its purpose in that context. Or you'd have to specify every edge case, but I've never seen that really work in practice.
int main(void) { return puts("Hello, new world!") == EOF; /* return 0 unless a rare I/O error occurs */ }
Sounds like the program failed its objective, greeting the world. And thus imho shouldn't return 0.
It seems like every single IO thing I can think of can have a relevant error, regardless of whether it's file-system related, network, or anything else.
if (fflush(stdout) != 0 || ferror(stdout) != 0)
{
perror("stdout");
return EXIT_FAILURE;
}
at the end of the program. The same should be done for stderr as well.

In GNU programs you can use atexit(close_stdout) to do this automatically.
The article cites an example of writing a YAML file and the dangers of it being half-written. Well, you could imagine outputting a file all in one printf() with lots of %s's in the format string. Some get written, but not all. If printf() decides to return an error message, retrying the printf() later on (after deleting another file, say), will corrupt the data because you'll be duplicating some of the output. But if printf() just returned the number of bytes written, your program will silently miss the error.
So does 'Hello World\n' need to check that printf() succeeded, or does it actually need to go further and check that printf() returned 12? (or is it 13, for \r\n ?) I don't think there's any way to really safely use the function in real life.
> a negative value if an output error occurred
So in your case that's an error and printf returns a negative value. But yes, how many bytes were written is a lost information.
No. According to fprintf(3), when the call succeeds it returns the number of printed characters. If it fails (for example, if it could only print part of the string) then it returns a negative value.
The number of printed characters is useful to know how much space was used on the output file, not to check for success. Success is indicated by a non-negative return value.
Bzzt, no. You can't say that without knowing what the program's requirements are.
Blindly "fixing" a program to indicate failure due to not being able to write to standard output could break something.
Maybe the output is just a diagnostic that's not important, but some other program reacts to the failed status, causing an issue.
Also, if a program produces output with a well-defined syntax, then the termination status may be superfluous; the truncation of the output can be detected by virtue of that syntax being incomplete.
E.g. JSON hello world fragment:
puts("{\"hello\":\"world\"}");
return 0;
if something is picking up the output and parsing it as JSON, it can deduce from a failed parse that the program didn't complete, rather than going by termination status.

This is bad advice. Consider output that might be truncated but can't be detected (mentioned in the article).
The exit status is the only reliable way to detect failures (unless you have a separate communication channel and send a final success message).
I didn't communicate that clearly: syntax can be "well-defined" yet truncatable. I meant some kind of syntax that is invalid if any suffix is missing, including the entire message, or else an object of an unexpected type is produced.
(In the case of JSON, valid JSON could be output which is truncatable, like 3.14 versus 3.14159. If the output is documented and expected to be a dictionary, we declare failure if a number emerges.)
The author covers this (or rather, the possibility that truncation can not be detected).
In the case of file I/O, we do not know that the bits have actually gone to the storage device. A military-grade hello world has to perform an fsync. I think it also requires the right storage hardware to be entirely reliable.
If stdout happens to be a TCP socket, then all we know from a successful flush and close is that the data has gone into the network stack, not that the other side has received it. We need an end-to-end application level ack. (Even just a two-way orderly shutdown: after writing hello, half-close the socket. Then read from it until EOF. If the read fails, the connection was broken and it cannot be assumed that the hello had been received.)
This issue is just a facet of a more general problem: if the goal of the hello world program is to communicate its message to some destination, the only way to be sure is to obtain an acknowledgement from that destination: communication must be validated end-to-end, in other words. If you rely on any success signal of an intermediate agent, you don't have end-to-end validation of success.
The super-robust requirements for hello world therefore call for a protocol: something like this:
puts("Hello, world!");
puts("message received OK? [y/n]");
char buffer[16];
return (fgets(buffer, sizeof buffer, stdin) != NULL && buffer[0] == 'y')
? EXIT_SUCCESS : EXIT_FAILURE;
Now we can detect failures like there being no user present at the console who is reading the message, or their monitor not working so they can't read the question.

We can now correctly detect this case of not being able to deliver hello, world, converting it to a failed status:
$ ./hello < /dev/null > /dev/null
We can still be lied to, but there is strong justification in regarding that as not our problem:

$ yes | ./hello > /dev/null
We cannot get away from requiring syntax, because the presence of a protocol gives rise to it; the destination has to be able to tell somehow when it has received all of the data, so it can acknowledge it.

A super-reliable hello world also must not take data integrity for granted; the message should include some kind of checksum to reduce the likelihood of corrupt communication going undetected.
The "program's requirements" can in theory be "to be buggy unusable piece of shit". But when we speak, we don't need to consider that use case.
#[must_use] in Rust is the right idea: Rust doesn't automatically do anything --- there is no policy foisted upon the programmer --- but it will reliably force the programmer to do something about the error explicitly.
https://www.gnu.org/ghm/2011/paris/slides/jim-meyering-goodb...
Most Go in the wild is doing way more than a typical *nix binary, so the use case differs.
If you want a resilient system, you don't die on print and log failures.
Checking the result of log and print is very tedious and not useful most of the time.
begin
WriteLn('Hello World!');
end.
definitely has the bug, at least with the fpc implementation. On the other hand, explicitly trying to write to /dev/full from the Pascal source triggers a beautiful message:

Runtime error 101 at $0000000000401104

SYNOPSIS
int dup2(int oldfd, int newfd);
NOTES:
If newfd was open, any errors that would have been reported at close(2) time are lost.
If this is of concern, then the correct approach is not to close newfd before calling dup2(),
because of the race condition described above.
Instead, code something like the following could be used:
    /* Obtain a duplicate of 'newfd' that can subsequently
       be used to check for close() errors; an EBADF error
       means that 'newfd' was not open. */
    tmpfd = dup(newfd);
    if (tmpfd == -1 && errno != EBADF) {
        /* Handle unexpected dup() error */
    }

    /* Atomically duplicate 'oldfd' on 'newfd' */
    if (dup2(oldfd, newfd) == -1) {
        /* Handle dup2() error */
    }

    /* Now check for close() errors on the file originally
       referred to by 'newfd' */
    if (tmpfd != -1) {
        if (close(tmpfd) == -1) {
            /* Handle errors from close */
        }
    }

I find the argument that the code obviously ignores the error so that's obviously the program's intent to be completely spurious. The code "obviously" intends to print the string, too, and yet in some cases, it doesn't actually do that. It's clearly a bug. I don't think it's particularly useful to harp on this bug in the most introductory program ever, but it's definitely a bug.
If `puts` were to be used for debug messages, it might be right not to fail so as to not disturb the rest of the program. If the primary purpose is to greet the world, then we might expect it to signal the failure. But each creator or user might have their own expected behaviors.
If a user expects different behavior, then perhaps it is a feature request:
> There's no difference between a bug and a feature request from the user's perspective. (https://blog.codinghorror.com/thats-not-a-bug-its-a-feature-...)
The question is how the behavior can be made more explicit. I think it's a reasonable default to make programs fail often and early. If some failure can be safely ignored, it can always be implemented as an (explicit) feature.
1. The Node.js result is outdated. I ran the hello world code below on Node.js v14.15.1 on macOS and it reported exit code 1 correctly:
// testlog.js
console.log('hello world')
process.exit(0)
// bash
$ node -v
v14.15.1
$ node testlog.js > /dev/full
-bash: /dev/full: Operation not permitted
$ echo $?
1
2. Node.js is not a language. JavaScript is a language, and Node.js is a JavaScript runtime environment that runs on the V8 engine and executes JavaScript code outside a web browser.

3. The JavaScript result is missing from the table, even though it is the most popular language on GitHub: https://octoverse.github.com/#top-languages-over-the-years
Now I'm curious about another interesting question. Should bash be the one that handles the error and exit code in this case? Since it seems to be responsible for handling the piping operation.
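For what it's worth, the shell's role in `./hello > /dev/full` ends before the program even runs: it open()s the target, dup2()s it onto fd 1 in the child, and execs. The open() of /dev/full succeeds; ENOSPC only surfaces on the later write(), which happens inside the child, so the shell never sees the error. A rough sketch of that dance (the `run_redirected` helper is invented for illustration):

```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Approximately what a shell does for `cmd > outfile`: redirect the
 * child's stdout before exec, then report only the child's own exit
 * status.  Any write errors are entirely the child's to detect. */
int run_redirected(const char *path, char *const argv[], const char *outfile)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        int fd = open(outfile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1)
            _exit(127);
        dup2(fd, STDOUT_FILENO);  /* redirection happens here... */
        close(fd);
        execv(path, argv);        /* ...before the program even starts */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

So the only thing the shell can report is whatever exit status the child chooses to return, which is why the child has to check its own writes.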
This criticism is the wrong way around. All of the author's "languages" are actually language implementations like NodeJS. You can tell because he produced the results by running the code, rather than by reading a spec.
So what I'm proposing is to put JavaScript in the language column (like other languages such as Java) and note the usage of Node.js as the implementation in the second column together with version (similar to Java -> openjdk 11.0.11 2021-04-20).
Would that make sense?
internal/fs/utils.js:332
throw err;
^
Error: ENOSPC: no space left on device, write
at writeSync (fs.js:736:3)

> Its main output is the implementation-defined side effect of printing the result to the console.
~ >>> julia -e 'print("Hello world")' > /dev/full
error in running finalizer: Base.SystemError(prefix="close", errnum=28, extrainfo=nothing)
#systemerror#69 at ./error.jl:174
systemerror##kw at ./error.jl:174
systemerror##kw at ./error.jl:174
#systemerror#68 at ./error.jl:173 [inlined]
systemerror at ./error.jl:173 [inlined]
close at ./iostream.jl:63
⋮
~ >>> echo $?
0
`errnum=28` apparently refers to the ENOSPC error: "No space left on device" as defined by POSIX.1; so the information is there, even if not in the most presentable form.

Or maybe this is about the global stdout object. With buffering enabled (by default), printf will not throw any error; fflush would. But a final fflush is done implicitly at the end. This is again all well documented, so it is still not really a bug, but maybe just bad language design.
I'm not exactly sure what C++ code was used. If this was just the same C code, then the same thing applies. And iostream just behaves exactly as documented.
$ php hello.php > /dev/full
$ echo $?
255
It doesn't exactly print an error, but at least it returns something non-zero.

If we accept the idea that the function (non-coding use of the word) of a language's indication of success should indicate success (or its absence) of a piece of code, then surely the creators of the languages should make it do just that. That's their job, right? What am I missing?
If the requirements were:

"Print Hello World and indicate if it succeeded or not"

then it has a bug. If the requirements were:

"Print Hello World, then return 0"

it's working as intended.
I'd even go so far as to say that print(); return 0; should always return 0, it would be weird for such a program to ever return anything other than 0 (where would that return come from?).
Your second point might be fine, except that it doesn't describe the API that languages actually use to print. For sure, it's trivial to implement the policy you describe, but suggesting that everyone always needs that policy is rather limiting and makes light of the real bugs that failure to handle errors actually results in.
If my program calls your Hello World program, it expects it to print Hello World. That's basically the point of the program.
If your program doesn't print Hello World for whatever reason, of course you don't need to handle the error if it wasn't specified. But it's probably a bad thing (call it a bug or not) to exit 0, which the caller will interpret as "Hello World has just been printed successfully"; I can go on and print ", John".
I agree it's probably not going to be in the requirements, and the world will probably not collapse if you don't handle the error, but it is without doubt an idiom required by most OSes to ensure programs work normally.
You can also create orphan processes if it's needed by your requirements, but it's probably a bug or a hole in your requirements. Because at some point, non idiomatic programs will be used in situations where they will be creating issues. And we are talking about issues that are very hard to even spot.
Those "non requirements" are exactly how you lately discover that you have no logs from the last two weeks or that your backups aren't complete.
It's not a requirement, it's just hygiene.
tbf, I'm describing what an ideal world should look like, and I have probably written those sorts of bugs myself. Writing idiomatic code is hard and no one is to blame for not doing it perfectly; I just think it's an ideal to aim for.
The point of hello.c is to serve as demonstration to students of the language of what a very basic program looks like, and how to use the toolchain to get it to run.
That's it, that's the requirements specified.
On the other hand, it's kind of depressing that I can't even write to stdout without needing to check for errors. And what are you going to do if that fails? Write to stderr? What if that fails because the program was run with `2>&1` ?
$ ls /dev/null /dev/full
ls: /dev/full: No such file or directory
/dev/null
I guess in theory, you can imitate `/dev/full` by other means.

Yes it is. And it specifies the OS as well.
https://gist.github.com/koral--/12a6cdda22ffbd82f28ecc93e0b5...
int main(void)
{
char the_terminal[] = "Hello World!\n";
return 0;
}

Unit testing verifies it does what it is supposed to do ideally, and all other tests verify it can do it in non-ideal environments.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    if (puts("Hello, World!") != EOF) {
        return EXIT_SUCCESS;
    } else {
        return EXIT_FAILURE;
    }
}
What worked for me initially was the POSIX write() function:
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
int status;
status = write(1, "Hello World!\n", 13);
if (status < 0) { return EXIT_FAILURE; }
return EXIT_SUCCESS;
}
-----

As someone else commented, fflush() gives the desired error response.
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int status;
puts("Hello World!");
status = fflush(stdout);
if (status < 0) { return EXIT_FAILURE; }
return EXIT_SUCCESS;
}
-----

andreyv probably has the best alternative[1], which is checking fflush() and ferror() at the program's end and calling perror(). It's better because it outputs an actual error message on the current terminal, and you don't need to write a special error checking wrapper.
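A minimal sketch of that approach, factored into a helper to call once at the end of main (the `finish_stdout` name is mine):

```c
#include <stdio.h>
#include <stdlib.h>

/* Call once before returning from main: flush stdout and surface any
 * write error that buffering deferred, with a perror() diagnostic. */
int finish_stdout(void)
{
    if (fflush(stdout) == EOF || ferror(stdout)) {
        perror("stdout");
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}
```

main then ends with `puts("Hello World!"); return finish_stdout();`, and a run against /dev/full prints a diagnostic on stderr and exits nonzero.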
On my test system (Ubuntu 21.10 on x86_64) the puts() call never fails.
I switched to a raw write() and that successfully catches it, by returning -1 when output is redirected to /dev/full.
Quite interesting, actually.
(And silently returning non-zero would be bad anyway.)
I'm going to have to go back over all the print statements I've ever written now
Would any of the languages report an error?
Maybe they all have bugs.
$ printf '#include <stdio.h>\nint main() { return printf("Hello world!\\n") && fflush(stdout); }\n' | cc -xc - && ./a.out > /dev/full && echo "Success\!"

return (printf("Hello world!\n") < 0) && fflush(stdout);
printf returns the “number of characters transmitted to the output stream or negative value if an output error or an encoding error (for string and character conversion specifiers) occurred”, so it won't ever return zero for that call (https://en.cppreference.com/w/c/io/fprintf)

I also think this optimally should do something like
int x = printf("Hello world!\n");
if(x<0) return x; // maybe fflush here, too, ignoring errors?
return fflush(stdout);
Logging an error message to stderr should be considered, too. I would ignore any errors from that, but attempting to syslog those or to write them to the console could be a better choice.

Could have used this knowledge in the past...
10 PRINT "Hell world"
(also, printf is buffered, so close or flush your output)
0. https://www.grammar-monster.com/lessons/commas_with_vocative...