> Bash forgot to reset errno before the call. For about 30 years, no one noticed
I have to say, this part of the POSIX API is maddening!
99% of the time, you don't need to set errno = 0 before making a call. You check for a non-zero return, and only then look at errno.
But SOMETIMES you need to set errno = 0, because in this case readdir() returns NULL on both error and EOF.
I actually didn't realize this before working on https://oils.pub/
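For anyone who hasn't run into it, here is a minimal sketch of the readdir() pattern described above. The function name count_dir_entries is made up for illustration; the only real API used is opendir()/readdir()/closedir() plus errno.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: count the entries in `path`.
 * Returns the count, or -1 with errno set on failure. */
long count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;              /* opendir() already set errno */

    long count = 0;
    for (;;) {
        errno = 0;              /* reset before EVERY readdir() call */
        struct dirent *ent = readdir(dir);
        if (ent == NULL)
            break;              /* either end of directory or an error */
        count++;
    }

    /* NULL with errno still 0 means EOF; anything else is a real error.
     * Save errno first because closedir() may overwrite it. */
    int saved = errno;
    closedir(dir);
    if (saved != 0) {
        errno = saved;
        return -1;
    }
    return count;
}
```

Without the reset, a stale errno left over from some earlier, unrelated call would make a normal end-of-directory look like a failure.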
---
And it should go without saying: Oils simply uses libc - we don't need to support systems with a broken getcwd()!
Although a funny thing is that I just fixed a bug related to $PWD that AT&T ksh (the original shell, that bash is based on) hasn't fixed for 30+ years too!
(and I didn't realize it was still maintained)
https://www.illumos.org/issues/17442
https://github.com/oils-for-unix/oils/issues/2058
There is a subtle issue with respect to:
1) "trusting" the $PWD value you inherit from another process
2) Respecting symlinks - this is the reason the shell can't just call getcwd() !
if (*p != '/' || stat(p, &st1) || stat(".", &st2) ||
st1.st_dev != st2.st_dev || st1.st_ino != st2.st_ino)
p = 0;
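Spelled out as a self-contained sketch, the check quoted above looks roughly like this. The names pwd_is_trustworthy and initial_pwd are hypothetical, and this is an illustration of the idea rather than the code of any particular shell.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `pwd` is an absolute path that names
 * the same directory as ".", and 0 otherwise. */
int pwd_is_trustworthy(const char *pwd)
{
    struct stat st_pwd, st_dot;

    if (pwd == NULL || pwd[0] != '/')
        return 0;               /* unset or relative: never trust it */
    if (stat(pwd, &st_pwd) != 0 || stat(".", &st_dot) != 0)
        return 0;               /* one of the paths cannot be examined */

    /* Same device and same inode means the same directory, even if
     * `pwd` reaches it through a symlink. */
    return st_pwd.st_dev == st_dot.st_dev && st_pwd.st_ino == st_dot.st_ino;
}

/* Hypothetical helper: keep the inherited value when it checks out (this
 * preserves the symlinked spelling), otherwise fall back to getcwd().
 * Returns NULL if getcwd() fails. */
const char *initial_pwd(const char *inherited, char *buf, size_t buflen)
{
    if (pwd_is_trustworthy(inherited))
        return inherited;
    return getcwd(buf, buflen);
}
```

A shell would call something like initial_pwd(getenv("PWD"), buf, sizeof buf) at startup.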
Basically, the shell considers BOTH the inherited $PWD and the value of getcwd() to determine its $PWD. It can't just use one or the other!
ksh _was_ unmaintained for ages. It effectively stopped in 2012, with some very small attempts at reviving it in 2016, 2018 and 2020.
Then it was picked up for active development in 2021, and it lives here now:
If you didn't already, you should open an issue there with your findings.
--
Shameless plug: I keep some docker images with all those versions for testing, and many other shells too both historical and active (including osh!)
https://github.com/alganet/shell-versions/blob/main/.github/...
https://pages.oils.pub/spec-compat/2025-06-19/renamed-tmp/sp...
(which I created)
Even the upcoming bash 5.4 implements ksh command sub ${ echo hi; }, which is more evidence that bash is based on ksh.
They're still implementing ksh 35 years later ...
Instead there's objections on the basis "filesystems shouldn't work like that".
The person who responded dismissively later says "I'm just another user."
---
Every commit since they started using git in 2009 is attributed to one person:
https://cgit.git.savannah.gnu.org/cgit/bash.git/log/
I think occasionally contributed patches are applied, but this is not apparent in source control.
I was attacked on the bash mailing list several years ago, so I don't go there anymore :-)
Not like setting errno=0 before calling readdir() is a novel idea...
Here it is: https://lists.gnu.org/archive/html/bug-bash/2025-06/msg00149...
[1] https://github.com/NixOS/nixpkgs/commit/dff0ba38a243603534c9...
Also, that getcwd.c which contains the getcwd() fallback and bug is in K&R C, which should be a hint at how well maintained all of this is. Bash takes "don't fix it if it ain't broke" to new levels, to the point of introducing breakage like here (the bash-malloc is also notorious for this – no idea why that's still enabled by default).
For a long time, inode numbers from readdir() had certain semantics. Supporting overlay filesystems required changing those semantics. Piles of software were written against the old semantics; and even some of the most common have not been upgraded.
What there are piles of, are softwares that reinvent the C library, all too often in little bits of conditionally-compiled code that have either been reinvented or nicked from some old C library and sit unused in every platform that that application is nowadays ported to. Every time that I see a build log dutifully informing me that it has checked for <string.h> or some other thing that has been standard for 35 years I wonder (a) why that is thought to be necessary in 2025, and (b) what sort of shims would get used if the check ever failed.
Most programs will probably just fail to compile: "#undef HAVE_STRING_H" gets added to config.h, but it's never checked. Or something along those lines. It's little more than "failed to find <string.h>" with extra steps.
The exceptions are older projects which support tons of systems: bash, Vim, probably Emacs, that type of thing. A major difficulty is that it can be very hard to know what is safe to remove. So to use your <string.h> example, bash currently does:
#if defined (HAVE_STRING_H)
# include <string.h>
#endif /* !HAVE_STRING_H */
#if defined (HAVE_STRINGS_H)
# include <strings.h>
#endif /* !HAVE_STRINGS_H */
And Vim has an even more complex check:
// Note: Some systems need both string.h and strings.h (Savage). However,
// some systems can't handle both, only use string.h in that case.
#ifdef HAVE_STRING_H
# include <string.h>
#endif
#if defined(HAVE_STRINGS_H) && !defined(NO_STRINGS_WITH_STRING_H)
# include <strings.h>
#endif
Looks like that NO_STRINGS_WITH_STRING_H gets defined on "OS/X". Is that still applicable? Probably not?
Is any of this still needed? Who knows. Is it safe to remove? Who knows. No one is really tracking any of this. There is no "caniuse" for this, and even the autoconf people aren't sure on what systems autoconf does and doesn't work. There is no way to know who is running what on what, and people do run some of these programs on pretty old systems.
So ... people don't touch any of this because no one knows what is or isn't broken and what does and doesn't break if you touch it.
Aside: people love to complain about telemetry, sometimes claiming it's never useful, but this is where telemetry would absolutely be very useful.
It looks easy on the surface to roll down support for any kind of operating system there is, based on auto-detection and then #if HAVE_THIS or #if HAVE_THAT, but it breaks in ways that may be really hard to untangle later.
I'd rather have a limited set of configurations targeting specific platforms/flavors, and knowing that no matter how I compile it, I would know what is `#define`-d and what is not, instead of guessing on what the "host" might have.
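Something like the following is what I have in mind, as a rough sketch. The PLATFORM_NAME macro, the platform_name() function and the particular list of targets are made up for illustration; the compiler-predefined macros (__linux__, __APPLE__, __FreeBSD__) are real.

```c
#include <string.h>   /* standard C: assumed present on every supported target */

/* One explicit entry per supported platform. Each entry states exactly
 * what is defined there, so nothing is guessed from the build host. */
#if defined(__linux__)
#  define PLATFORM_NAME  "linux"
#  define HAVE_STRINGS_H 1
#elif defined(__APPLE__)
#  define PLATFORM_NAME  "macos"
#  define HAVE_STRINGS_H 1
#elif defined(__FreeBSD__)
#  define PLATFORM_NAME  "freebsd"
#  define HAVE_STRINGS_H 1
#else
#  error "unsupported platform: add an explicit entry instead of auto-detecting"
#endif

#if HAVE_STRINGS_H
#  include <strings.h>
#endif

/* Report which configuration this translation unit was built with. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}
```

The point of the #error branch is that an unknown system fails loudly at compile time instead of silently picking up some untested fallback.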