Now, I quite seriously believe that a 1000-line shell script only exists by mistake. I still occasionally end up writing dense 200-300-line shell scripts, but not without feeling very dirty along the way. Either split them into small, simple shell scripts (which is fine), or switch to a different language.
In the cross-platform build pipeline at work, I keep a strong discipline when it comes to scripts: they must be short (<= 100 lines), and if their complexity exceeds a certain threshold (more parsing than a "grep | cut" here and there, total program state exceeding some low number), then a shell script is no longer acceptable regardless of length. And, well, it's not safe to assume the presence of anything more than a shell.
If you are writing and dealing with 1000+ lines of shell scripts, then experience tells me that you are shooting yourself in the foot. With a gatling gun.
(I used fish, btw. The interactive experience was nice, but the syntax just felt different without much gain, which was frustrating to someone who often writes inline one-liners. Unlearning bash-isms is not a liberty I can afford, as I need to be proficient when I SSH into a machine I do not own or control. I can't force the entire company to install fish on all our lab servers, nor is it okay to install a shell on another person's dev machine just because we need to cooperate.)
My rough heuristic is, "no more than ten lines, nor more than two variables." Yes, that's short almost to the point of absurdity. The only good thing that one can say about Bash as a scripting language is that it's better than csh. Bash is taken seriously because of its longevity and ubiquity, but it's fundamentally limited in what it can express, and it is quite trivial to exceed those limitations.
It's definitely easier/safer to write longer programs in other languages, but with proper discipline it's also possible to write really big, robust programs in Bash/shells--people have done this, and some of that code is probably still running today.
A lot of that code is still being written today, which is one of the things that OP is trying to change.
It's funny though, after decades of shying away from them I've now gone full circle and embraced Makefiles once more. Perhaps it's the cleanness of the bmake/pmake extensions as compared to those of GNU Make, but the determinism, zero dependencies (no CMake or meson and its python baggage), automatic parallelization, and strict error handling call to me each time I have to write mission critical "glue code" that must. just. work.
That's a fair point. Nothing beats shells for ubiquity and universal support . . . in theory.
In practice, even people who write "portable shell scripts" (almost) never really write portable shell scripts.
Even if you learn the POSIX sh standard like the back of your hand and stick only to the features in it, never using bashisms or their equivalents, most shell scripts still rely on external programs (even if only grep/sed/etc.) to do their heavy lifting.
And that's where you get into trouble. Because compared to the variability in behavior of even "standard" ultra-common programs, the variability in behaviors between shell syntaxes/POSIX-vs-non-POSIX shells is tiny. The instant your code invokes an external program, no matter how ubiquitous that program is, you have to worry about:
- People screwing with PATH and changing the program you get.
- People screwing with variables that affect the external program's behavior: LD_LIBRARY_PATH for dynamic executables, or language-specific globals for programs' runtime behavior (CLASSPATH, PERL5LIB, PYTHONPATH, etc.).
- External program "editions" (e.g. non-GNU sed vs GNU sed). Good luck using PCRE with a non-GNU "grep"! Oh, and if you're sticking to POSIX-only shell semantics (no bashisms), you don't even get full BREs in most places you need them in the shell; you're stuck with limited BRE or globbing, which makes editing 100-line PCRE regexes feel like a dream.
- External program per-version differences.
- External program configuration.
- Etc.
Dealing with those issues is where "self-test"/"commandline program feature probe" things like autotools really shine. Raw shellscripts, though, very seldom live up to the "ubiquity" promise.
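A minimal sketch of that feature-probe approach: test what the tools on PATH can actually do instead of trusting their names. The specific checks here are illustrative, not a complete portability layer.

```shell
#!/bin/sh
# Probe tool *behavior*, autotools-style, instead of assuming an edition.

# Does this grep support PCRE (-P)? GNU grep usually does; BSD grep may not.
if printf 'abc\n' | grep -P 'a(?=b)' >/dev/null 2>&1; then
  have_pcre_grep=yes
else
  have_pcre_grep=no
fi

# GNU sed answers --version; BSD sed errors out on it.
if sed --version >/dev/null 2>&1; then
  sed_flavor=gnu
else
  sed_flavor=other
fi

echo "pcre grep: $have_pcre_grep, sed: $sed_flavor"
```

The point is that the script adapts to whatever lands on PATH rather than failing mysteriously on the "wrong" platform.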
After learning and using many other "modern" Makefile replacements of various complexities (ninja, cmake, scons, tup, meson, bazel, and others), I realized that if you're not using a tool for exactly what it was designed for, you end up recreating a Makefile (e.g., meson is awesome for cross-platform C++ builds, but if you try to use it to build an environment composed of output from various processes that aren't C++ compilers, writing a Makefile is easier). CMake, apart from also being too C/C++-specific, would be nice if it didn't require a million different files and didn't have such a god-awful language.
The only ones I liked as general-purpose build systems were ninja and tup. But ninja is too restrictive to code in (it requires duplication of code by design and is not meant to be written by hand, though I still do from time to time), and tup is built around FUSE (instead of kqueue/inotify/FindFirstChangeNotification) and so is a no-go for anything serious.
Once I embraced Makefiles, it turned out that most things traditionally built with shell scripts should actually be Makefiles, for determinism. For example, I just used bmake to take a clean FreeBSD AMI and turn it into the environment I need to perform some task every n intervals (the task itself became a Makefile rule), in place of Puppet and similar tools, which would normally have been used here but would have been overkill for my needs.
The only drawback to Makefiles that I haven't found a clean solution to is when you need to maintain state within a rule (without poisoning the global environment). The only solutions I can see are either a) using a file to store state instead of a variable, which is just stupid, b) calling a separate shell script (with `set -e; set -x` to try and mimic Make behavior) which sort of defeats the point of Make, and c) multiline rules with ;\ everywhere, which is hideous and error-prone but works (though it makes debugging a nightmare as the rule executes as one command, again defeating some of the benefits of Make).
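Option (c) looks roughly like this; the rule and helper script names are hypothetical, just to show the shape of the `;\` chaining:

```make
# Option (c): chain the recipe with ;\ so it runs as ONE shell
# invocation, letting $$tmp survive across "lines". ($$ escapes
# make's own expansion; fetch_sources/configure_env are made up.)
provision:
	tmp=$$(mktemp -d) ;\
	fetch_sources "$$tmp" ;\
	configure_env "$$tmp" ;\
	rm -rf "$$tmp"
```

GNU Make's `.ONESHELL` directive removes the need for the `;\` chaining by running the whole recipe in a single shell, but it's not portable to bmake or POSIX make, and it still leaves you with the one-big-command debugging problem.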
Perhaps the best use case for Oil is to provide a debugging environment where you can figure out what your legacy shell scripts are doing, and rewrite them in another language.
> Unlearning bash-isms is not a liberty I can afford, as I need to be proficient when I SSH into a machine I do not own or control.
This is a slippery slope. I have heard things like, "don't make your own custom aliases / shell functions, because they won't be available when you SSH to another machine." Forcing yourself to always use the lowest common denominator of software is not a fun path.
(Although note that bash actually has a debugger called bashdb, which I didn't know about until recently, and I've never heard of anyone using it.)
One intermediate step I would like to take is to provide a hook to dump the interpreter state on an error (set -e). Sort of like a "stack trace on steroids".
If anyone is running shell cron jobs in the cloud and would like this, please contact me at andy@oilshell.org.
I want to give people a reason to use OSH -- right now there is really no reason to use it, since it is doing things bash already does. But I think a stack trace + interpreter dump in the cloud (like Sentry and all those services) would be a compelling feature. I'm looking for feedback on that.
Shell scripts have a very small amount of in-process state, so it should be easy to dump every variable in the program.
Also some kind of logging hook might be interesting for cloud use cases.
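From the user's side, such a hook could look something like the following bash sketch. The dump format here is made up; a real OSH dump would carry far more interpreter state. (Without `set -e` the script keeps running after the trap fires, which keeps the demo simple; with `set -e` the dump would run just before exit.)

```shell
#!/usr/bin/env bash
# Hypothetical "dump state on error" hook, sketched with bash's ERR trap.

dump_state() {
  echo "--- error at line ${BASH_LINENO[0]}, status $? ---"
  # Shell programs carry little in-process state, so dumping
  # every variable of interest is cheap:
  declare -p job_name retries 2>/dev/null
}
trap dump_state ERR

job_name="nightly-backup"
retries=2
false   # a failing command fires the trap
echo "still running"
```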
Second this. Been there, and back, and there again, and recently back again.
Customize the hell out of your shell, make it your place, make it nice. You're spending your day there, every day. Treat it like you'd treat your work desk.
If it's a stranger's machine, well, OK, suffer through the 10 minutes of troubleshooting, an occasional session with busybox also helps keep "pure POSIX" skillset fresh. If you're becoming a regular there, it's time to think how to make your dotfiles portable. google.com/search?q=dotfiles
OTOH, I'm also against, for example, installing zsh across entire production fleets to accommodate choosy folks. So I'm on both ends of the problem, and I know it.
My setup is customized, but the core experience is kept relatively untouched. If someone experiences a problem, I need to be able to debug it in place efficiently. I write drivers for the company's hardware for a living—troubleshooting takes more than 10 minutes.
My approach to that problem is to just instrument scripts to write out the command line every time a tool is called, and infer the logic behind it. When I started here, build/package/test on Windows were controlled by 6000 lines of Perl... A printing version of system() and a few days later, it's 40 lines of batch and 150 lines of Python.
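The shell version of that instrumentation trick is a tiny wrapper function; `run` is just my name for it, a sketch of the "printing system()" idea:

```shell
#!/bin/sh
# Route every external invocation through `run`, which prints the exact
# command line to stderr before executing it: a greppable `set -x`.
run() {
  printf '+ %s\n' "$*" >&2
  "$@"
}

run uname -s
run echo "build step done"
```

Once every tool call in a legacy script goes through the wrapper, the log of command lines tells you what the script actually does, independent of the logic that produced them.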
> Forcing yourself to always use the lowest common denominator of software is not a fun path.
Never said it was fun. Also, I need to be able to use the lowest common denominator; I don't necessarily have to be the lowest common denominator. I use a configured zsh and nvim instance myself, but I take care to ensure that if I'm stuck with vi and (ba|da|k|c)sh, I'm still productive. The core behavior of my Vim instance stays close to stock, but it has a bunch of extras like fuzzy search, linting, a nice theme, etc.
And, if I could choose freely, my needs for a shell and for a programming language are in direct opposition: I want my shell to put all of its energy into a useful interactive experience.
If you're logging into a machine to the point that you're editing files or need your customizations, then in this day and age you're doing it wrong, IMHO.
Well, the question is: what do you use as a replacement? For my scripts I tend to pull in some version of PHP as soon as it's needed. Perl is universally available on anything based on Debian or Ubuntu, but it's a maintenance nightmare. PHP doesn't care whether I use spaces, tabs, or a mixture, or whether a file has somehow ended up with mixed line endings, whereas Python may or may not barf on encountering any of those things...
https://github.com/roswell/roswell
(As a side-note: Perl 5 was really the ideal language for this, by design; we might be in a different world today if it hadn't been derailed by Perl Forever^W 6).
Anyway, I usually avoid perl. While it is definitely a true programming language, I really don't like it. I find that it lacks clarity and quickly collapses into bash-like hackery.
I personally use Python when it's a notch above shell, but not enough to pull out the big guns.
(I used to write a lot of perl when I was a kid, even made an accounting system in it, so my dislike of perl comes from experience. We also had a 6000-line perl build script at my current job, which was the final straw; I rewrote it as 40 lines of batch and 150 lines of Python. Likewise, I also wrote a lot of PHP, and can no longer consider it a proper programming language.)
Are you claiming that its size was because Perl forced it to be long winded?
http://www.oilshell.org/blog/2018/01/28.html#toc_7
I didn't write those scripts, and you rely on them, whether you know it or not.
Do you use Unix? Do you use the cloud? You rely on them. See:
https://www.reddit.com/r/linux/comments/7lsajn/oil_shell_03_...
Also, shell has functions.
However I do think it is a disaster that people can't break up their shell code into more modular components, and there is no excuse for it.
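To be fair, the one mechanism shell does offer is sourcing, though it dumps every definition into a single global namespace, which is exactly the problem. A self-contained sketch (the "library" is written to a temp file only to keep the demo in one block):

```shell
#!/bin/sh
# Shell's only "import" is the `.` (source) builtin: it reads another
# file's definitions into the current shell's one global namespace.
lib=$(mktemp)
cat >"$lib" <<'EOF'
greet() { printf 'hello, %s\n' "$1"; }
EOF
. "$lib"        # import greet() into this shell
greet world     # prints "hello, world"
rm -f "$lib"
```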
And I especially do not wish to discourage you from trying to reinvent shell, because someone has to do it. But at minimum, I don't see myself moving away from my shell for the better part of a decade; hopefully by that point Oil's community will have matured enough that I feel comfortable trusting it to be stable and free of any critical privilege-escalation bugs.
Good luck!
People also implement new SQL injection bugs every day—just because a lot of people do it doesn't mean we should just let it slide.