Now, I quite seriously believe that a 1000-line shell script only exists by mistake. I still occasionally end up writing dense 200-300-line shell scripts, but not without feeling very dirty along the way. Either split them into small, simple shell scripts (which is fine), or switch to a different language.
In the cross-platform build pipeline at work, I keep a strong discipline when it comes to scripts: they must be short (<= 100 lines), and if their complexity exceeds a certain threshold (more parsing than a "grep | cut" here and there, total program state exceeding some low number), then a shell script is no longer acceptable regardless of length. And, well, it's not safe to assume the presence of anything more than a shell.
If you are writing and dealing with 1000+ lines of shell scripts, then experience tells me that you are shooting yourself in the foot. With a gatling gun.
(I used fish, btw. The interactive experience was nice, but the syntax just felt different without much gain, which was frustrating to someone who often writes inline one-liners. Unlearning bash-isms is not a liberty I can afford, as I need to be proficient when I SSH into a machine I do not own or control. I can't force the entire company to install fish on all our lab servers, nor is it okay to install a shell on another person's dev machine just because we need to cooperate.)
My rough heuristic is, "no more than ten lines, nor more than two variables." Yes, that's short almost to the point of absurdity. The only good thing that one can say about Bash as a scripting language is that it's better than csh. Bash is taken seriously because of its longevity and ubiquity, but it's fundamentally limited in what it can express, and it is quite trivial to exceed those limitations.
It's definitely easier/safer to write longer programs in other languages, but with proper discipline it's also possible to write really big, robust programs in Bash/shells--people have done this, and some of that code is probably still running today.
A lot of that code is still being written today, which is one of the things that OP is trying to change.
It's funny though, after decades of shying away from them I've now gone full circle and embraced Makefiles once more. Perhaps it's the cleanness of the bmake/pmake extensions as compared to those of GNU Make, but the determinism, zero dependencies (no CMake or meson and its python baggage), automatic parallelization, and strict error handling call to me each time I have to write mission critical "glue code" that must. just. work.
That's a fair point. Nothing beats shells for ubiquity and universal support . . . in theory.
In practice, even people who write "portable shell scripts" (almost) never really write portable shell scripts.
Even if you learn the POSIX sh standard like the back of your hand and stick only to the features in it, never using bashisms or their equivalents, most shell scripts still rely on external programs (even if only grep/sed/etc.) to do their heavy lifting.
And that's where you get into trouble. Because compared to the variability in behavior of even "standard" ultra-common programs, the variability in behaviors between shell syntaxes/POSIX-vs-non-POSIX shells is tiny. The instant your code invokes an external program, no matter how ubiquitous that program is, you have to worry about:
- People screwing with PATH and changing the program you get.
- People screwing with variables that affect the external program's behavior: LD_LIBRARY_PATH for dynamic executables, or language-specific globals for programs' runtime behavior (CLASSPATH, PERL5LIB, PYTHONPATH, etc.).
- External program "editions" (e.g. non-GNU sed vs GNU sed). Good luck using PCRE with a non-GNU "grep"! Oh, and if you're sticking to POSIX-only shell semantics (no bashisms), you don't even get full BREs in most places you need them in the shell; you're stuck with limited BRE or globbing, which makes editing 100-line PCRE regexes feel like a dream.
- External program per-version differences.
- External program configuration.
- Etc.
Dealing with those issues is where "self-test"/"commandline program feature probe" things like autotools really shine. Raw shellscripts, though, very seldom live up to the "ubiquity" promise.
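A minimal sketch of that feature-probe approach: test what the tools on PATH can actually do instead of trusting their names. The specific checks here are illustrative, not a complete portability layer.

```shell
#!/bin/sh
# Probe tool *behavior*, autotools-style, instead of assuming an edition.

# Does this grep support PCRE (-P)? GNU grep usually does; BSD grep may not.
if printf 'abc\n' | grep -P 'a(?=b)' >/dev/null 2>&1; then
  have_pcre_grep=yes
else
  have_pcre_grep=no
fi

# GNU sed answers --version; BSD sed errors out on it.
if sed --version >/dev/null 2>&1; then
  sed_flavor=gnu
else
  sed_flavor=other
fi

echo "pcre grep: $have_pcre_grep, sed: $sed_flavor"
```

The point is that the script adapts to whatever lands on PATH rather than failing mysteriously on the "wrong" platform.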
After learning and using many other "modern" Makefile replacements of various complexities (ninja, cmake, scons, tup, meson, bazel, and others), I realized that if you're not using a tool for exactly what it was designed for, you end up recreating a Makefile (e.g., meson is awesome for cross-platform C++ builds, but if you try to use it to build an environment composed of output from various processes that aren't C++ compilers, writing a Makefile is easier). CMake, apart from also being too C/C++-specific, would be nice if it didn't require a million different files and didn't have such a god-awful language.
The only ones I liked as general-purpose build systems were ninja and tup. But ninja is too restrictive to code in (it requires duplication of code by design and is not meant to be written by hand, though I still do from time to time), and tup is built around FUSE (instead of kqueue/inotify/FindFirstChangeNotification) and so is a no-go for anything serious.
Once I embraced Makefiles, it turned out that most things traditionally built with shell scripts should actually be Makefiles, for determinism. For example, I just used bmake to take a clean FreeBSD AMI and turn it into the environment I need to perform some task every n intervals (the task itself became a Makefile rule), in place of Puppet and similar tools, which would normally have been used here but would have been overkill for my needs.
The only drawback to Makefiles that I haven't found a clean solution to is when you need to maintain state within a rule (without poisoning the global environment). The only solutions I can see are either a) using a file to store state instead of a variable, which is just stupid, b) calling a separate shell script (with `set -e; set -x` to try and mimic Make behavior) which sort of defeats the point of Make, and c) multiline rules with ;\ everywhere, which is hideous and error-prone but works (though it makes debugging a nightmare as the rule executes as one command, again defeating some of the benefits of Make).
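Option (c) looks roughly like this; the rule and helper script names are hypothetical, just to show the shape of the `;\` chaining:

```make
# Option (c): chain the recipe with ;\ so it runs as ONE shell
# invocation, letting $$tmp survive across "lines". ($$ escapes
# make's own expansion; fetch_sources/configure_env are made up.)
provision:
	tmp=$$(mktemp -d) ;\
	fetch_sources "$$tmp" ;\
	configure_env "$$tmp" ;\
	rm -rf "$$tmp"
```

GNU Make's `.ONESHELL` directive removes the need for the `;\` chaining by running the whole recipe in a single shell, but it's not portable to bmake or POSIX make, and it still leaves you with the one-big-command debugging problem.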
Perhaps the best use case for Oil is to provide a debugging environment where you can figure out what your legacy shell scripts are doing, and rewrite them in another language.
> Unlearning bash-isms is not a liberty I can afford, as I need to be proficient when I SSH into a machine I do not own or control.
This is a slippery slope. I have heard things like, "don't make your own custom aliases / shell functions, because they won't be available when you SSH to another machine." Forcing yourself to always use the lowest common denominator of software is not a fun path.
(Although note that bash actually has a debugger called bashdb, which I didn't know about until recently, and I've never heard of anyone using it.)
One intermediate step I would like to take is to provide a hook to dump the interpreter state on an error (set -e). Sort of like a "stack trace on steroids".
If anyone is running shell cron jobs in the cloud and would like this, please contact me at andy@oilshell.org.
I want to give people a reason to use OSH -- right now there is really no reason to use it, since it is doing things bash already does. But I think a stack trace + interpreter dump in the cloud (like Sentry and all those services) would be a compelling feature. I'm looking for feedback on that.
Shell scripts have a very small amount of in-process state, so it should be easy to dump every variable in the program.
Also some kind of logging hook might be interesting for cloud use cases.
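From the user's side, such a hook could look something like the following bash sketch. The dump format here is made up; a real OSH dump would carry far more interpreter state. (Without `set -e` the script keeps running after the trap fires, which keeps the demo simple; with `set -e` the dump would run just before exit.)

```shell
#!/usr/bin/env bash
# Hypothetical "dump state on error" hook, sketched with bash's ERR trap.

dump_state() {
  echo "--- error at line ${BASH_LINENO[0]}, status $? ---"
  # Shell programs carry little in-process state, so dumping
  # every variable of interest is cheap:
  declare -p job_name retries 2>/dev/null
}
trap dump_state ERR

job_name="nightly-backup"
retries=2
false   # a failing command fires the trap
echo "still running"
```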
Second this. Been there, and back, and there again, and recently back again.
Customize the hell out of your shell, make it your place, make it nice. You're spending your day there, every day. Treat it like you'd treat your work desk.
If it's a stranger's machine, well, OK, suffer through the 10 minutes of troubleshooting, an occasional session with busybox also helps keep "pure POSIX" skillset fresh. If you're becoming a regular there, it's time to think how to make your dotfiles portable. google.com/search?q=dotfiles
OTOH, I'm also against, for example, installing zsh across entire production fleets to accommodate choosy folks. So I'm on both ends of the problem, and I know it.
My setup is customized, but the core experience is kept relatively untouched. If someone experiences a problem, I need to be able to debug it in place efficiently. I write drivers for the company's hardware for a living—troubleshooting takes more than 10 minutes.
My approach to that problem is to just instrument scripts to write out the command line every time a tool is called, and infer the logic behind it. When I started here, build/package/test on Windows were controlled by 6000 lines of Perl... A printing version of system() and a few days later, it's 40 lines of batch and 150 lines of Python.
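The shell version of that instrumentation trick is a tiny wrapper function; `run` is just my name for it, a sketch of the "printing system()" idea:

```shell
#!/bin/sh
# Route every external invocation through `run`, which prints the exact
# command line to stderr before executing it: a greppable `set -x`.
run() {
  printf '+ %s\n' "$*" >&2
  "$@"
}

run uname -s
run echo "build step done"
```

Once every tool call in a legacy script goes through the wrapper, the log of command lines tells you what the script actually does, independent of the logic that produced them.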
> Forcing yourself to always use the lowest common denominator of software is not a fun path.
Never said it was fun. Also, I need to be able to use the lowest common denominator; I don't necessarily have to be the lowest common denominator. I use a configured zsh and nvim instance myself, but I take care to ensure that if I'm stuck with vi and (ba|da|k|c)sh, I'm still productive. The core behavior of my Vim instance stays close to stock, but it has a bunch of extras like fuzzy search, linting, a nice theme, etc.
And, if I could choose freely, my needs for a shell and for a programming language are in direct opposition: I want my shell to put all of its energy into a useful interactive experience.
If you're logging into a machine to the point that you're editing files or need your customizations, then in this day and age you're doing it wrong, IMHO.
Well, the question is: what do you use as a replacement? For my scripts I tend to pull in some version of PHP as soon as it's needed. Perl is universally available on anything based on Debian or Ubuntu, but it's a maintenance nightmare. PHP doesn't care whether I use spaces, tabs, or a mixture, or whether a file has somehow ended up with mixed line endings, whereas Python may or may not barf on encountering any of those things...
https://github.com/roswell/roswell
(As a side-note: Perl 5 was really the ideal language for this, by design; we might be in a different world today if it hadn't been derailed by Perl Forever^W 6).
Anyway, I usually avoid perl. While it is definitely a true programming language, I really don't like it. I find that it lacks clarity and quickly collapses into bash-like hackery.
I personally use Python when it's a notch above shell, but not enough to pull out the big guns.
(I used to write a lot of perl when I was a kid, even made an accounting system in it, so my dislike of perl comes from experience. We also had a 6000-line perl build script at my current job, which was the final straw; I rewrote it as 40 lines of batch and 150 lines of Python. Likewise, I also wrote a lot of PHP, and can no longer consider it a proper programming language.)
Are you claiming that its size was because Perl forced it to be long winded?
http://www.oilshell.org/blog/2018/01/28.html#toc_7
I didn't write those scripts, and you rely on them, whether you know it or not.
Do you use Unix? Do you use the cloud? You rely on them. See:
https://www.reddit.com/r/linux/comments/7lsajn/oil_shell_03_...
Also, shell has functions.
However I do think it is a disaster that people can't break up their shell code into more modular components, and there is no excuse for it.
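To be fair, the one mechanism shell does offer is sourcing, though it dumps every definition into a single global namespace, which is exactly the problem. A self-contained sketch (the "library" is written to a temp file only to keep the demo in one block):

```shell
#!/bin/sh
# Shell's only "import" is the `.` (source) builtin: it reads another
# file's definitions into the current shell's one global namespace.
lib=$(mktemp)
cat >"$lib" <<'EOF'
greet() { printf 'hello, %s\n' "$1"; }
EOF
. "$lib"        # import greet() into this shell
greet world     # prints "hello, world"
rm -f "$lib"
```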
And I especially do not wish to discourage you from trying to reinvent shell, because someone has to do it. But at minimum, I don't see myself moving away from my shell for the better part of a decade; hopefully by that point Oil's community will have matured enough that I feel comfortable trusting it to be stable and free of any critical privilege-escalation bugs.
Good luck!
People also implement new SQL injection bugs every day—just because a lot of people do it doesn't mean we should just let it slide.