There are two uses of the Unix “API”:
[A] Long-lived tools for other people to use.
[B] Short-lived tools one throws together oneself.
The fact that most things work most of the time is why the shell works so well for [B], and why it is indeed a poor choice for the sort of stable tools designed for others to use, as in [A].
The ubiquity of the C APIs of course solved [A] use cases in the past, when it was unconscionable to operate a system without cc(1). It’s part of why they get first class treatment in the Unix man pages, as old fashioned as that seems nowadays.
And the only reason I might be pushed down that path is because the task I'm working on happens to involve filenames with spaces in them (without those spaces, the code would work fine!), because spaces are a reasonable thing to put in a filename unless you're on a Unix system.
Putting spaces in a filename is atrocious and should be disallowed by modern filesystems. It is like if you could put spaces inside variable names in Python. Ridiculous.
But when you realize that space isn't a valid separator between tokens, seeing things like "Class to Pull Info From Database::Extractor Tool" actually becomes much easier to read and the language becomes highly expressive, helped somewhat by the insane integration into the firm's systems.
I was on your side until I tried it, but it can actually be quite useful, esp. if everything is consistent.
Also, some languages do allow whitespace in variable names like R and SQL, so long as the variable names are quoted or escaped properly.
It's like people saying you don't need to escape SQL values because they come from constants. Yes, they do... today.
It's not just quoting either. It's setting the separator value and reverting it correctly. It's making sure you're still correct when you're in a function in an undefined state. It's a lot of overhead in larger projects.
For example, there is no way to store `a b "c d"` in a regular variable such that you can then call something similar to `ls $VAR` and get the equivalent of `ls a b "c d"`. You can either get the behavior of `ls a b c d` or `ls "a b c d"`, but if you need `ls a b "c d"` you must go for an array variable with new syntax. This isn't necessarily a big hurdle, but it indicates that the concepts are hard to grasp and possibly inconsistent.
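A minimal bash sketch of the point above: a plain string variable can't carry the argument boundaries, while an array can. (This assumes bash; POSIX sh has no arrays.)

```shell
#!/usr/bin/env bash
# A string variable loses the distinction between three arguments
# and one: word splitting just cuts on whitespace, and the inner
# quotes are passed along as literal characters, never parsed.
VAR='a b "c d"'
printf '<%s>\n' $VAR       # four args: <a> <b> <"c> <d">

# An array preserves each element as exactly one argument.
ARGS=(a b "c d")
printf '<%s>\n' "${ARGS[@]}"   # three args: <a> <b> <c d>
```

So `ls "${ARGS[@]}"` gives the `ls a b "c d"` behavior, but only via the extra array syntax, which is exactly the inconsistency being complained about.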
$ exec('ls', '-l', 'A B C')
Maybe that's unrealistic? I mean, if the shell was like that, it probably wouldn't have exec semantics and would be more like this, with direct function calls: $ ls(ls::LONG, 'A B C')
Maybe we would drop the parentheses though; they can be reasonably implied given the first token is an unspaced identifier: $ ls ls::LONG, 'A B C'
And really, given that unquoted identifiers don't have spaces, we don't really need the commas either. Could also use '-' instead of 'ls::' to indicate that an identifier is to be interpreted locally in the specific context of the function we are calling, rather than as a generic argument: $ ls -LONG 'A B C'
If arguments didn't have spaces, you could make the quotes optional too. QED
It's like there is a shortcut that some people use that you want to wall off because it doesn't look pretty to you.
In most other cases I’ve never really had a problem with “this is a place where spaces are ok” (e.g. notes, documents, photos) and “this is a place where they are not ok” — usually in parts of my filesystem where I’m developing code.
It’s fine to make simplifying assumptions if it’s your own code. Command history aside, most one liners we type at the shell are literally throwaways.
I think I was clear that they aren’t the only category of program one writes and that, traditionally on Unix systems, the counterpart to sh was C.
The need to handle spaces and quotes can take you from a 20 character pipeline to a 10 line script, or a C program. That is not a good model whichever way you look at it.
If you control the inputs and you need to support quotes, spaces, non-white space delimiters, etc... in shell script, then that’s on you.
If you don’t control the inputs, then shell scripts are generally a poor match. For example, if you need summary reports from a client, but they sometimes provide the table in xlsx or csv format, shell might not be a good idea.
Might be controversial, but I think you can tell who works with shell pipes the most by looking at who uses CSV vs tab-delimited text files. Tabs can still be a pain if you have spaces in data. But if you mix shell scripts with CSV, you’re just asking for trouble.
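A small illustration of why tab-delimited files play nicer with pipes (assuming standard `cut`; the filenames here are made up for the demo):

```shell
#!/usr/bin/env bash
# Tab-delimited: fields containing spaces come through cut intact.
printf 'Alice Smith\t42\n' > people.tsv
cut -f1 people.tsv            # prints: Alice Smith

# CSV needs quoting to protect embedded commas, and cut has no
# idea about quotes, so it splits inside the quoted field.
printf '"Smith, Alice",42\n' > people.csv
cut -d, -f1 people.csv        # prints: "Smith
```

Tools like `cut`, `sort -t` and `join -t` all understand a single delimiter byte, and none of them understand CSV quoting, which is the trouble being alluded to.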
You can define a field separator on the command line with the shell variable IFS, i.e. 'IFS=$(echo -en "\n\b");' for newlines, which takes care of the basic cases like spaces in file or directory names when doing a for loop. If I have other highly structured data that is heavily quoted or has some other sort of structure to it, then I either normalize it in some fashion or, as you suggest, write a Perl script.
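A bash sketch of that IFS trick, including the save-and-restore step an earlier comment warns about (the demo directory and filenames are invented for illustration):

```shell
#!/usr/bin/env bash
# Restrict IFS to newline so a for loop over command output does
# not split filenames on spaces; then revert it, so later code
# doesn't inherit the nonstandard separator.
mkdir -p demo
touch demo/"a file.txt" demo/plain.txt

OLDIFS=$IFS
IFS=$'\n'                      # split only on newlines
for f in $(ls demo); do
    printf 'got: %s\n' "$f"    # "a file.txt" stays one token
done
IFS=$OLDIFS                    # revert correctly
```

Forgetting the revert is exactly the "undefined state" overhead mentioned above: any function called afterward would see the modified IFS.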
I haven't found it too much of a burden, even when dealing with exceptionally large files.
Also Zsh solves 99% of my "pipelines are annoying" and "shell scripts are annoying" problems.
Even on systems where I can't set my default shell to Zsh, I either use it anyway inside Tmux, or I just use it for scripting. I suppose I use Zsh the way most people use Perl.
There is a world of stuff in between "I need relatively low-level memory management" and "I need a script to just glue some shit together".
For that we have Python and Perl and Ruby and Go, or even Rust.
(My point about C was in the historical context of Unix, which is relevant when talking about its design principles.)
Stable tools designed for oneself