Not on macOS. You can do it easily by just invoking /usr/bin/python3, but you’ll get a (short and non-threatening) dialog to install the Xcode Command-Line Developer Tools. For macOS environments, JavaScript for Automation (i.e. JXA, i.e. /usr/bin/osascript -l JavaScript) is a better choice.
However, since Sequoia, jq is now installed by default on macOS.
But I didn't want to be bogged down by dependencies.. so I didn't want to go anywhere near Python (and pyenv.. and anaconda.. and then probably having to dockerize that for some reason too..) nor nodeJS nor any of that.
Found a bash shell script to parse TAP written by ESR of all people. That sounds fine, I thought. Most everywhere has bash, and there are no other dependencies.
But it was slow. I mean.. painfully, ridiculously slow. Parsing like 400 lines of TAP took almost a minute.
That's when I did some digging and learned about what awk really is. A scripting language that's baked into the POSIX suite. I had heard vaguely about it beforehand, but never realized it had more power than its sed/ed/cut/etc brethren in the suite.
So I coded up a TAP parser in awk and it went swimmingly, matched the ESR parser's feature set, and ran millions of lines in less than a second. Score! :D
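For a sense of how little awk this takes, here is a minimal sketch of a TAP summarizer. It is not the parser described above: the function name tap_summary is made up for illustration, and it only looks at the plan line and the ok / not ok lines, ignoring directives (SKIP/TODO), diagnostics and subtests.

```shell
# Hypothetical helper: summarize a TAP stream read from stdin.
# Assumes plain TAP: a "1..N" plan line plus "ok" / "not ok" result lines.
tap_summary() {
    awk '
        # Plan line, e.g. "1..3": strip the "1.." prefix and keep N.
        /^1\.\.[0-9]+/ { sub(/^1\.\./, ""); planned = $1 + 0; next }
        # "not ok" must be checked before "ok".
        /^not ok/      { failed++; next }
        /^ok/          { passed++; next }
        END { printf "planned=%d passed=%d failed=%d\n", planned, passed, failed }
    '
}
```

Usage would be something like `./run-tests | tap_summary`, with everything done in a single awk process rather than one subprocess per line.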
Highly recommend to everyone - plenty of "batteries" included, like json parser, basic http client and even XML parser, and no venv/conda required. Very good forward compatibility. Fast (compared to bash).
I find that that is one more environment where awk scripting can get the job done, python/perl/php/etc just can't be introduced, bash can _sometimes_ get the job done if it doesn't have to spawn too many subprocesses, and C/other-compiled-options _might_ be able to help if I had some kind of build environment targeting the platform(s) in question and enough patience.
I'll keep an eye out for python with no extra dependency options on the platforms that can handle that though.
That is, one tool to magically identify a file type, one to tokenize it based on that identification, one to correspondingly parse it. All in a streaming/pipe-friendly mode.
Would fit right in, other than the Unix prejudice (nonsensical from Day 0) for LF-separated text records as the “one true format”.
You could argue using the vertical separator | is more syntactically graceful but then it's just a shell argument. There's quite a few radically different shells out there these days like xonsh, murex, and nushell so if simply arranging logic on the screen in a different syntax is what you're looking for then that's probably the way.
Like: printing all but one column somewhere in the middle. It turns into long, long commands that really pull away from the spirit of fast fabrication unix experimentation.
jq and sql both have the same problem :)
$ echo "one two three four five" | awk '{$3="";print}'
one two four five
Whence perl.
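For what it's worth, the "all but one column" case can stay in awk if you rebuild the line in a loop instead of blanking the field. A rough sketch, assuming default whitespace field splitting; drop_field is a made-up name, not a standard tool:

```shell
# Hypothetical helper: print every field except field number $1.
# Assumes default (whitespace) field splitting; output is joined with OFS.
drop_field() {
    awk -v n="$1" '{
        out = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if (i == n + 0) continue    # skip the unwanted column
            out = out sep $i
            sep = OFS
        }
        print out
    }'
}
```

So `echo "one two three four five" | drop_field 3` prints the four remaining words separated by single spaces. It is longer than the one-liner, which is rather the point being made above.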
The following command is handy for grepping the output:
cat mydata.json | json_pp
cut -d " " -f1-2,4-5 file.txt
where file.txt is: one two three four five
and the return is: one two four five
That's why I use Perl instead (besides some short one liners in awk, which in some cases are even shorter than the Perl version) and do my JSON parsing in Perl.
This
diff -rs a/ b/ | awk '/identical/ {print $4}' | xargs rm
is one of my often used awk one liners. Unless some filenames contain e.g. whitespace, then it's Perl again
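If the filenames might contain whitespace, one way to stay in plain shell is to stop parsing diff's output and compare the files directly. This is a different technique from the one-liner (find plus cmp, not diff plus awk), sketched under the assumption that both trees use the same relative paths; remove_identical is a hypothetical name.

```shell
# Hypothetical helper: delete files under $2 that are byte-identical to the
# file at the same relative path under $1. Uses find + cmp instead of parsing
# "diff -rs" output, so spaces in names are not a problem.
remove_identical() {
    src=$1
    dst=$(cd "$2" && pwd) || return 1    # absolute path, since we cd below
    (
        cd "$src" || exit 1
        find . -type f -exec sh -c '
            dst=$1; shift
            for f do
                if [ -f "$dst/$f" ] && cmp -s "$f" "$dst/$f"; then
                    rm -- "$dst/$f"
                fi
            done
        ' sh "$dst" {} +
    )
}
```

Called as `remove_identical a/ b/`, it removes the copies under b/, like the `$4` in the awk version.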
Yes, shell is definitely too weak to parse JSON!
(One reason I started https://oils.pub is because I saw that bash completion scripts try to parse bash in bash, which is an even worse idea than trying to parse JSON in bash)
I'd argue that Awk is ALSO too weak to parse JSON
The following code assumes that it will be fed valid JSON. It has some basic validation as a function of the parsing and will most likely throw an error if it encounters something strange, but there are no guarantees beyond that.
Yeah I don't like that! If you don't reject invalid input, you're not really parsing
---
OSH and YSH both have JSON built-in, and they have the hierarchical/recursive data structures you need for the common Python/JS-like API:
osh-0.33$ var d = { date: $(date --iso-8601) }
osh-0.33$ json write (d) | tee tmp.txt
{
"date": "2025-06-28"
}
Parse, then pretty print the data structure you got:
$ cat tmp.txt | json read (&x)
osh-0.33$ = x
(Dict) {date: '2025-06-28'}
Create a JSON syntax error on purpose:
osh-0.33$ sed 's/"/bad/"' tmp.txt | json read (&x)
sed: -e expression #1, char 9: unknown option to `s'
sed 's/"/bad/"' tmp.txt | json read (&x)
^~~~
[ interactive ]:20: json read: Unexpected EOF while parsing JSON (line 1, offset 0-0: '')
(now I see the error message could be better)
Another example from wezm yesterday: https://mastodon.decentralised.social/@wezm/1147586026608361...
YSH has JSON natively, but for anyone interested, it would be fun to test out the language by writing a JSON parser in YSH
It's fundamentally more powerful than shell and awk because it has garbage-collected data structures - https://www.oilshell.org/blog/2024/09/gc.html
Also, OSH is now FASTER than bash, in both computation and I/O. This is despite garbage collection, and despite being written in typed Python! I hope to publish a post about these recent improvements
Parsing is trivial, rejecting invalid input is trivial, the problem is representing the parsed content in a meaningful way.
> bash completion scripts try to parse bash in bash
You're talking about ble.sh, right? I investigated it as well.
I think they made some choices that eventually led to the parser being too complex, largely due to the problem of representing what was parsed.
> Also, OSH is now FASTER than bash, in both computation and I/O.
According to my tests, this is true. Congratulations!
No, the complexity of the parser can be attributed to the incremental parsing. ble.sh implements an incremental parser where one can update only the necessary parts of the previous syntax tree when a part of the command line is modified. I'd probably use the same data structure (but better abstracted using classes) even if I could implement the parser in C or in higher-level languages.
But yes, ble.sh also has a shell parser in shell, although it uses a state machine style that's more principled than bash regex / sed crap.
---
Also, distro build systems like Alpine Linux and others tend to parse shell in shell (or with sed).
They often need package metadata without executing package builds, so they do that by trying to parse shell.
In YSH, you will be able to do that with reflection, basically like Lisp/Python/Ruby, rather than ad hoc parsing.
---
I'm glad to hear you can see the effect of the optimizations ! That took a long time :-)
Some more benchmarks here, which I'll write about: https://oils.pub/release/0.33.0/benchmarks.wwz/osh-runtime/
IMO the real problem is that JSON doesn't work very well because its core abstraction is objects. It's a pain to deal with in pretty much every statically typed non-object oriented language unless you parse it into native, predefined data structures (think annotated Go structs, Rust, etc.).
Most languages aren't quite that bad. Even if they can't handle JSON very ergonomically, almost every language has at least some concept of nesting objects inside other objects.
What about shell? Just like awk, bash and zsh have a limited number of data types (the same two as awk plus non-associative arrays). So arguably it has the same problem. On the other hand, as you say, in shell it's perfectly idiomatic to use external tools, and jq is one such tool, available on an increasing number of systems. So you may as well store JSON data in your string variables and use jq to access it as needed. Probably won't be any slower than the calls to sed or awk or cut that fill out most shell scripts.
Now, personally, I've gotten into the habit of writing shell scripts with minimal use of external tools. If you stick to shell builtins, your script will run much faster. And both bash and zsh have a pretty decent suite of string manipulation tools, including some regex support, so you often don't actually need sed or awk or cut. However, this also rules out jq, and neither shell has any remotely comparable builtin.
But you might reasonably object that if I care about speed, I would be better off using a real programming language!
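To make the "stick to shell builtins" point concrete, here is a small sketch that uses only POSIX parameter expansion, with no sed, awk, cut or basename subprocess. The function names split_kv and strip_ext are made up for illustration.

```shell
# Hypothetical helpers using only parameter expansion (no external tools).

# Split "key=value" at the first "=" and print key and value on separate lines.
split_kv() {
    key=${1%%=*}      # remove the longest suffix starting at an "="
    value=${1#*=}     # remove the shortest prefix ending at an "="
    printf '%s\n%s\n' "$key" "$value"
}

# Print a path's file name without directories or the final ".ext".
strip_ext() {
    base=${1##*/}     # drop leading directory components
    printf '%s\n' "${base%.*}"
}
```

For example, `split_kv 'name=a=b'` prints `name` and then `a=b`, and `strip_ext /tmp/report.final.txt` prints `report.final`. None of this helps with nested data, which is why it is no substitute for jq.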
I do not use jq. Too complicated for me. Overkill. I created a statically-linked program, less than half the size of the official statically-linked jq, that is adequate for my own needs. flex is a build requirement for jq.