That's the way to go. I don't even consider the shallower, ad-hoc approaches to be actually parsing it.
I've been working on a state-machine-based parser of my own. It's hard, since I'm targeting very barebones interpreters such as posh and dash. Here's what it looks like:
https://gist.github.com/alganet/23df53c567b8a0bf959ecbc7b689...
(Not a fully working example, but it gives an idea of what pure POSIX shell parsing looks like. Ignore the aliases; they won't be in the final version.)
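To give a rough idea of the technique in isolation, here is a minimal sketch of a character-by-character state machine in pure POSIX shell. It is not the code from the gist: the function name `tokenize` and the three states (`normal`, `word`, `squote`) are made up for illustration, and it only handles blanks and single quotes.

```shell
# Hypothetical sketch: split a string into words, honoring single quotes.
# Uses only POSIX features (no arrays, no `local`), so the variables are global.
tokenize() {
    rest=$1 state=normal word=
    while [ -n "$rest" ]; do
        tail=${rest#?}          # everything after the first character
        char=${rest%"$tail"}    # the first character itself
        rest=$tail
        case $state:$char in
            normal:" ") ;;                          # skip blanks between words
            normal:"'") state=squote ;;             # an opening quote starts a word
            normal:*)   state=word; word=$char ;;   # any other character starts a word
            word:" ")   printf '%s\n' "$word"; word=; state=normal ;;
            word:"'")   state=squote ;;             # quote in the middle of a word
            word:*)     word=$word$char ;;
            squote:"'") state=word ;;               # closing quote, the word continues
            squote:*)   word=$word$char ;;          # blanks are literal inside quotes
        esac
    done
    # Emit the last word, if any (an unterminated quote is emitted as-is).
    [ "$state" = normal ] || printf '%s\n' "$word"
}
```

Calling `tokenize "echo 'hello world' foo"` prints three lines: `echo`, `hello world` and `foo`. The whole trick is the `case $state:$char` dispatch, which is about as much as posh and dash give you to work with.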
> I'm glad to hear you can see the effect of the optimizations ! That took a long time :-)
Yep, I've been testing osh since 0.9! There's still a long way to go to catch up with ksh93, though: it's the fastest of all shells by a wide margin, faster than even dash.
By beating bash, you've also beaten zsh (it's one of the slowest shells around).