Is there a linter to ensure scripts are portable across shells? I try to write them like that but I'm certainly no master so I write them to work with busybox.
Why not POSIX or some common external tools where it makes sense? Most of those big switch statements could be easily replaced with some standard programs that already exist everywhere.
One main reason is performance. Forking for other tools is very expensive.
That said, using larger sed or awk programs instead of ad-hoc calls for small snippets would perhaps be net-positive for performance and readability.
I'm currently working on very strict bootstrap scenarios in which sed and awk might not be available, but a shell might be (if I'm able to write it). It is possible that in such scenarios, the first sed and awk versions will be shell-written polyfills anyway.
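To make the fork-cost point concrete, here is a minimal sketch of what a builtin-only replacement for two small external tools could look like in POSIX sh. The function names `base_name` and `dir_name` and the `result` variable are made up for illustration and are not taken from the gist. Results are returned through a variable rather than printed, on the assumption that `$(...)` would cost a subshell in most shells. Trailing slashes in the argument are not handled.

```sh
# Hypothetical helpers: path splitting with parameter expansion only,
# so no external process (basename, dirname) is started.
# Both store their answer in the global variable `result`.

base_name() {
    # Remove the longest prefix ending in "/".
    result=${1##*/}
}

dir_name() {
    case $1 in
        */*)
            # Remove the shortest suffix starting with "/".
            result=${1%/*}
            # "/file" leaves an empty string; the directory is the root.
            [ -n "$result" ] || result=/
            ;;
        *)
            # No slash at all: the file is in the current directory.
            result=.
            ;;
    esac
}

base_name /usr/local/bin/tool
echo "$result"
dir_name /usr/local/bin/tool
echo "$result"
```

Inside a loop that runs thousands of times, a call like this stays in the shell process, while `$(basename "$path")` would start a new process on each iteration.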
Why not just use gcc which already exists everywhere?
When you answer that, same answer. If you can't imagine any answer for that, then the answer won't be convincing or make sense even if anyone tried to articulate it. Which is fine. Everyone doesn't have to find meaning in the same things.
Shell without a userland is like FORTH without the ability to define new words. It's really contrary to the whole idea of what a shell is. Bootstrapping in very constrained conditions makes some sense, but where would you have a POSIX shell and not a POSIX userland (or close equivalent) to work with? When I wrote a similar compiler in shell, I purposely offloaded everything I could to external tools and used the shell for composition, so I found the approach intriguing and wanted to ask. I wasn't trying to criticize or dismiss the project, I think it's really cool or else I wouldn't have bothered to read the code in the first place.
gcc exists essentially everywhere a shell exists too. If you're ok with using grep and bc or whatever, then why not gcc?
Or better yet, awk? awk is as old and ubiquitous as sh itself, on every machine even ancient ones that don't even have a compiler because that was a paid extra. Only unlike sh it's actually a full normal programming language that can do basically everything the shell can do only in a far more readable and sane way instead of using weird expansions and abusing the command line parser to achieve functions it doesn't have overt functions for. Just write directly in awk the same way you would in say python or js. If you have sh, especially if you also have the userland you are talking about, then you have awk. It's part of that userland in a way that gcc is admittedly not.
More in your vein actually, when I do things like this I pick yet a different ideal goal than either you or the author. I avoid all externals (and even child shells) to whatever extent possible, but I do use bash for all it's worth. Every possible intentional or hack bashism. Require bash, but leverage bash to within an inch of its life and require nothing else.
But this project targeting more portable code that doesn't require bash is really cool and valuable. Even though it's not a standard I personally shoot for even when I am specifically shell-golfing.
There are probably as many different points along the spectrum to draw the line as there are individual developers, each with some actually reasonable argument to justify that particular place to draw the line.
To me using grep and sed and tr and ls and cat etc etc when I don't need them is just unsatisfying, inelegant, uninteresting.
If you are in bash or ksh93 or zsh, you don't need all kinds of things like basename, dirname, cut, tr, wc, nor some of the more powerful stuff either most of the time. I have a shell function that uses the built-in read combined with a named pipe file created in tmp to make a sleep that doesn't need /bin/sleep. Why bother? Because it's awesome. And usually the only times I need to use sleep it's in some rapid short-duration polling loop that really is better if you don't have to fork & exec & teardown on every iteration. It's bad enough to be polling like that in the first place. And it just doesn't matter how "probably all the externals will be there", not using them is even better. And these days a lot of once-common "userland" is no longer common or installed by default. A script that never tries to run dos2unix never cares that it's not installed, or that the BSD version behaves differently, or the Mac version is stupid old, etc.
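For anyone curious, a sleep built on read and a named pipe might look roughly like the sketch below. This is a guess at the technique, not the commenter's actual function. The name `snooze`, the flag `snooze_ready`, and the use of file descriptor 9 are all made up for illustration. It assumes bash 4 or later (for fractional `read -t` timeouts) and a system where opening a FIFO read-write does not block, which holds on Linux. The one-time setup still calls mktemp, mkfifo and rm; only the repeated calls avoid external processes.

```bash
# Hypothetical fork-free sleep: wait on a FIFO that nobody writes to,
# and let the timeout of the read builtin do the waiting.
snooze() {
    local dir
    if [ -z "${snooze_ready:-}" ]; then
        # One-time setup (uses externals once): create the FIFO and
        # keep it open read-write on fd 9 so later reads never see EOF.
        dir=$(mktemp -d) || return 1
        mkfifo "$dir/fifo" || return 1
        exec 9<>"$dir/fifo"
        rm -rf "$dir"
        snooze_ready=1
    fi
    # read returns a non-zero status when it times out, which is the
    # expected outcome here, so the status is discarded.
    read -r -t "$1" -u 9 || true
}

# Example: a short polling loop that starts no process per iteration.
for attempt in 1 2 3; do
    snooze 0.1
done
```

After the first call, each `snooze` is a single builtin, which is the property that matters in a tight polling loop.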
Would be a lot better if it came with tests. Please do this justice and don't let it rot as a gist: make a real repo and add some docs and at least smoke tests of some kind. Thanks
This gist is a concatenation of several shell script modules which form a comprehensive parser library for the portable shell.
The main parser and emitter are BNF-generated (that's why they look so mechanical). The BNF parser generator is also written in portable shell (I posted another gist with a preview of it in another thread).
All modules have comprehensive tests, but it is still lacking documentation and not ready for prime time!
In the classic FLOSS tradition, it would be cool if you might still consider publishing such a "not-ready" repository - some people may (or may not!) be still interested, and also (sorry!) there's the bus factor... But on the other hand, in the classic FLOSS tradition, it's also 100% your decision and you have the full right to do any way you like!
I love this as a novelty, and it could be useful for bootstrapping a system that’s had a shell cross-compiled to it.
Thinking about this in the context of a job I used to do, security on shared hosting environments, it gives me a bit of a shiver. There are reasons compilers aren’t available to normal users on those.
I mean, today it's possible to generate it in Tcl, Elisp, Windows BAT, Powershell.
The effort is just 1 prompt.
The WHY question is much more important today -- "because I can" no longer makes sense, because we all can do much, much more with minimum effort today than before LLMs.
Usage:
printf 'int main(){puts("hello");return 0;}' | sh c89cc.sh > hello
chmod +x hello
./hello
If you want something one would actually use, try my project tuish:
http://github.com/alganet/tuish
You can use what I use: https://github.com/alganet/shell-versions
It's a container with lots of shells that you can test. Like esvu but for the shell.
The docs might be a little outdated; hit me with an issue if you use it and face any problems (I'm also the author).
> One main reason is performance
This assumes the executed program is as fast or slower than the caller.