Building a Shell (healeycodes.com)

by ingve 39 comments 180 points

[−] lvales 60d ago
Building a shell is a great exercise, but honestly having to deal with string parsing is such a bother that it robs like 2/3 of the joy along the way. I once built a very simple one in Go [0] as a learning exercise and I stopped once I started getting frustrated with all the corner cases.

[0] https://github.com/lourencovales/codecrafters/blob/master/sh...

[−] chubot 60d ago
A common problem I noticed is that if you took certain courses in computer science, you may have a pre-conceived notion of how to parse programming languages, and the shell language doesn't quite fit that model

I have seen this misconception many times

In Oils, we have some pretty minor elaborations of the standard model, and it makes things a lot easier

How to Parse Shell Like a Programming Language - https://www.oilshell.org/blog/2019/02/07.html

Everything I wrote there still holds, although that post could use some minor updates (and OSH is the most bash-compatible shell, and more POSIX-compatible than /bin/sh on Debian - e.g. https://pages.oils.pub/spec-compat/2025-11-02/renamed-tmp/sp... )

---

To summarize that, I'd say that doing as much work as possible in the lexer, with regular languages and "lexer modes", drastically reduces the complexity of writing a shell parser

And it's not just one parser -- shell actually has 5 to 15 different parsers, depending on how you count

I often show this file to make that point: https://oils.pub/release/0.37.0/pub/src-tree.wwz/_gen/_tmp/m...

(linked from https://oils.pub/release/0.37.0/quality.html)

Fine-grained heterogeneous algebraic data types also help. Shells in C tend to use a homogeneous command* and word* kind of representation

https://oils.pub/release/0.37.0/pub/src-tree.wwz/frontend/sy... (~700 lines of type definitions)
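
As a rough illustration of the "lexer modes" idea described above: the lexer keeps one table of regular-expression rules per mode and swaps tables when it enters or leaves a double-quoted string. This is only a sketch, not Oils' actual code; the mode names, token names, and regexes are all made up for the example.

```python
import re

# Hypothetical rule tables, one per lexer mode. Each mode is a regular language.
RULES = {
    "OUTER": [
        ("SPACE", r"[ \t]+"),
        ("PIPE", r"\|"),
        ("DQUOTE", r'"'),                      # enters the DQ mode
        ("VAR", r"\$[A-Za-z_][A-Za-z0-9_]*"),
        ("LIT", r'[^ \t|"$]+'),
    ],
    "DQ": [
        ("DQUOTE", r'"'),                      # returns to the OUTER mode
        ("VAR", r"\$[A-Za-z_][A-Za-z0-9_]*"),
        ("LIT", r'[^"$]+'),                    # spaces and | are plain text here
    ],
}
COMPILED = {
    mode: [(name, re.compile(pattern)) for name, pattern in rules]
    for mode, rules in RULES.items()
}


def lex(src):
    """Return a list of (mode, token_name, text) tuples."""
    mode, pos, out = "OUTER", 0, []
    while pos < len(src):
        # Try the current mode's rules in order; first match wins.
        for name, rx in COMPILED[mode]:
            m = rx.match(src, pos)
            if m:
                break
        else:
            raise SyntaxError(f"unexpected {src[pos]!r} at {pos} in mode {mode}")
        out.append((mode, name, m.group()))
        pos = m.end()
        if name == "DQUOTE":
            mode = "DQ" if mode == "OUTER" else "OUTER"
    return out
```

With this, `lex('echo "a | $x" | wc')` reports the first `|` as literal text inside the DQ mode and only the second one as a PIPE token, so the parser that consumes the tokens never has to reason about quoting.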

[−] healeycodes 60d ago
Author here, and yeah, I agree. I skipped writing a parser altogether and just split on whitespace and | so that I could get to the interesting bits.

For side-projects, I have to ask myself if I'm writing a parser, or if I'm building something else; e.g. for a toy programming language, it's way more fun to start with an AST and play around, and come back to the parser if you really fall in love with it.
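
A minimal sketch of that "split on whitespace and |" shortcut, assuming no quoting, escaping, or redirection at all. This is not the article's code, and the name `parse_pipeline` is made up for the example.

```python
def parse_pipeline(line):
    """Split a command line into a list of argv lists, one per pipeline stage."""
    stages = []
    for part in line.split("|"):
        argv = part.split()          # split on runs of whitespace
        if not argv:
            raise ValueError("empty command in pipeline")
        stages.append(argv)
    return stages
```

The obvious limitation is that something like `echo "a | b"` gets cut in the middle of the quoted string, which is exactly the kind of corner case a real lexer has to handle.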

[−] ferguess_k 60d ago
Can say the same for control characters in terminals. I even think maybe it's just easier to ditch them all and use Qt to build a "terminal" with clickable URLs, something similar to what TempleOS does.
[−] emersion 60d ago
Some time ago I wrote an article about one particular aspect of shells, job control: https://emersion.fr/blog/2019/job-control/
[−] rrampage 60d ago
Fun read! I built a minimal Linux shell [0] in C and Zig last year which does not depend on libc. It was a great way to learn about execve, the new-ish clone3 syscall and how Linux starts a process. Parsing strings is the least fun part of building the shell.

[0] https://gist.github.com/rrampage/5046b60ca2d040bcffb49ee38e8...
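
For anyone who hasn't seen it, the process-start sequence mentioned above boils down to fork, exec, wait. Here is a rough sketch using Python's `os` wrappers rather than raw syscalls, so unlike the shell in the gist it does go through libc, and it uses `fork` rather than `clone3`. `os.execvp` is a PATH-searching wrapper around execve. The helper name `run` is made up for the example.

```python
import os


def run(argv):
    """Fork, exec argv in the child, and return a shell-style exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)            # conventional "command not found" status
    # Parent: block until the child terminates.
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return 128 + os.WTERMSIG(status)   # shell convention for signal deaths
    return os.WEXITSTATUS(status)
```

Note that the child calls `os._exit` instead of returning if the exec fails, otherwise there would be two copies of the shell running.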

[−] mzs 60d ago
Had an assignment to build a shell in a week, how hard could it be?

  controlling terminal
  session leader
  job control
The parser was easy in comparison.
[−] ratzkewatzke 60d ago
There's a very good exercise on Codecrafters (https://app.codecrafters.io/courses/shell/overview) to walk you through writing your own shell. I found it enlightening, as well as a good way to learn a new language.
[−] hexer303 60d ago
Unix shells are conceptually simple but hide a surprising amount of complexity under the hood that we take for granted. I recently had to build my own PTY controller. There were so many edge cases to deal with. It took weeks of stress testing and writing many tests to get it right.
[−] lioeters 60d ago
Link was previously posted by author: https://news.ycombinator.com/item?id=47398749 There are other good-quality articles on their site, and maybe they deserve the imaginary points.
[−] zokier 60d ago
Bit of pedantry, but I don't think a traditional Unix shell (like this) follows the REPL model; the shell usually doesn't print the result of evaluation. Instead, the printing happens as a side effect of the commands.
[−] skydhash 60d ago
It’s a shell, not the whole thing. The whole thing is the shell+kernel+programs.
[−] zokier 60d ago
Even if you view the system as a whole, the printing is deeply intertwined with the evaluation, which is very different from a REPL, where eval returns a value and print prints it.
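
One way to picture the distinction in this subthread, as a sketch only (both function names are invented for illustration): in a REPL the loop itself prints the value that eval returns, while a shell-style loop only gets an exit status back, and any output is written by the child straight to the inherited stdout.

```python
import subprocess


def repl_step(source):
    """Read-eval-print: eval returns a value and the loop prints it."""
    value = eval(source)             # evaluation produces a value...
    print(repr(value))               # ...and a separate print step shows it
    return value


def shell_step(argv):
    """Shell-style step: the child writes to the inherited stdout itself.

    The loop never sees that output; all it gets back is an exit status.
    """
    return subprocess.run(argv).returncode
```

Calling `shell_step(["echo", "hi"])` still shows `hi` on the terminal, but the text never passes through the loop, which is the "side effect" being described.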
[−] jermaustin1 60d ago
The first shell programming I ever did was batch in Windows, back in the 3.11/95 days.

The first line was always to turn off echo, and I've always wondered why that was a decision for batch script. Or I'm misremembering. 30 years of separation makes it hard to remember the details.

[−] enoint 60d ago
Echo in that case prints command lines before executing them. Its analog is set -x rather than echo.
[−] teo_zero 59d ago

> the shell is not usually doing printing of the result of evaluation

I always include $? in the prompt, so I guess I can say it does print the result of the evaluation.

[−] themafia 60d ago
It prints a prompt.
[−] zokier 60d ago
That's not what print in REPL means.
[−] lasgawe 60d ago
Great article. There are many things every developer should do when starting to learn programming or when trying to improve their skills. This is one of them. I once built a shell-like programming language (not an interpreter). If anyone reading this wants to improve their skills, I strongly suggest building your own shell from scratch.
[−] doe88 60d ago
Is there a (real) shell whose code is relatively short and self-contained and would be valuable to read? This was always something I wanted to do but never quite spent time to look for a good one to explore.
[−] austy69 60d ago
Fun read. Wonder if you are able to edit text in the shell, or if you need to implement a gap buffer to allow it?
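
For reference, a gap buffer is small enough to sketch. This is a generic illustration, not how the article's shell works, and the class and method names are made up. For a single input line, a plain list or string is likely fine too.

```python
class GapBuffer:
    """Text before the cursor, a gap of free slots, then text after the cursor."""

    def __init__(self, capacity=16):
        self.buf = [None] * capacity
        self.gap_start = 0           # cursor position
        self.gap_end = capacity      # first slot after the gap

    def _grow(self):
        # Enlarge the gap, keeping the tail at the end of the new storage.
        head = self.buf[:self.gap_start]
        tail = self.buf[self.gap_end:]
        extra = max(len(self.buf), 1)
        self.buf = head + [None] * extra + tail
        self.gap_end = len(self.buf) - len(tail)

    def insert(self, ch):
        if self.gap_start == self.gap_end:
            self._grow()
        self.buf[self.gap_start] = ch
        self.gap_start += 1

    def backspace(self):
        if self.gap_start > 0:
            self.gap_start -= 1      # the slot simply rejoins the gap

    def left(self):
        # Move the cursor left by carrying one character across the gap.
        if self.gap_start > 0:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]

    def right(self):
        # Move the cursor right by carrying one character across the gap.
        if self.gap_end < len(self.buf):
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Inserting and deleting at the cursor touch a single slot, and moving the cursor copies one character per step, which is why the structure suits interactive editing.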
[−] dirk94018 60d ago
Interesting. I wanted to do toast | bash to let the AI drive the computer, but the bash shell really got in the way. Too much complexity. The things that annoy humans ($ expansion, special characters, etc.) don't work for AI either. Ended up writing a custom shell for AI (and humans). When a tool gets in the way, sometimes it's just time to change the tool.