Linux is an interpreter (astrid.tech)

by frizlab 55 comments 241 points

[−] tosti 48d ago
This article was painful to read because of all the misconceptions. A cpio archive is not a filesystem. Author uses initramfs, which is based on tmpfs. Linux can extract cpio to tmpfs. An archive of files and directories is in itself not a program.

Just because something looks similar doesn't mean it's equivalent. Binary programs are executed on the CPU, so if there's an interpreter involved, it's hiding in the hardware environment. That's outside the scope of an OS kernel.

If you have a shell script in your filesystem and run it, you need to also provide the shell that interprets the script. Author omits this detail and confuses the kernel with the shell program.

Linux can easily be compiled without support for initramfs and ramdisk. It can still boot and run whatever userland sits in the filesystem.

"Linux initrd interpreter" hurts my brain. That's not how it works.

Edit: should've read further. Still a backwards way of explaining things imho.

[−] jeffbee 48d ago
Binary programs are executed on the CPU but the program file is an archive with sections, and only one of them is the program, usually, while the others are all metadata. The CPU isn't capable of understanding the program file at all. Linux has to establish the conditions under which the program runs, that means at a minimum establishing the address space in which the program counter lies then jumping to that address. The instructions for how to do that are in the metadata sections of the ELF executable.
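To make that concrete: the address the kernel jumps to is just the e_entry field of the ELF header. A minimal sketch in Python, using a hand-built synthetic header rather than a real binary:

```python
import struct

def elf64_entry(header: bytes) -> int:
    """Return e_entry from a 64-bit little-endian ELF header: the
    virtual address the kernel jumps to once the address space is
    set up (field offsets per the ELF-64 layout)."""
    assert header[:4] == b"\x7fELF", "not an ELF file"
    # e_ident (16) + e_type (2) + e_machine (2) + e_version (4)
    # puts the 8-byte e_entry field at offset 24.
    (entry,) = struct.unpack_from("<Q", header, 24)
    return entry

# Hand-built header for illustration: magic, zero padding up to
# offset 24, then an entry point of 0x401000.
fake = b"\x7fELF" + bytes(20) + struct.pack("<Q", 0x401000)
print(hex(elf64_entry(fake)))  # 0x401000
```

The CPU never sees any of this metadata; by the time it fetches its first instruction of the program, the kernel (and ld.so) have already consumed the header.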
[−] saidnooneever 48d ago
Not a bad explanation, though the 'usually' might be clarified: an ELF file 'can have sections marked as executable' (though of course I get not wanting to get into segment flags :p), and also a program is potentially cobbled together from many of these ELF files. In most cases the single file is useless on its own (most cases as in binaries provided by a standard Linux distro).
[−] Hasslequest 48d ago
It's the init in the cpio which is the interpreted program, and the rest of the cpio is memory for this interpreted program.
[−] tremon 48d ago
How is it interpreted? Something that you load into memory and then set the processor's Instruction Pointer at is not interpreted at all. And in case /init is a shell script, it's not the kernel doing the interpreting -- the interpreter would be /bin/sh, which would still be loaded into memory and executed by the processor. Claiming that machine code is "interpreted" because it still needs to be finalized by a loader is not a clever gotcha -- it's ignorant erasure of relevant distinctions.
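For what it's worth, the kernel's dispatch really is that shallow: execve reads the first bytes of the file and, for '#!', re-invokes itself with the named interpreter. A rough Python sketch of that binfmt_script logic (simplified; it ignores the kernel's line-length limits):

```python
def parse_shebang(first_bytes: bytes):
    """Mimic the kernel's binfmt_script handler: if the file starts
    with '#!', the kernel re-runs execve with the named interpreter,
    passing everything after it as a single optional argument and the
    script path after that. Returns (interp, optional_arg) or None
    for a non-script file."""
    if not first_bytes.startswith(b"#!"):
        return None
    line = first_bytes[2:].split(b"\n", 1)[0].strip()
    parts = line.split(None, 1)
    interp = parts[0].decode()
    arg = parts[1].decode() if len(parts) > 1 else None
    return interp, arg

print(parse_shebang(b"#!/bin/sh -e\necho hi\n"))  # ('/bin/sh', '-e')
print(parse_shebang(b"\x7fELF..."))               # None
```

After this dispatch, the thing that actually walks the script line by line is /bin/sh, a userspace program, not the kernel.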
[−] commandersaki 48d ago
At least it isn't AI slop!
[−] daveguy 48d ago
I dunno, sure seems like "AI research" at least.
[−] saidnooneever 48d ago
Might be; I wouldn't be sure, but I would encourage the author to dive a little deeper based on some comments in the thread. There's obviously lots to explore. Anything that goes into this topic is likely to make omissions, have some 'wrong viewing angle', or such things.

It's not a PhD paper, so that's fine!

[−] astralbijection 48d ago

> An archive of files and directories is in itself not a program.

Okay, but you can make the same argument to say that ELF files aren't programs in and of themselves either. In fact, some ELF files are dynamic libraries without an entrypoint, and therefore not actually executable in any meaningful way unless connected to yet another program.

If you can accept that some ELF files are executables and some aren't, then you can also accept that some CPIOs are executables and some aren't. What's the difference between ld.so unpacking an ELF file into RAM and running its entrypoint, and the Linux kernel unpacking an initramfs into RAM and running its entrypoint?
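For the curious, the initramfs format the kernel unpacks really is that simple: each 'newc' cpio member is a fixed-width ASCII-hex header followed by a name and data. A toy sketch in Python (a hand-built two-entry archive, not the kernel's actual C unpacker):

```python
def pad4(b: bytes) -> bytes:
    return b + b"\x00" * (-len(b) % 4)

def newc_entry(name: str, data: bytes) -> bytes:
    """Build one cpio 'newc' member: the '070701' magic, thirteen
    8-digit ASCII-hex fields, then the NUL-terminated name and the
    data, each padded to a 4-byte boundary."""
    n = name.encode() + b"\x00"
    # ino, mode, uid, gid, nlink, mtime, filesize, devmajor, devminor,
    # rdevmajor, rdevminor, namesize, check
    fields = [0, 0o100755, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(n), 0]
    hdr = b"070701" + b"".join(b"%08x" % f for f in fields)
    return pad4(hdr + n) + pad4(data)

def newc_names(archive: bytes):
    """Walk the members the way the kernel's initramfs unpacker does,
    collecting names until the TRAILER!!! sentinel."""
    off, names = 0, []
    while archive[off:off + 6] == b"070701":
        filesize = int(archive[off + 54:off + 62], 16)
        namesize = int(archive[off + 94:off + 102], 16)
        name = archive[off + 110:off + 110 + namesize - 1].decode()
        if name == "TRAILER!!!":
            break
        names.append(name)
        off += (110 + namesize + 3) // 4 * 4   # header + name, aligned
        off += (filesize + 3) // 4 * 4         # data, aligned
    return names

archive = newc_entry("init", b"#!/bin/sh\n") + newc_entry("TRAILER!!!", b"")
print(newc_names(archive))  # ['init']
```

Whether walking these headers counts as "interpreting" is exactly the disagreement in this thread; the format itself is just a flat sequence of records.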

[−] tosti 48d ago
By that logic, everything is executable. Not entirely wrong, but mostly because of vulnerabilities. Not because of a highly contrived way of using a file format to run a program. You could do the same thing with json or xml.
[−] saidnooneever 48d ago
I agree; it's linkers and loaders which parse a file and extract meaning, and those might in turn be executed by yet another program that runs in the kernel and creates a 'user space' part to run the untrusted code in.

It's not as simple as 'executable file, blap, now it runs'. They made it look like that on the surface so people don't need to bother with the details.

A lot of this article writes about the abstractions and maybe how they work. Like you, I found it hard to read. That doesn't mean it's all wrong, though; maybe there are more ways to look at a system which has layered abstractions. Each layer can be a different view and be discussed independently in its design.

If you look at what the CPU and kernel code are doing, it's a messy affair :D hard to talk about (fun though :D and imho good to understand, as you pointed out)

[−] saidnooneever 48d ago
ELF is an 'image' format, which is basically a format to store things in ('image' being a legacy term).

An ELF file is not executable by itself, but depending on what you do, a linker and loader might cause the operating system to execute some or all of its contents at some point.

[−] vatsachak 48d ago
Isn't every OS an interpreter for machine code with kernel privileges?
[−] BenjiWiebe 48d ago
No. The OS's software doesn't individually read each instruction and decide what to do with it.

It passes it off to the hardware (CPU) which runs the instructions.
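The distinction is easy to see if you write an actual interpreter: software that fetches each instruction itself and dispatches on it. A toy Python sketch of a hypothetical two-opcode machine:

```python
def run(program, x=0):
    """A genuine interpreter: fetch each (op, arg) pair, inspect it,
    and decide in software what to do. A kernel does nothing like
    this for native code; it sets up memory and lets the CPU fetch
    instructions directly."""
    for op, arg in program:
        if op == "add":
            x += arg
        elif op == "mul":
            x *= arg
        else:
            raise ValueError(f"unknown opcode: {op}")
    return x

print(run([("add", 2), ("mul", 10)]))  # 20
```

This fetch-inspect-dispatch loop is what CPython, a shell, or a JVM does to its input; the Linux kernel never runs such a loop over a native program's instructions.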

[−] lolsowrong 48d ago
Most of the time. But sometimes, no. See ATL thunk emulation (last I checked, still alive in the windows kernel) and ntvdm handling of the BOP pseudoinstruction.

See also: Jazelle DBX.

Hell, on modern x86 processors, many “native” instructions are actually a series of micro-ops for a mostly undocumented and mostly poorly understood microcode architecture that differs from the natively documented instruction set.

It’s turtles all the way down.

[−] fc417fc802 48d ago
Aren't all of them microcoded? Some years back root was achieved on a line of intel processors and new instructions implemented as proof of concept. There's an academic paper, citation not immediately to hand.
[−] trynumber9 48d ago
Some instructions are microcoded but others take the fast path and avoid the microcode sequencer. Can't patch the latter in microcode RAM.
[−] lolsowrong 48d ago
I saw the paper from Google last year and thought something in it aligned with not everything running through the microcode engine, though I could be wrong.
[−] fc417fc802 48d ago
Might well be the case. I don't think I'm familiar with the paper you're referring to; any chance of at least a vague description?
[−] lolsowrong 48d ago
Can’t find the pdf, but it’s all related to the zentool stuff:

https://github.com/google/security-research/blob/master/pocs...

Tavis spells it out there pretty quickly:

“The simplest instructions (add, sub, mov, etc) are all implemented in hardware. The more complicated instructions like rdrand, fpatan and cmpxchg are microcoded. You can think of them as a bit like calling into a library of functions written in that RISC-like code.”

[−] saagarjha 48d ago
Jazelle and micro-ops are not interpreters, they are executed in hardware.
[−] lolsowrong 48d ago
I believe only some parts of Jazelle are handled in hardware, though I don’t know if anybody has got their hands on any of the bits of the software side. I do know there’s documentation on handling unimplemented instructions.

I don’t know how I feel about micro-ops being executed in hardware - I mostly agree, but also, microcode updates exist…

[−] fc417fc802 48d ago
An interpreter implemented in hardware is still an interpreter. Hot take, all machine instruction sets are scripting languages and LLVM is a transpiler.
[−] saidnooneever 48d ago
An OS is an interface that allows use of system resources. These days it is usually a collection of software and interfaces that do this, because system resources are complex, especially to use securely. The CPU interprets machine code; the OS might tell the CPU what to execute (might, depending on the design).
[−] astralbijection 48d ago
This one is an interpreter for CPIO files.
[−] djmips 48d ago
Everything is an interpreter?
[−] tsoukase 48d ago
Yes, except compilers.
[−] tnwhitwell 48d ago
Aren’t they the most interpretery of interpreters? Just the languages aren’t Spanish and English, but C and machine code
[−] imtringued 48d ago
They're ahead of time interpreters.
[−] qubex 48d ago
Turing’s Theta Combinator
[−] ogghostjelly 47d ago
How is Turing's theta combinator related to the article? I'm not very familiar with functional programming concepts.
[−] qubex 47d ago
Θ(f) = f(Θ(f))
[−] lstodd 48d ago
man ld.so:

... (in which case no command-line options to the dynamic linker can be passed and, in the ELF case, the dynamic linker which is stored in the .interp section of the program is executed)

note how the ELF section is named.
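Concretely, PT_INTERP is just a program header pointing at that string. A minimal Python sketch that digs it out of a hand-built, synthetic 64-bit little-endian ELF image (on a real binary, `readelf -l /bin/ls` shows the same path):

```python
import struct

PT_INTERP = 3  # program-header type that names the dynamic linker

def elf64_interp(data: bytes):
    """Return the interpreter path recorded in a 64-bit little-endian
    ELF image's PT_INTERP segment (the .interp contents), or None
    for a static binary."""
    (phoff,) = struct.unpack_from("<Q", data, 32)           # e_phoff
    phentsize, phnum = struct.unpack_from("<HH", data, 54)  # e_phentsize, e_phnum
    for i in range(phnum):
        base = phoff + i * phentsize
        (p_type,) = struct.unpack_from("<I", data, base)
        if p_type == PT_INTERP:
            p_offset, _, _, p_filesz = struct.unpack_from("<QQQQ", data, base + 8)
            return data[p_offset:p_offset + p_filesz].rstrip(b"\x00").decode()
    return None

# Hand-built image: ELF header with e_phoff=64, one PT_INTERP program
# header at offset 64, and the interpreter string at offset 120.
interp = b"/lib64/ld-linux-x86-64.so.2\x00"
phdr = struct.pack("<IIQQQQQQ", PT_INTERP, 0, 120, 0, 0,
                   len(interp), len(interp), 1)
fake = (b"\x7fELF" + bytes(28) + struct.pack("<Q", 64)
        + bytes(14) + struct.pack("<HH", 56, 1) + bytes(6)
        + phdr + interp)
print(elf64_interp(fake))  # /lib64/ld-linux-x86-64.so.2
```

So the file itself names an "interpreter", and the kernel's job on execve is to map that interpreter in and hand it control.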

[−] TZubiri 48d ago
From earlier in the series.

"Okay, so the reason I initially did this was because I didn’t want to pay Contabo an extra $1.50/mo to have object storage just to be able to spawn VPSes from premade disk images."

I think there's a sweet spot between "I spent 50 hours to save $1.50/mo" and "every engineer should be spending $250K/mo in tokens".

Host employees still need to eat. If we can't afford $1.50/mo, then we aren't really professionals and are just coasting on real infrastructure subsidized by professionals that pay for the pay-as-you-go infrastructure.

It's still possible to go even further to these extremes; there are thousands of developers that just coast by on GitHub Pages and Vercel subdomains. So at least having a VPS puts you ahead of that mass competitively, but trying to save $1.50/mo is a harsh place to be. At that point I don't think the technical skills are the bottleneck; it's more likely that there's some social work that needs to be done, and that obsessing over running doom on curl is not a very productive use of one's time in a critical economic spot.

I write this because I am in that spot, but perhaps I'm reading a bit much into it.

[−] PhilipRoman 48d ago
That sounds like something I would've done... When I was a kid, the 5€/month for a VPS was a massive expense, to the point where I occasionally had to download my 10GB rootfs to my mom's windows laptop, terminate the instance and then rebuild it once I had enough money. Eventually I got an old Kindle that was able to run an app called Terminal IDE which had a Linux shell with some basic programs like busybox, gcc. Spartacus Rex, if you're out there, thank you for making my entire career possible.
[−] bityard 48d ago
The author did write that, yes. But it's very obviously a joke. The real reasons are literally the very next paragraph:

> I thought it was a neat trick, a funny shitpost that riffs on the eternal curl | sh debate. I could write a blog post about it, I tell you about how you can do it yourself, one thousand words, I learn something, you learn something, I get internet points, win win.

[−] cardanome 48d ago

> it's more likely that there's some social work that needs to be done, and that obsessing over running doom on curl is not a very productive use of one's time in a critical economic spot.

It can be a problem but it can be also just a human following their special interests that give them joy.

For me as an ADHD person, engaging with my special interests is a hard requirement to keep my mental health in check, and therefore a very good use of my time.

[−] jmalicki 48d ago
I like the term host employee, carrying the LLM parasite as it uses us to embody itself and reproduce into the singularity.
[−] pwdisswordfishy 48d ago

> if we can't afford $1.50/mo, then we aren't really professionals and are just coasting on real infrastructure subsidized by professionals

This is a strange claim.

Whether someone is getting paid or not to do something is what determines who is a professional, not whether or how much they're paying someone else. (And that's the only thing that matters, unlike the way that "professional" is used as a euphemism in Americans' bizarre discursive repertoire.)

[−] astralbijection 48d ago
... I think you're reading a bit much into it. It's less that I couldn't afford to pay that, and more that I didn't want to pay that, and iterating on the solution I used to dodge that led me down a giant rabbit hole of learning more about Linux while solving stupider and stupider problems posed for myself.
[−] hrmtst93837 48d ago
[flagged]
[−] Roshan_Roy 48d ago
I think the article works better as a mental model than a literal claim. “Linux is an interpreter” feels wrong if you define interpretation strictly at the CPU instruction level, but it becomes more reasonable if you look at the kernel as something that interprets executable formats and environments (ELF, scripts with shebangs, initramfs, etc.). In that sense it’s less about instruction-by-instruction interpretation and more about orchestrating how different representations of programs become runnable. Maybe the confusion here is mixing those two meanings of “interpreter”.
[−] shevy-java 48d ago
Well - Linux is kind of like a somewhat generic interface for actionable, programmable tasks. One could use Windows for this too, but IMO Linux is in general better suited for it.
The only area where I think Windows may be better is the graphical user interface. Now, the Windows interface annoys me to no end, but GNOME annoys me and KDE annoys me too. I have mostly been using fluxbox or icewm, sometimes xfce or mate-desktop when I feel fancy, but by and large I think my "hardcore desktop days" are over. I want things to be fast, efficient and simple. Most of the work I do I handle via the command line, plus a bit of web browsing and writing code/text in an editor (say, 95% of my activities).
The only area I think Windows may be better is the graphical user interface. Now, the windows interface annoys me to no ends, but GNOME annoys me and KDE annoys me too. I have been more using fluxbox or icewm, sometimes when I feel fancy xfce or mate-desktop, but by and large I think my "hardcore desktop days" are over. I want things to be fast and efficient and simple. Most of the work I do I handle via the commandline and a bit of web-browsing and writing code/text in an editor, for the most part (say, 95% of the activities).