This is amazing! I love seeing FRACTRAN-shaped things on the homepage :) This reminds me of how 1-bit stacks are encoded in binary:
A stack of zeros and ones can be encoded in a single number using bit-shifting and incrementing.
Pushing a 0 onto the stack is equivalent to doubling the number.
Pushing a 1 is equivalent to doubling and adding 1.
Popping is equivalent to dividing by 2, where the remainder is the popped bit.
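A minimal sketch of that encoding in Python (the empty-stack sentinel of 1 is my own convention, not from the comment; it just lets an empty stack be told apart from a stack of zeros):

    def push(stack, bit):
        # pushing doubles the number and adds the new bit (0 or 1)
        return stack * 2 + bit

    def pop(stack):
        # popping divides by 2; the remainder is the popped bit
        return stack // 2, stack % 2

    s = 1              # empty stack (sentinel bit)
    s = push(s, 0)     # 0b10
    s = push(s, 1)     # 0b101
    s, bit = pop(s)    # bit == 1, s == 0b10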
I use something not too far off in my daily programming, based on a similar idea:
Rejoice is a concatenative programming language in which data is encoded as multisets that compose by multiplication. Think Fractran, without the rule-searching, or Forth without a stack.
You missed the part where the number is being used as a 1-bit stack. They never claimed novelty, it's just a neat technique that might be unfamiliar to most people.
This isn't unique, or even the lowest-compute way to do this. For example, let f(x,y) = 1/(x-y). This too is universal. I think there's a theorem stating that for any finite set of binary operators there is a single one that can replace them all.
write x#y for 1/(x-y).
x#0 = 1/(x-0) = 1/x, so you get reciprocals.
Then (x#y)#0 = 1/((1/(x-y)) - 0) = x-y, so subtraction.
It's a common exercise to show that in any (insert algebraic structure here) inverse and subtraction give all 4 elementary ops.
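As a quick sanity check (my own snippet, covering only the ops derived above plus addition, not the longer multiplication/division construction the classic result implies):

    def sharp(x, y):           # x#y = 1/(x - y)
        return 1 / (x - y)

    def recip(x):              # x#0 = 1/x
        return sharp(x, 0)

    def sub(x, y):             # (x#y)#0 = x - y
        return sharp(sharp(x, y), 0)

    def neg(y):                # 0 - y
        return sub(0, y)

    def add(x, y):             # x - (0 - y)
        return sub(x, neg(y))

    print(recip(4.0))          # 0.25
    print(sub(7.0, 3.0))       # 4.0
    print(add(2.0, 5.0))       # 7.0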
Ooh, that 2nd link has a nice construction by Terry Tao giving a clear way to show infinitely many such functions exist for pretty much any set of operations.
I’m fairly certain that the difference between the approaches is that the f(x,y) function you mentioned requires limits to represent certain concepts while the eml approach is essentially a tree or a chain of computations meant to represent a model of a system.
They're not series; that's just a convenient way to think about defining and calculating them. I've never found it particularly useful to deal with the series definitions either, and none of the (good) approximation methods I'm aware of actually take that approach.
Moreover, EML is complete in a way that your suggested function isn't: if you take a finite combination of basis functions, can it build periodic functions? Hardy proved over a century ago that real (+,-,/,*,exp,ln) can't do this (which also answers, in the negative, the paper's unresolved question about similar real-valued functions). EML being able to build periodic functions is a lot less surprising for obvious reasons, but still pretty neat.
EML goes through complex numbers and infinities internally.
Previous commenter meant that there are infinite series inside log and exp.
To achieve a thing like sin(x) from his universal expression, yes, you'd need infinite series of those, not a finite set of operations, but to get all 36 basic functions you'd need a finite set of different infinite series.
So exp and log just happen to be the labels for the two-element set of such infinite series that you'd have to use to make the 36 functions out of the operator the previous commenter proposed.
All that said, I still think it's pretty neat too.
I understand your point. The paper is more about the depth of the tree needed to represent and audit a model versus the raw CPU clock cycles. It takes the exponential and logarithm as given since, for all practical purposes in a scientific context, they are.
To represent something like sin(x) with f(x,y) requires infinitely many steps. Conversely, with eml you get an exact result in around 4 steps, using identities and such.
One could argue that we do Taylor Series approximations on the hardware to represent trigonometric functions, but that highlights the key aspect of the eml approach. You can write a paper with those four steps that describes an exact model, here sin(x). And people can take that paper and optimize the result.
This paper is about an auditable grammar that you can compute with.
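For what it's worth, the reason a short exact tree for sin(x) is plausible is Euler's identity, which reduces sin to exp, subtraction, and division over the complex numbers. A quick SymPy check of that identity (my snippet, not the paper's actual eml tree):

    import math
    import sympy as sp

    x = sp.symbols('x')
    euler_sin = (sp.exp(sp.I * x) - sp.exp(-sp.I * x)) / (2 * sp.I)

    print(sp.sin(x).rewrite(sp.exp))        # -I*(exp(I*x) - exp(-I*x))/2
    print(complex(euler_sin.subs(x, 1.3)))  # ~0.9636 (plus a negligible imaginary part)
    print(math.sin(1.3))                    # ~0.9636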
You can reduce all Boolean logic to NAND, but that doesn't actually mean that semiconductor fabs translate their HDL to NAND gates, because it is possible to build complex gates that directly implement higher level operations in a single gate.
Your "cost of computation" objection can be easily resolved by adding more operators, which makes it boring from a research perspective.
Meanwhile the loss of expressivity can only be compensated by encoding algorithms directly into the expression tree. Your objection that an infinite series is a bad thing rings hollow, since you now introduce the concept of an infinitely sized expression tree. That sounds much more impractical than implementing an algorithm for the exponential and logarithm functions.
Show me a way to physically compute exp or ln that takes fewer gates than add. More gates means higher cost in $, more energy per computation, and, for these functions, higher latency.
You don’t get to make up free ops, claim there is no cost in reality, and hand wave away reality.
There are infinitely many ways to do what the paper did. There’s no gain other than it’s pretty. It loses on every practical front to simply using current ops and architectures.
Conceptual elegance is worth something, isn't it? I don't mean just aesthetic pleasure, as in recreational mathematics, but there's often (not always) value in being able to succinctly describe a wide range of phenomena with a small number of primitives. It could open up new ways of understanding or working that weren't possible before. Not saying this specific discovery fits the description, but it seems too early to dismiss the idea entirely based on its im/practicality compared to existing solutions.
Aren't there examples in the history of mathematics, where a new idea was criticized for being impractical, then later found to have applications or implications, possibly unexpected even to the person who discovered it?
The feasibility of memristor analog circuits is evident, and I believe this paper represents a valuable early exploration. We've been constrained by Boolean logic for quite some time now.
> This includes constants such as e, pi, and i; arithmetic operations including addition, subtraction, multiplication, division, and exponentiation as well as the usual transcendental and algebraic functions.
You forgot to create numbers. You need 0 (explicitly used in your construction), and you need another number so you can generate numbers other than 0 and infinity.
I don't think this can do any of the "standard" constants or what we generally consider to be closed-form expressions, though! (E.g., no e, pi, exp, log, etc.)
> A calculator with just two buttons, EML and the digit 1, can compute everything a full scientific calculator does
Reminds me of the Iota combinator, one of the smallest formal systems that can be combined to produce a universal Turing machine, meaning it can express all of computation.
EDIT: please change the article link to the most recent version (as of now still v2), it is currently pointing to the v1 version which misses the figures.
I'm still reading this, but if this checks out, this is one of the most significant discoveries in years.
Why use splines or polynomials or haphazardly chosen basis functions if you can just fit (gradient descent) your data or wave functions to the proper computational EML tree?
Got a multidimensional and multivariate function to model (with random samples or a full map)? Just do gradient descent and convert it to approximant EML trees.
Perform gradient descent on EML function tree "phi" so that the derivatives in the Schroedinger equation match.
But as I said, still reading, this sounds too good to be true, but I have witnessed such things before :)
All possible 36 distinct level-2 eml functions of one variable (the first 18 of them with entirely Real outputs, the other 18 with "intermediate" complex-valued components):
derivation of -x seems wrong. we can look at the execution trace on a stack machine, but it's actually not hard to see. starting from the last node before the output, we see that the tree has the form
and the claim is that, after it's expanded, z will be such that this whole thing is equal to -x. but with some algebra, this is happening only if
e^z = 0,
and there is no complex number z that satisfies this equation. indeed if we laboriously expand the given formula for z (the left branch of the tree), we see that it goes through ln(0), and compound expressions.
x^-1 has the same problem.
both formulae work ...sort of... if we allow ln(0) = -Infinity and some other moxie, such as x / Infinity = 0 for all finite x.
For completeness, there is also Peirce's arrow, aka the NOR operation, which is functionally complete. Fun application: IIRC the VMProtect copy-protection system has an internal VM based on NOR.
> using EML trees as trainable circuits ..., I demonstrate the feasibility of exact recovery of closed-form elementary functions from numerical data at shallow tree depths up to 4
That's awesome. I always wondered if there is some way to do this.
I made a fun marimo notebook to try and derive these myself. I structured each cell in order based on the diagram at the end of the paper. It uses Sympy to determine if the function is correct or not.
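In that spirit, here is a tiny SymPy check of two of the eml forms quoted elsewhere in the thread, exp(x)=eml(x,1) and ln(x)=eml(1,eml(eml(1,x),1)); the snippet is mine, the identities are the paper's:

    import sympy as sp

    x = sp.symbols('x', positive=True)   # keeps ln single-valued for the check

    def eml(a, b):
        return sp.exp(a) - sp.log(b)

    print(sp.simplify(eml(x, 1) - sp.exp(x)))                  # 0
    print(sp.simplify(eml(1, eml(eml(1, x), 1)) - sp.log(x)))  # 0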
Few ideas that come to my mind when reading this:
1. One should also add absolute value (as sqrt(x*x)?) as a desired function, and from that min, max, and signum in the available functions. Since the domain is complex, some of them will be a bit weird, I am not sure.
2. I think, for any bijective function f(x) which, together with its inverse, is expressible using eml(), we can obtain another universal basis eml(f(x),f(y)) with the added constant f^-1(1). Interesting special case is when f=exp or f=ln. (This might also explain the EDL variant.)
3. The eml basis uses natural logarithm and exponent. It would be interesting to see if we could have a basis with function 2^x - log_2(y) and constants 1 and e (to create standard mathematical functions like exp,ln,sin...). This could be computationally more feasible to implement. As a number representation, it kinda reminds me of https://en.wikipedia.org/wiki/Elias_omega_coding.
4. I would like to see an algorithm for finding derivatives of the eml() trees. This could yield a rather clear proof of why some functions do not have indefinite integrals in symbolic form.
5. For some reason, extending the domain to complex numbers made me think about fuzzy logics with complex truth values. What would be the logarithm and exponential there? It could unify the Lukasiewicz and product logics.
Can someone explain how this is different from lambda calculus? It seems like you can derive the same things in both. I don't understand either well enough, hence the question.
Stupid question maybe (I am no mathematician), but are exp and ln really primitives? Aren't they implemented in terms of +, -, /, * etc.? Or do we assume that we have an infinite lookup table for all possible inputs?
>"Single, reusable primitives play a disproportionately large aesthetic and practical role in mathematics, engineering, and even biology. Widely known classical examples include the NAND gate (and its dual, Peirce Arrow, logical NOR) for Boolean 0/1 logic [2, 12], the operational amplifier [13] for positive and negative feedback processes, and, more recently, the rectified linear unit (ReLU) "ramp" activation function [14] in deep learning [15]. We also mention Wolfram's single axiom [16], K,S combinators from combinatory logic [17, 18], Interaction Combinators [19], and fuzzy versions of the Sheffer stroke [20]. Other well-known examples are one-instruction set computers (OISC), e.g. SUBLEQ [21], Conway's FRACTRAN [22] and the Rule 110 cellular automaton [16, 23]."
Built a JS implementation today — monogate
https://explorer-taupe-five.vercel.app · npm install monogate
The interesting engineering problem was negation. The paper's SI gives one construction; we independently derived a two-regime approach — tower formula for y≤0, shift formula for y>0 — that stays stable to |y|<708 in IEEE 754.
We also extended to ℂ. After seeing pveierland's result in this thread (i constructible from {1} in K=75 under extended-reals convention), we investigated and documented the distinction: under strict principal-branch ln where ln(0) throws, whether {1} alone generates i remains open. Under the extended-reals convention used in the paper, their construction holds. Two different grammars, not contradictory results.
109 tests. MIT. github.com/almaguer1986/monogate
This article inspired us to test whether EML trees can be trained using gradient descent for symbolic regression: instead of searching for the right tree, is it possible to optimize one from start to finish?
We implemented the EML trees as a differentiable PyTorch module. Each leaf is a softmax mixture over {0, 1, x}, and the tree is evaluated from the bottom up. The entire system can be trained with Adam.
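A rough sketch of what such a node might look like (my guess at the setup, not the authors' code; the clamp guard, depth handling, and training hyperparameters are assumptions):

    import torch
    import torch.nn as nn

    class EMLLeaf(nn.Module):
        # leaf = softmax mixture over the basis {0, 1, x}
        def __init__(self):
            super().__init__()
            self.logits = nn.Parameter(torch.zeros(3))

        def forward(self, x):
            w = torch.softmax(self.logits, dim=0)
            basis = torch.stack([torch.zeros_like(x), torch.ones_like(x), x])
            return (w[:, None] * basis).sum(dim=0)

    class EMLNode(nn.Module):
        # internal node applies eml(a, b) = exp(a) - ln(b), evaluated bottom-up
        def __init__(self, depth):
            super().__init__()
            self.leaf = EMLLeaf() if depth == 0 else None
            self.left = EMLNode(depth - 1) if depth > 0 else None
            self.right = EMLNode(depth - 1) if depth > 0 else None

        def forward(self, x, eps=1e-6):
            if self.leaf is not None:
                return self.leaf(x)
            a, b = self.left(x), self.right(x)
            # clamp keeps ln differentiable near zero; real numerics need more care
            return torch.exp(a) - torch.log(torch.clamp(b.abs(), min=eps))

    # e.g. fit a depth-2 tree to exp(x) on [0, 1] with Adam
    tree, xs = EMLNode(depth=2), torch.linspace(0.0, 1.0, 64)
    opt = torch.optim.Adam(tree.parameters(), lr=0.05)
    for _ in range(300):
        opt.zero_grad()
        loss = ((tree(xs) - torch.exp(xs)) ** 2).mean()
        loss.backward()
        opt.step()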
Results: 7 of the 7 elementary functions (exp, ln, sqrt, x², x³, 1/x, sin) converge with ≤24 parameters at depth 3. Three (exp, ln, sqrt) achieve an RMSE < 0.005.
The main challenge is depth scaling. Random initialization at depth 4 always diverges: the chained exp() calls create towering exponential growth that wipes out the gradients. We tested 12 initialization strategies; only hierarchical hot-starting (training depth n-1 first) works. sin(x²) gets 12.9x better at depth 4 vs depth 3.
Two honest negative results: (1) The trained trees use continuous softmax mixtures, not discrete leaf assignments; therefore, numerical approximations, not exact formulas, are obtained for everything except exp(x). (2) A 49-parameter MLP and PySR outperform it in MSE by orders of magnitude. It's not a practical tool — it just shows that gradient descent can work on S → 1 | eml(S, S) without needing sin, exp, etc. as primitives.
Paper: https://doi.org/10.5281/zenodo.19592926 Code: https://github.com/seetrex-ai/monolith
Not sure it really compares to NAND() and the like.
Simply because bool algebra doesn't have that many functions and all of them are very simple to implement.
A complex bool function made out of NANDs (or the like) is only a little more complex than the same function made out of the other operators.
Implementing even simple real functions out of eml() seems to me to add a lot of computational complexity even with both exp() and ln() implemented in hardware in O(1).
I'm thinking about stuff like sum(), div() and mod().
Of course, I might be badly wrong, as I am not a mathematician (not even close).
But I don't see, at the moment, the big win on this.
Halfway through I was imagining aliens to whom this operator comes naturally and our math is weird. By the end I found out that we might be those aliens.
This could have some interesting hardware implications as well - it suggests that a large dedicated silicon instruction set could accelerate any mathematical algorithm provided it can be mapped to this primitive. It also suggests a compiler/translation layer should be possible as well as some novel visualization methods for functions and methods.
I'm way too unschooled to say if it's important or not, but what really excites me is the Catalan structure ("Every EML expression is a binary tree [...] isomorphic to well-studied combinatorial objects like full binary trees and Catalan objects").
So, what happens if you take say the EML expression for addition, and invert the binary tree?
I couldn't find any information on this, but given how nicely exponentials and logarithms differentiate and integrate, is it possible that this operator may be useful for simplifying the search for symbolic solutions to integrals and derivatives?
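One small thing in that direction (my snippet, just an observation): the eml node itself differentiates back into the same building blocks, which is at least suggestive:

    import sympy as sp

    x = sp.symbols('x')
    f, g = sp.Function('f')(x), sp.Function('g')(x)

    eml = sp.exp(f) - sp.log(g)
    print(sp.diff(eml, x))   # exp(f(x))*Derivative(f(x), x) - Derivative(g(x), x)/g(x)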
this guy just took it from here: Chaotic Systems as Transcendental Structures: A Spiral Calculus Proof Framework
https://www.researchgate.net/publication/391848560_Chaotic_S...
He took my proposed equation
Σ(t) = e^(−Dt) C i x(t)
took the log of both sides, and called it eml.
This looks interesting. I haven't looked in detail, but my first thought is: why hasn't this been found in the past? Surely people have been interested in this kind of question for a while?
The problem with symbolic regression is that ln(y) is undefined at 0, so you can't freely generate expressions with it. We need to guard it with something like ln(1+y*y) or ln(1+|y|), or return undefined.
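For example, the kind of guard meant here (a minimal sketch; the guard forms are the ones named above):

    import math

    def protected_ln(y):
        # always defined; 0 at y = 0 and ~ln(y^2) for large |y|
        return math.log(1.0 + y * y)

    def strict_ln(y):
        # the "return undefined" alternative
        return math.log(y) if y > 0 else float('nan')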
This is really cool! I am going to try this in some of my own research. Another cool thing is trying units other than 1, like adding imaginaries or other funky things, or different bases entirely. You can combine that concept with this for a lot of powerful math techniques.
Got curious to see whether SymPy could be used to evaluate the expressions, so I used Claude Code to build a quick evaluator. Numeric and symbolic results appear to agree:
https://wiki.xxiivv.com/site/rejoice
https://en.wikipedia.org/wiki/Bitwise_operation#Bit_shifts
https://paste.sr.ht/~rabbits/cd2369cc7c72bfad0fcd83e27682095...
I haven't checked this carefully, but this note seems to give a short proof (modulo knowing some other items...) https://dmg.tuwien.ac.at/goldstern/www/papers/notes/singlebi...
It cites a paper from 1935: https://www.pnas.org/doi/10.1073/pnas.21.5.252
Here is a bit more: https://mathoverflow.net/questions/57465/can-we-unify-additi...
> Show me a way to physically compute exp or ln that is less gates than add.
IIRC a resistor in series with a capacitor does the trick, for exp.
> This includes constants such as e, pi, and i; arithmetic operations including addition, subtraction, multiplication, division, and exponentiation as well as the usual transcendental and algebraic functions.
The OP only needs 1 number: 1.
> This too is universal
Could that be used to derive trigonometric functions with single distinct expressions?
Prompt: "look at this paper: https://arxiv.org/pdf/2603.21852. Now please produce 2x+y as a composition of EMLs."
Opus (paid) - claimed that "2" is circular. Once I told it that ChatGPT had already done this, it finished successfully.
ChatGPT (free) - did it on the first try.
Grok - produced an estimate of the depth of the formula.
Gemini - success
Deepseek - assumed some pre-existing knowledge of what EML is. Unable to fetch the PDF from the link, unable to consume the PDF from "Attach file".
Kimi - produced long output, stopped and asked to upgrade
GLM - looks ok
https://imgur.com/a/K7AoOFi
A quick Google search brings up https://github.com/pr701/nor_vm_core, which has the basic idea.
> For example, exp(x)=eml(x,1), ln(x)=eml(1,eml(eml(1,x),1)), and likewise for all other operations
I read the paper. Is there a table covering all other math operations translated to eml(x,y) form?
https://gist.github.com/CGamesPlay/9d1fd0a9a3bd432e77c075fb8...
Posts like these are the reason I check HN every day
It’s a derivation of the Y combinator from ruby lambdas
https://www.wolframscience.com/nks/notes-4-5--operator-repre...
https://github.com/howerj/muxleq
PS: you don't need gforth to compile:
Edit muxleq.fth, add some goodies by editing these values: here are my settings. Then, create a new ".dec" file (subleq program).
Now, to run EForth every time, add these further down in the muxleq.fth code: the ideal place would be just below the ': dabs ' definition.
> eml(x,y)=exp(x)-ln(y)
Exp and ln, isn't the operation its own inverse depending on the parameter? What a neat find.
But no...
This is about continuous math, not ones and zeroes. Assuming peer review proves it out, this is outstanding.
What comes to my mind as an alternative, which I subjectively find nicer, is "axe". Think axiom or axiology.
Anyone with other suggestions? Or even remarks on this one?
One thing I wonder now: NAND is symmetric while this isn't, could something similar be found where function(x, y) = function(y, x)?