The original article explicitly acknowledged this limitation, that while in "the classical differential-algebraic setting, one often works with a broader notion of elementary function, defined relative to a chosen field of constants and allowing algebraic adjunctions, i.e., adjoining roots of polynomial equations," the author works with the less general definition.
Neither the present article, nor the original one has much mathematical originality, though: Odrzywolek's result is immediately obvious, while this blog post is a rehash of Arnold's proof of the unsolvability of the quintic.
Yes, this article is kicking in open doors, the original article was quite clear about the scope.
The present article could rather have spent time arguing why this isn't like NAND gate functional completeness.
I would have thought the differences lie in the other direction: not that trees of EML and 1 can describe too little, but that they can describe too much already. It's decidable whether two NAND circuits implement the same function, I'm pretty sure it's not decidable if two EML trees describe the same function.
Arnold (as reported by Goldmakher [1]) does prove the unsolvability of the quintic in finite terms of arithmetic and single-valued continuous functions (which does not include the complex logarithm). TFA's result is stronger, which is something about the solvability of the monodromy groups of all EML-derived functions. So it doesn't seem to be a "rehash", even if their specific counterexample could have been achieved either in fewer steps or with less machinery.
This may or may not be true; but the burden of proof should not lay with the reader.
Please provide (in absence of which every reader can draw their own conclusions) a reference which simultaneously:
1) predates Odrzywolek's result
2) and demonstrates the other unary and binary operations typically tacitly assumed can be expressed in terms of a single binary operation and a constant.
(in other news: I can spontaneously levitate, I just don't feel like demonstrating it to you right now...)
> My concern is that the word “elementary” in the title carries a much broader meaning in standard mathematical usage, and in this meaning, the paper’s title does not hold.
> Elementary functions typically include arbitrary polynomial roots, and EML terms cannot express them.
If you take a real analysis class, the elementary functions will be defined exactly as the author of the EML paper does.
I've actually just learnt that some consider roots of arbitrary polynomials being part of the elementary functions before, but I'm a physicist and only ever took some undergraduate mathematics classes.
Nonetheless, calling these elementary feels a bit of stretch considering that the word literally means basic stuff, something that a beginner will learn first.
Related is the paper [What is a closed-form number?], which explores the field E, defined as the smallest subfield of ℂ closed under exp and log. I believe the set of numbers that can be generated using exp-minus-log is a strict subset of this.
In a similar vein to this post, the paper points out that general polynomials do not have solutions in E, so of course exp-minus-log is similarly incomplete.
What is intriguing is that we don’t even know whether many simple equations like exp(-x) = x (i.e. the [omega constant]) have solutions in E. We of course suspect they don’t, but this conjecture is not proven: https://en.wikipedia.org/wiki/Schanuel%27s_conjecture
That's a kind of weak criticism. What functions are considered elementary was always going to be arbitrary, picking the set you can generate from exp, log, and some complex algebra is not the worst choice.
If nothing else you could solve simple differential equations with them. And it gives you the 'power' function.
The very fact that the set of functions is largely arbitrary is a much bigger issue. Or at least it limits the use of the fact that you can represent those functions.
Edit: I feel the need to add that just because it is a weak critique doesn't mean the argument itself is not interesting.
When I first read the exp-minus-log paper, I found it extremely surprising - even shocking that such a function could exist.
But the fact that a single function can represent a large number of other functions isn't that surprising at all.
It's probably obvious to anyone (it wasn't initially to me), but given enough arguments I can represent any arbitrary set of n+1 functions (they don't even have to be functions on the reals - just as long as the domain has a multiplicative zero available) as a sort of "selector":
The author essentially says that the quintic has no closed form solution which is true regardless of the exp-minus-log function. The purpose of this blog post is lost on me.
Can anyone please explain this further? It seems like he’s moving the goalposts.
It's news to me that "elementary functions" include roots of arbitrary polynomials, but the wiki article in fact says that they're included at least some of the time. I remember reading about the Risch algorithm (for finding closed form antiderivatives) a long time ago and elementary functions were just the ordinary ones found on calculators.
Interestingly, the abs (absolute value) function is non-elementary. I wonder if exp-minus-log can represent it.
This is a bit like invalidating a result based on 0^0 := 1 because you work in a field of mathematics where 0^0 is an indeterminate form. Not very interesting.
AFAIU the original paper is a result in the field of symbolic regression. What definition of elementary function do they use?
On a tangent: I've tried to connect Euclid's Elements with quantifier elimination theorems. It looks like most of the geometry follows from QE of real-closed fields. Some of the number theory relates to Presburger arithmetic. Some other number theory, including the irrationality of sqrt(2), is down to Skolem. The Pythagorean triples relate to extending Skolem to the Gaussian integers. I suspect some of the "embryonic" integral calculus could be related to holonomic functions, which seem like they admit a form of QE.
Don't have anything for the perfect numbers though.
I wish i had seen this a few days ago i spent days on this before arriving at the following:
It’s completely wrong, for multiple reasons.
First, and immediately, none of the derived total functions are the functions they appear to be, because they are all partial functions.
Worse than that the domain over which these trees are defined is undecidable by Richardson. You can encode Hilbert’s 10th with elementary functions. Which means you can’t algorithmically decide if a tree evaluates to zero so you never know if ln(y) is undefined
So the central claim is done right off the bat.
Secondly and probably more importantly, these trees have exponential blow up problems because even fairly shallow trees involve tetrations of e that don’t cancel. Even in float64 you can‘t add 800+800 without overflow. Clamping it just gives you wrong answers and clamped exp A) isn’t elementary and B) gives bounded growth so can’t generate all elementary functions.
The consequences for his claimed practical application is disastrous. the loss function is NaN everywhere once you get to around depth 6 in his trees. If you clamp it you just get a flat plateau. Even if you had infinite precision the gradient would be unusable and wouldn’t converge.
As a second note, he frequently in the paper talks about using positive reals as inputs. First, positive reals aren’t even part of his generating set (and if they are his whole central claim is wrong, it has uncountably many constants), but even if they are, fundamentally his operations are complex, and you have subtraction so it is not hard to construct trees that should be total but which never the less are undefined at undecideably many points.
Using extended reals doesn’t save it either. There are just different undefined values that lead to undecideability.
None of these arguments are really specific to EML either. Any binary operator that hopes to generate elementary functions must include exp and ln, must be partial and must be able to encode hilbert’s 10th. There is no fix for this.
You might ask why his tests didn’t find any of this but if you look at his code, they do. He just carefully restricts them to a narrow domain where they happen to work and even then he does things like drop imaginary components and filters out undefined results. He doesn‘t even really hide this in the paper, he just dismisses them as implementation details that can be fixed but the problems are structural
So what we are left with is a generator that can generate exp(x) cleanly and essentially nothing else.
118 comments
Neither the present article, nor the original one has much mathematical originality, though: Odrzywolek's result is immediately obvious, while this blog post is a rehash of Arnold's proof of the unsolvability of the quintic.
The present article could rather have spent time arguing why this isn't like NAND gate functional completeness.
I would have thought the differences lie in the other direction: not that trees of EML and 1 can describe too little, but that they can describe too much already. It's decidable whether two NAND circuits implement the same function, I'm pretty sure it's not decidable if two EML trees describe the same function.
[1] https://en.wikipedia.org/wiki/Richardson%27s_theorem
> It's decidable whether two NAND circuits implement the same function
Well, sure. At least, until you have a loop that starts clocking for you, and now you've got the halting problem.
> Neither the present article, nor the original one has much mathematical originality, though: Odrzywolek's result is immediately obvious, [...]
Maybe. But I found it a nice piece of recreational mathematics nevertheless.
[1] https://web.williams.edu/Mathematics/lg5/394/ArnoldQuintic.p...
> Odrzywolek's result is immediately obvious
Many things that in retrospect seem immediately obvious weren't obvious before, let alone immediately obvious.
> Odrzywolek's result is immediately obvious
This may or may not be true; but the burden of proof should not lay with the reader.
Please provide (in absence of which every reader can draw their own conclusions) a reference which simultaneously:
1) predates Odrzywolek's result
2) and demonstrates the other unary and binary operations typically tacitly assumed can be expressed in terms of a single binary operation and a constant.
(in other news: I can spontaneously levitate, I just don't feel like demonstrating it to you right now...)
> My concern is that the word “elementary” in the title carries a much broader meaning in standard mathematical usage, and in this meaning, the paper’s title does not hold.
> Elementary functions typically include arbitrary polynomial roots, and EML terms cannot express them.
If you take a real analysis class, the elementary functions will be defined exactly as the author of the EML paper does.
I've actually just learnt that some consider roots of arbitrary polynomials being part of the elementary functions before, but I'm a physicist and only ever took some undergraduate mathematics classes. Nonetheless, calling these elementary feels a bit of stretch considering that the word literally means basic stuff, something that a beginner will learn first.
In a similar vein to this post, the paper points out that general polynomials do not have solutions in E, so of course exp-minus-log is similarly incomplete.
What is intriguing is that we don’t even know whether many simple equations like exp(-x) = x (i.e. the [omega constant]) have solutions in E. We of course suspect they don’t, but this conjecture is not proven: https://en.wikipedia.org/wiki/Schanuel%27s_conjecture
What is a closed-form number?: http://timothychow.net/closedform.pdf omega constant: https://en.wikipedia.org/wiki/Omega_constant
If nothing else you could solve simple differential equations with them. And it gives you the 'power' function.
The very fact that the set of functions is largely arbitrary is a much bigger issue. Or at least it limits the use of the fact that you can represent those functions.
Edit: I feel the need to add that just because it is a weak critique doesn't mean the argument itself is not interesting.
But the fact that a single function can represent a large number of other functions isn't that surprising at all.
It's probably obvious to anyone (it wasn't initially to me), but given enough arguments I can represent any arbitrary set of n+1 functions (they don't even have to be functions on the reals - just as long as the domain has a multiplicative zero available) as a sort of "selector":
g(x_0, c_0, x_1, c_1, ... , x_n, c_n) = c_0 * f_0(x_0) + ... + c_n * f_n(x_n)
The trick is to minimize the number of arguments and complexity of the RHS - but that there's a trivial upper-bound (in terms of number of arguments).
Can anyone please explain this further? It seems like he’s moving the goalposts.
> Elementary functions typically include arbitrary polynomial roots
Admittedly this may be above my math level, but this just seems like a bad definition of elementary functions, given the context.
Interestingly, the abs (absolute value) function is non-elementary. I wonder if exp-minus-log can represent it.
AFAIU the original paper is a result in the field of symbolic regression. What definition of elementary function do they use?
Don't have anything for the perfect numbers though.
It’s completely wrong, for multiple reasons.
First, and immediately, none of the derived total functions are the functions they appear to be, because they are all partial functions.
Worse than that the domain over which these trees are defined is undecidable by Richardson. You can encode Hilbert’s 10th with elementary functions. Which means you can’t algorithmically decide if a tree evaluates to zero so you never know if ln(y) is undefined
So the central claim is done right off the bat.
Secondly and probably more importantly, these trees have exponential blow up problems because even fairly shallow trees involve tetrations of e that don’t cancel. Even in float64 you can‘t add 800+800 without overflow. Clamping it just gives you wrong answers and clamped exp A) isn’t elementary and B) gives bounded growth so can’t generate all elementary functions.
The consequences for his claimed practical application is disastrous. the loss function is NaN everywhere once you get to around depth 6 in his trees. If you clamp it you just get a flat plateau. Even if you had infinite precision the gradient would be unusable and wouldn’t converge.
As a second note, he frequently in the paper talks about using positive reals as inputs. First, positive reals aren’t even part of his generating set (and if they are his whole central claim is wrong, it has uncountably many constants), but even if they are, fundamentally his operations are complex, and you have subtraction so it is not hard to construct trees that should be total but which never the less are undefined at undecideably many points.
Using extended reals doesn’t save it either. There are just different undefined values that lead to undecideability.
None of these arguments are really specific to EML either. Any binary operator that hopes to generate elementary functions must include exp and ln, must be partial and must be able to encode hilbert’s 10th. There is no fix for this.
You might ask why his tests didn’t find any of this but if you look at his code, they do. He just carefully restricts them to a narrow domain where they happen to work and even then he does things like drop imaginary components and filters out undefined results. He doesn‘t even really hide this in the paper, he just dismisses them as implementation details that can be fixed but the problems are structural
So what we are left with is a generator that can generate exp(x) cleanly and essentially nothing else.
Tests for the trig functions aren't passing yet due to an issue with the derived eml form in some mirrored cases.
Also I'd be glad to see a specific example of a function, considered elementary, which is not representable by EML.
It could be hard, and in any case, thanks for the article. I wish it would be more accessible to me.
https://en.wikipedia.org/wiki/Template:Mathematical_expressi...