What every computer scientist should know about floating-point arithmetic (1991) [pdf] (itu.dk)

by jbarrow 56 comments 125 points


[−] tomhow 62d ago
What Every Computer Scientist Should Know About Floating-Point Arithmetic (1991) - https://news.ycombinator.com/item?id=23665529 - June 2020 (85 comments)

What Every Computer Scientist Should Know About Floating-Point Arithmetic - https://news.ycombinator.com/item?id=3808168 - April 2012 (3 comments)

What Every Computer Scientist Should Know About Floating-Point Arithmetic - https://news.ycombinator.com/item?id=1982332 - Dec 2010 (14 comments)

What Every Computer Scientist Should Know About Floating-Point Arithmetic - https://news.ycombinator.com/item?id=1746797 - Oct 2010 (2 comments)

Weekend project: What Every Programmer Should Know About FP Arithmetic - https://news.ycombinator.com/item?id=1257610 - April 2010 (9 comments)

What every computer scientist should know about floating-point arithmetic - https://news.ycombinator.com/item?id=687604 - July 2009 (2 comments)

[−] randusername 61d ago
For anyone turned off by this document and its proofs, I recommend Numerical Methods for Scientists and Engineers (Hamming). Still a math text, but more approachable.

The five key ideas from that book, enumerated by the author:

(1) the purpose of computing is insight, not numbers

(2) study families and relationships of methods, not individual algorithms

(3) roundoff error

(4) truncation error

(5) instability

[−] ses1984 61d ago
Can you elaborate a little on what 1 is supposed to mean?
[−] randusername 61d ago

> This motto is often thought to mean that the numbers from a computing machine should be read and used, but there is much more to the motto. The choice of the particular formula, or algorithm, influences not only the computing but also how we are to understand the results when they are obtained. The way the computing progresses, the number of iterations it requires, or the spacing used by a formula, often sheds light on the problem...

> Thus computing is, or at least should be, intimately bound up with both the source of the problem and the use that is going to be made of the answers; it is not a step to be taken in isolation from reality.

(From "An Essay on Numerical Methods", p. 3 of the mentioned text; emphasis the author's.)

[−] anymouse123456 61d ago
Not the OP, but I suspect it means focus on what questions are being asked first, and even then, look for opportunities to simplify wherever you find them.

So many of us spend so much time getting enamoured with technical solutions to problems that no one cares about.

[−] lifthrasiir 62d ago
(1991). This article is also available in HTML: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h...
[−] jbarrow 61d ago
Shared this because I was having fun thinking through floating point numbers the other day.

I worked through what fp6 (e3m2) would look like, doing manual additions and multiplications, showing cases where the operations are non-associative, etc., and then I wanted something more rigorous to read.

For anyone interested in floating point numbers, I highly recommend working through fp6 as an activity! Felt like I truly came away with a much deeper understanding of floats. Anything less than fp6 felt too simple/constrained, and anything more than fp6 felt like too much to write out by hand. For fp6 you can enumerate all 64 possible values on a small sheet of paper.

For anyone not (yet) interested in floating point numbers, I’d still recommend giving it a shot.
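The enumeration exercise above can be sketched in a few lines. Note the bias of 3 and the IEEE-style reserved top exponent (infinities/NaNs) are my assumptions; the comment doesn't pin down those details of the fp6 (e3m2) layout.

```python
# Decode a 6-bit fp6 (e3m2) pattern: 1 sign bit, 3 exponent bits (assumed
# bias 3, top exponent reserved for inf/nan), 2 mantissa bits.
def fp6_value(bits):
    sign = -1.0 if (bits >> 5) & 1 else 1.0
    exp = (bits >> 2) & 0b111
    man = bits & 0b11
    if exp == 0b111:                      # reserved: infinities and nans
        return sign * float("inf") if man == 0 else float("nan")
    if exp == 0:                          # subnormal: no implicit leading 1
        return sign * (man / 4) * 2.0 ** (1 - 3)
    return sign * (1 + man / 4) * 2.0 ** (exp - 3)

values = [fp6_value(b) for b in range(64)]    # all 64 encodings fit on a page
finite = sorted(v for v in values if v == v and abs(v) != float("inf"))
```

From the `finite` list one can then redo the parent's by-hand additions and multiplications, rounding each result back into the set to see the non-associativity.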

[−] atoav 61d ago
One thing that really did it for me was programming something where you would normally use floats (audio/DSP) on a platform where floats were abysmally slow. This forced me to explore fixed-point options, which in turn forced me to learn what the differences from floats actually are.
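A minimal fixed-point sketch in the spirit of that exercise: Q1.15 format (15 fractional bits), the kind of representation one reaches for when hardware floats are slow. The names here are illustrative, not from any particular DSP library.

```python
Q = 15  # number of fractional bits (Q1.15)

def to_q15(x):
    return int(round(x * (1 << Q)))      # real -> fixed-point integer

def q15_mul(a, b):
    return (a * b) >> Q                  # product has 2Q fraction bits; rescale

half, quarter = to_q15(0.5), to_q15(0.25)
result = q15_mul(half, quarter) / (1 << Q)   # back to real: 0.5 * 0.25 = 0.125
```

The manual rescaling after every multiply is exactly the bookkeeping the float format automates, which is why doing it by hand is so instructive.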
[−] jacquesm 61d ago
Fixed point gave rise to the old programmer's meme 'if you need floating point you don't understand your problem'. It's of course partially in jest, but there is a grain of truth in it as well.
[−] adampunk 61d ago
It is quite old, attributable to von Neumann and Goldstine in 1947. Goldstine later joked that if rescaling at every step was easy enough for Johnny, it ought to be easy for everyone else.

The gag here being that perhaps that isn’t the best dividing line for programming talent.

[−] jacquesm 61d ago
That's hilarious.

It's like slicing off the top 0.0001% of Mt. Everest and saying that you have evenly split the world.

[−] adampunk 61d ago
It gets WORSE. Here's a quote from “The Birth of a Computer” in BYTE Magazine, February 1985, an interview with J.H. Wilkinson, noted numerical slouch, on the Manchester machines ca 1949 (p. 178):

>They were fixed point, but one of the earliest things that I did (at Turing’s request) was to program a set of subroutines for doing floating-point arithmetic.

So we ought to scale by hand to better ourselves through self-study; meanwhile, one of the first errands TURING sent WILKINSON on was to rid themselves of this very duty. ;)

[−] jacquesm 60d ago
The 'numerical slouch' bit had me in stitches.

It's interesting how many of these things we take for granted.

I'm working (and have been for a while) on something that requires both ridiculous precision and speed on a relatively puny power budget, and it's been a really nice trip down memory lane regarding optimization. I discovered fixed point pretty early in my programming career when doing 3D graphics on the 6502. I never imagined that that knowledge would come in handy almost five decades later, but here we are.

[−] KeplerBoy 61d ago
Also heavily used in FPGA based DSP.
[−] fusionadvocate 61d ago
"Nothing brings fear to my heart more than a floating point number." Gerald Jay Sussman.
[−] nxobject 61d ago
Not-so-coincidentally, Scheme specifies an exact rational number type. [0] This does not simplify things. [1]

[0] https://www-sop.inria.fr/indes/fp/Bigloo/doc/r5rs-9.html#Num...

[1] https://www.deinprogramm.de/sperber/papers/numerical-tower.p...
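Python's fractions module offers a comparable exact rational type, which sidesteps the binary-rounding issue (at the cost of unbounded denominators, the very complication the linked paper discusses):

```python
from fractions import Fraction

# Exact rational arithmetic: 1/10 + 1/10 + 1/10 really is 3/10.
total = Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10)
```

Compare with the binary floats below, where the same sum misses 0.3 by one ulp.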

[−] emil-lp 61d ago

    > 0.1 + 0.1 + 0.1 == 0.3
    
    False
I always tell my students that if they (might) have a float, and are using the == operator, they're doing something wrong.
[−] zokier 61d ago
That has more to do with decimal <-> binary conversion than with the arithmetic or the comparison itself. Using hex literals makes it clearer:

     0x1.999999999999ap-4 ("0.1")
    +0x1.999999999999ap-4 ("0.1")
    ---------------------
    =0x3.3333333333334p-4 ("0.2")
    +0x1.999999999999ap-4 ("0.1")
    ---------------------
    =0x4.cccccccccccd0p-4 ("0.30000000000000004")
    !=0x4.cccccccccccccp-4 ("0.3")
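The same expansions can be checked with Python's `float.hex()`. Python normalizes to a leading hex digit of 1, so the digits are shifted relative to the table above, but the values (and the mismatch in the last place) are the same:

```python
# Hex representations of the operands and results above.
a = (0.1).hex()              # '0x1.999999999999ap-4'
b = (0.1 + 0.1 + 0.1).hex()  # '0x1.3333333333334p-2'  (0.30000000000000004)
c = (0.3).hex()              # '0x1.3333333333333p-2'  (one ulp below b)
```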
[−] jacquesm 61d ago
Absolutely nobody will think this is 'clearer'. This is a leaky abstraction, and personally I think the OP is right: == in combination with floating-point constants should be limited to '0', and that's it.
[−] pdpi 61d ago
We all know that 1/3 + 1/3 + 1/3 = 1, but 0.33 + 0.33 + 0.33 = 0.99. We're sufficiently used to decimal to know that 1/3 doesn't have a finite decimal representation. Decimal 1/10 doesn't have a finite binary representation, for the exact same reason that 1/3 doesn't have one in decimal — 3 is co-prime with 10, and 5 is co-prime with 2.

The only leaky abstraction here is our bias towards decimal. (Fun fact: "base 10" is meaningless, because every base calls itself base 10)

[−] 1718627440 61d ago

> Fun fact: "base 10" is meaningless, because every base calls itself base 10

Maybe we should name the bases by the largest digit they have, so that we are using base 9 most of the time.

[−] brandmeyer 61d ago
Repeating the exercise with something that is exactly representable in floating point like 1/8 instead of 1/10 highlights the difference.
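Concretely, since 1/8 = 2^-3 has an exact binary representation, the analogous sum holds while the 1/10 version does not:

```python
# 1/8 is a power of two, so repeated addition stays exact at every step.
exact = (0.125 + 0.125 + 0.125) == 0.375     # True
# 1/10 has no finite binary expansion, so each term is already rounded.
inexact = (0.1 + 0.1 + 0.1) == 0.3           # False
```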
[−] brandmeyer 61d ago

> they (might) have a float, and are using the == operator, they're doing something wrong.

Storage, retrieval, transmission, and serialization/deserialization systems should be able to transmit and round-trip floats without losing any bits at all.
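A sketch of the round-trip point: serializing a double to its raw bytes and back is lossless, so an exact == check is precisely the right test there.

```python
import struct

x = 0.1                               # not a "nice" value, but a perfectly good double
packed = struct.pack("<d", x)         # serialize: 8 bytes, little-endian IEEE double
y, = struct.unpack("<d", packed)      # deserialize
roundtrip_ok = (y == x)               # True: every bit survived the trip
```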

[−] materialpoint 61d ago
Well, there are many legitimate cases for using the equality operator. Insisting someone is doing something wrong is downright wrong; if you do, you shouldn't be teaching floating-point numbers. A few use cases: floating-point values differing from default or initial values and carrying meaning, e.g. 0 or 1 translating to omitting entire operations; or measuring the tiniest possible variation, when relative tolerances are not what you want. Not exhaustive. If you use == with fp, it only means you should have thought about it thoroughly.
[−] magicalhippo 61d ago
I also like how a / b can result in infinity even if both a and b are strictly non-zero[1]. So be careful rewriting floating-point expressions.

[1]: https://www.cs.uaf.edu/2011/fall/cs301/lecture/11_09_weird_f... (division result matrix)
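A minimal illustration of that overflow case: both operands are finite and strictly non-zero, yet the quotient exceeds the double range and comes back as infinity.

```python
import math

q = 1e308 / 1e-308            # exact result ~1e616, far beyond the double max (~1.8e308)
overflowed = math.isinf(q)    # True: the division overflowed to +inf
```

This is why rewriting `a / b > c` as `a > c * b` (or similar algebraic shuffles) can change a program's behavior: one form may overflow where the other does not.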

[−] SideQuark 61d ago
There are plenty of cases where '==' is correct. If you understand how floating-point numbers work at the same depth you understand integers, then you may know the result of each side and know there is zero error.

Anything doing "approximately close" comparisons is much slower and prone to even more subtle bugs (often trading fewer immediate bugs for much harder-to-find-and-fix bugs).

For example, I routinely make unit tests with inputs designed so answers are perfectly representable, so tests do bit exact compares, to ensure algorithms work as designed.

I’d rather teach students there’s subtlety here with some tradeoffs.
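In that spirit, a toy test where every input and every intermediate value is a sum of powers of two, so the bit-exact compare is safe; `mean` is a stand-in for whatever algorithm is under test, not a reference to any real test suite.

```python
def mean(xs):
    # Naive mean; with these inputs every partial sum and the final
    # division are exactly representable, so == is a valid check.
    return sum(xs) / len(xs)

assert mean([0.5, 1.5, 2.5, 3.5]) == 2.0    # exact at every step
```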

[−] kccqzy 61d ago
I have a relaxed rule for myself: if I’m using the == operator on floats, I must write a comment explaining why. I end up using it maybe once a year.
[−] jmalicki 61d ago
.125 + .375 == .5

You should be using == for floats when they're actually equal. 0.1 just isn't an actual float.

[−] emil-lp 61d ago
Are you saying that my students should memorize which numbers are actual floats and which are not?

    > 1.25 * 0.1
    
    0.1250000000000000069388939039
[−] jmalicki 61d ago

> Are you saying that my students should memorize which numbers are actual floats and which are not?

Yes.

[−] stephencanon 61d ago
Your students should be able to figure out if a computation is exact or not, because they should understand binary representation of numbers.
[−] jmalicki 61d ago
(Another small note: multiplying by 0.1 is not the same as dividing by 10.0, because 0.1 is not representable.)

1.25 = 2^0 + 2^-2, so is representable.

0.125 = 2^-3, so is representable

1.25 / 10.0 = 0.125 so is representable. 10.0 = 2^3 + 2^1.

1.25 * 0.1 is not representable, because 0.1 is not representable, and those low order bits show up in the multiplication

[−] SideQuark 61d ago
If they were taught what was representable and why they’d learn it quickly. And those that forget details later know to chase it down again if they need it. Making it voodoo hides that it’s learnable, deterministic, and useful to understand.
[−] dehrmann 61d ago
Tell them that they can only store integer powers of 2 and their sums exactly. 2^0 == 1. 2^-2 == .25. Then say it's the same with base 10: 10^-1 == 0.1, but 1/9 isn't a finite sum of powers of 10, so you can't have an exact representation.
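That rule of thumb checks out directly:

```python
# Sums of powers of two are stored exactly, so these hold bit-for-bit:
a_ok = (0.5 + 0.25 == 0.75)        # 2^-1 + 2^-2, every value exact
b_ok = (1.0 + 0.0625 == 1.0625)    # 2^0 + 2^-4, every value exact

# 1/10, like 1/9 in decimal, repeats forever and has to be rounded:
c_differs = (0.1 + 0.2 != 0.3)
```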
[−] kccqzy 61d ago
They shouldn’t “memorize” this per se, but it should take them only a few seconds to work out in their head.
[−] fransje26 61d ago
I would argue that

    double m_D{}; [...]

    if (m_D == 0) somethingNeedsInstantiation();
can avoid having to carry around, set and check some extra m_HasValueBeenSet booleans.

Of course, it might not be something you want to overload beginner programmers with.

[−] alkonaut 61d ago
I have a linter in my code that shouts at me if I use exact equality for floats.

But I regret not making an exception for the constant zero, because it's one of the cases where you probably should accept it. I.e. if (f != 0.0) {...}

[−] vikingerik 61d ago
Zero shouldn't be an exception there. If f had been set from something like f = a - b, then you're in the same situation where f might be almost but not exactly zero.

The linter wouldn't know where f came from, so it should flag all floating point equality cases, and have some way that you can annotate it for "yeah this one is okay."

[−] alkonaut 60d ago
The thing is that if (f == 0.0) means "is f exactly zero, so it's not initialized" 99 times for every one time it means "is f zero-ish because of a cancellation/degeneracy/whatever".

I just found that I have now annotated it for "yeah this one is ok" about 100 times, and caught zero cases where I meant to do a comparison to zero-or-very-nearly-so but accidentally wrote == 0.0.

So my conclusion is: I would have had less noise in my code with that exception in the linter, and the linter would have been equally useful.

[−] 1718627440 61d ago
The idea is not to do it with values derived from arithmetic, but e.g. from measurements where a real zero is very unlikely and indicates something different.
[−] vikingerik 61d ago
Right, but that's not something a linter could know.