This made me realize that trigonometric functions are not deterministic across different CPU architectures, OSes, and programming languages (floating point precision aside).
Rounding transcendentals correctly has unknown time and space complexity [1]. It sort of brushes up against the halting problem. With limited precision the upper limit becomes calculable, but it's rather large: packages that offer correct rounding on 64-bit floating point use potentially hundreds of bytes to deal with a single floating point value. Dedicated circuitry to implement it fast would be big and complicated even by today's standards.
True in general, almost false for binary64. The important univariate functions have been exhaustively searched for hard-to-round cases, with the result that at most a triple-double format is needed to round them correctly. (Bivariate functions like `pow` are much harder, and not yet fully mapped as of this writing.) As a result we now have a mathematical library that almost ensures correct rounding [1], and a further optimization is currently in progress.
True in general, but for the basic datatypes sent through hardware registers your processor architecture has fixed precision, so the time and space complexity is O(1).
Yeah, but you'd be surprised at how frequently they appear to be the same. I once worked on an HTML5 game that relied on deterministic simulation for networking. And it wasn't until pretty late in development that a build of Chrome shipped on some platform that finally triggered a desync in our extensive cross-platform test suite.
We implemented a deterministic approximation, and moved on. But I learned something important about trig functions that day.
Well, what's fun is that (AFAIK) trigonometric functions tend not to be implemented in the newer floating point instruction sets, such as SSE or AVX.
So while what you say is true about the x87 implementation of those functions, for anything targeting a machine built in the last 20 years it's likely the code will run consistently regardless of the architecture (barring architecture floating point bugs, which aren't terribly uncommon in the less significant bits, and when overclocking comes into play).
x86 compilers won't use x87 instructions when SSE2 and later are available. x87 is just a really weird and funky instruction set that's best left in the gutter of history.
Sadly, even SSE vs. AVX is often enough to give different results, as SSE doesn't support the fused multiply-add instructions, which compute a*b + c with a single, guaranteed-correct rounding. FMA would let all CPUs from 2013 and later agree, but gcc/clang don't enable AVX by default for x86-64 targets. And even if they did, results are only guaranteed identical if implementations have chosen the exact same polynomial approximation method and no compiler optimizations alter the instruction sequence.
Unfortunately, floating point results will probably continue to differ across platforms for the foreseeable future.
Barring a runtime "is AVX available" check inside the code, binaries are generally compiled targeting either SSE or AVX, not both. You can reasonably expect that the same binary thrown against multiple architectures will have the same output.
This, of course, doesn't apply if we are talking about a JIT. All bets are off if you are talking about javascript or the JVM.
That is to say, you can expect that a C++ binary blob from the Ubuntu repo is going to get the same numbers regardless of the machine, since they generally target fairly old architectures.
> -ffp-contract=fast enables floating-point expression contraction such as forming of fused multiply-add operations if the target has native support for them
> The default is -ffp-contract=off for C in a standards compliant mode (-std=c11 or similar), -ffp-contract=fast otherwise.
I would have expected that to be a bug in the documentation. Why would they turn FMA off for standards-compliant C mode, but not for standards-compliant C++ mode?
It defaults to off for standards-compliant mode, which in my mind was the default mode, as that's what we use everywhere I have worked in the last 15 years. But of course that's not the case.
In any case, according to the sibling comment, the default is 'fast' even in std-compliant mode in C++, which I find very surprising. I'm not very familiar with that corner of the standard, but it must be looser than the equivalent wording in the C standard.
> x87 is just a really weird and funky instruction set that's best left in the gutter of history
hmmm, can you use the long doubles in sse or avx? They are glorious, and as far as I see from playing with godbolt, they still require dirtying your hands with the x87 stack.
The 80-bit float? Not as far as I'm aware. However, it's fairly trivial to get extended precision by representing a value as a pair of 64-bit floats (double-double arithmetic, good for roughly 106 significand bits). And with the nature of AVX/SSE, you don't really take much of a performance hit for doing that, as you are often operating on both halves of the pair with the same instruction.
Floats follow a clear specification which determines precisely how basic arithmetic should work. They should work the same on all popular modern platforms. (Whether specific software libraries are the same is a separate question.)
If you implement sine in software using the same sequence of basic arithmetic instructions, the result should be the same across platforms. If you make two different implementations using different arithmetic, then of course you can't rely on them being the same.
Point being that IEEE 754 defines two sets of operations: the required operations (section 5), which must produce correctly rounded results to the last digit, and the recommended operations (section 9), with relaxed requirements. Sine belongs to the latter section, so IEEE 754 does not mandate reproducible results for it.
My understanding is that most software always uses some software implementation of sine, rather than calling a hardware instruction. Which is definitely what you should do if you care about getting the exact same results across platforms.
Software implementations can and do differ (even dynamically) based on the hardware, though. For example, glibc's sin(x), which is what C code ends up calling (as do other languages relying on the C stdlib), uses FMA instructions on my CPU. So the exact same binary on the exact same OS with the exact same glibc should behave differently on a very old CPU without FMA, where it falls back to a different implementation (code using FMA generally can't be ported exactly to hardware without it, short of a gigantic and unacceptable drop in performance).
Yeah, the lack of FMA in some contexts is a serious bummer. It would be great if every popular CPU platform would figure out a way to get FMA implemented, and if programming languages would figure out better ways to help programmers use it explicitly without making their code too ugly.
At this point Intel, AMD, Arm, and RISC-V all do, and have for a while. The only one that is at all relevant that doesn't is Apple M-series under Rosetta.
This is not always a safe assumption (in certain scenarios floating point results being nondeterministic has the possibility to introduce bugs and security issues) and is also a kind of sad way to look at the world. The response to "I don't understand how this works" should not be to adopt an incorrect viewpoint, but to know the limitations of your understanding.
It’s not that I don’t understand, it’s that I do. Floats are inherently lossy representations. Yes, this means the more operations you perform on a float input, the fuzzier the value is. You ignore that harsh reality at your peril. If you find engineering rigor sad, I don’t know what to tell you.
"Floats are not deterministic" is not engineering rigor, it's just wrong. They are specified precisely by IEEE-754 in how they must behave and which operations are allowed to produce which results.
IEEE 754 conforming floats conform to IEEE-754. If they actually conform. Low end devices with shitty software implementations often get the hard edge cases wrong.
> the more operations you perform on a float input, the fuzzier the value is
No, any float always precisely represents a specific number. The issue is that only a finite number of numbers are representable.
Some algorithms are poorly conditioned and when implemented using floating point arithmetic will lead to a result that is different than what you would get in idealized real number arithmetic. That doesn't make any floating point value "fuzzy".
> No, any float always precisely represents a specific number. The issue is that only a finite number of numbers are representable.
A float always precisely represents a specific number, but that number is not always precisely equal to the algebraic result of the operations performed (even ignoring transcendental and irrational functions). This should be obvious, since there are infinitely many rational numbers but only finitely many floating point numbers.
If you design your algorithms very carefully, you can keep the ratio of your algorithm's output to the algebraic result close to unity over a wide domain of inputs.
It depends. If you're constrained to one chip and one platform, you can characterize, or at least estimate, the floating point behaviour that matters in your application. In some applications, like embedded, that's actually totally fine, and modern embedded chips can often do floating point as fast as or faster than they can emulate the fixed point used to work around floating point's drawbacks. On one project I worked on, they originally wrote everything in fixed point out of fear that floating point would introduce some deleterious effect. But in the end they rewrote parts of the project using floating point, to no ill effect and with a great performance improvement. And there were features of the product they had to strike, because the rewrite needed to support them couldn't touch certain sensitive areas of the code that had been tested extensively over the 2 or 3 years of development. It would have been much better to evaluate the assumption that floats are bad early on in the project and make the decision based on real information. The heuristic they applied ended up costing part of the product that was strategically important.
Floats are well defined, and it is perfectly possible to reason about how algorithms based on them should behave. Few languages specify the accuracy of things like trig functions, so relying on them can be tricky, and JavaScript is particularly bad in that respect.
They're always deterministic in some sense (as long as your OS properly restores the rounding mode after a context switch). This might sound pedantic, but it determines how we think about floats: the behaviour is specified quite exactly.
Curiously, early Intel 386 processors had a bug where 32-bit multiplies were genuinely nondeterministic: some answers would depend on the voltage, frequency, temperature, and manufacturing conditions. The problem was essentially analog, a layout issue, where the signal didn't always have enough margin. (This is unrelated to the famous Pentium FDIV bug.) Until Intel got the problem fixed, they would stamp bad chips with "16 BIT S/W ONLY", while good chips were labeled "ΣΣ".
What I mean is that the same code running on different hardware/OS may not always give the same answer. It’ll be close, but you can’t always expect bit-for-bit identical results.
But why? We all know 0.1+0.2 won’t give 0.3 with floats, but at least we should expect a deterministic result for the same numbers, same operations, and same order, no?
I don't think that's safe at all. Catastrophic cancellation would be quite a lot less catastrophic if rounding errors were random but accurate on average.
Somewhat annoyingly, the ECMAScript standard only specifies that various math functions return an approximation but does not set any bounds on that approximation. So for many functions you could just return NaN and still be compliant.
E.g. I would assume that Math.sin(x) returns the same thing in NodeJS on Windows and Mac/M1, but it turns out that is not necessarily so. https://stackoverflow.com/questions/74074312/standard-math-f...