Fix math.round floating point bug #14757

kromka-chleba · 2024-06-16T00:32:14Z

Nothing to do

This PR is Ready for Review.

How to test

Try these:

math.round(0.49999999999999994)
math.round(-0.49999999999999994)

kromka-chleba · 2024-06-16T01:04:20Z

Did some tests:

Average for 10 runs of the below functions:

New: 74.313 ms
New: 75.332 ms
New: 70.944 ms

Old: 43.028 ms
Old: 52.986 ms
Old: 45.652 ms

The new one is unfortunately slightly slower, but better than the previous, buggy one?

Code:


local a = {}

for i = 1, 100000 do
    table.insert(a, math.random() + math.random(-100000000, 100000000))
end

local function test_old()
    local b = {}
    local t1 = minetest.get_us_time()
    for _, nr in pairs(a) do
        table.insert(b, old_round(nr))
    end
    return (minetest.get_us_time() - t1) / 1000
end

local function test_new()
    local c = {}
    local t1 = minetest.get_us_time()
    for _, nr in pairs(a) do
        table.insert(c, new_round(nr))
    end
    return (minetest.get_us_time() - t1) / 1000
end

local function test_avg(f, name)
    local sum = 0
    for i = 1, 10 do
        sum = sum + f()
    end
    minetest.log("error", name..": "..sum.." ms")
end

test_avg(test_old, "Old")
test_avg(test_new, "New")

grorp · 2024-06-16T10:27:41Z

This is a perfect candidate for unit testing.

https://github.com/minetest/minetest/blob/master/builtin/common/tests/misc_helpers_spec.lua

kromka-chleba · 2024-06-16T12:06:25Z

This is a perfect candidate for unit testing.

By this you mean the test code I wrote or something different?
All that code does is testing speed of the old and the new math.random, it doesn't check correctness.
It could be redone to test correctness too by, for example, splitting the input number into integer part and fractional part and checking if the result makes sense.

grorp · 2024-06-16T12:22:49Z

By this I mean math.round. You choose some cases, including the problematic case from the issue, and assert that math.round gives the correct result for each of them.

appgurueu · 2024-06-16T14:21:03Z

One "dumb" way to test correctness is to test against a naive rounding based on stringification (note: you need to stringify to IIRC 17 significant digits).

Then you can plug in edge cases as well as random numbers.

I agree that at least a simple unit test for the previous incorrect behavior should be added - if only to prevent a regression in the future, where someone thinks this can be simplified and optimized by using math.floor or math.ceil.

I'm fairly certain that this implementation is correct however: I believe math.modf does not lose precision. So this is an exact decomposition. The comparisons being done are also exact.

It does not suffer from the issue of the previous implementation (losing the least significant bit to floating point rounding when doing + 0.5). The only case where this implementation could "go wrong" would be the + 1 / - 1. However, this can only fail if we're at an integer with maximum exactly representable absolute value - 2^53 or -2^53. In these cases, there is no rounding to be done however: There can be no fractional places.

As for performance: I tested with the following variation of your code:

local function old_round(x)
	if x >= 0 then
		return math.floor(x + 0.5)
	end
	return math.ceil(x - 0.5)
end

local function new_round(x)
	local int, frac = math.modf(x)
	if frac >= 0.5 then
		return int + 1
	end
	if frac <= -0.5 then
		return int - 1
	end
	return int
end


local a = {}

for _ = 1, 100000 do
    table.insert(a, math.random() + math.random(-100000000, 100000000))
end

local function test_old()
    local t1 = os.clock()
    local sum = 0
    for i = 1, #a do
        sum = sum + old_round(a[i])
    end
    assert(sum ~= 123456)
    return (os.clock() - t1) * 1000
end

local function test_new()
    local t1 = os.clock()
    local sum = 0
    for i = 1, #a do
        sum = sum + new_round(a[i])
    end
    assert(sum ~= 123456)
    return (os.clock() - t1) * 1000
end

local function test_avg(f, name)
    local sum = 0
    for _ = 1, 100 do
        sum = sum + f()
    end
    print(name..": "..sum.." ms")
end

test_avg(test_old, "Old")
test_avg(test_new, "New")

This differs from your benchmark in that it aims to reduce the benchmarking overhead by using a numeric for loop and simply summing the rounded values, which should be cheaper than putting them in a table. It also uses os.clock() because we want to measure CPU time spent.

I see no need to test this in Minetest; it can be directly tested on the Lua 5.1 / LuaJIT interpreters.

On Lua 5.1, there is no notable difference on my machine.

On LuaJIT however, the latter is ~3x slower. This doesn't improve if the number of iterations is increased, suggesting this is not an unfortunate GC cycle or JIT kicking in at an unfortunate time.

appgurueu · 2024-06-16T15:27:31Z

I experimented a bit, and I think I've found a way to implement this which has comparable performance, and even performs better on PUC Lua 5.1:

function math.round(x)
	local frac = x % 1
	local int = x - frac
	return int + ((frac >= 0.5) and 1 or 0)
end

Note that since e.g. -0.7 % 1 is 0.3 in Lua, our logic can be simplified a bit. We also avoid function calls, using only operators.

This handles +/- 0.49999999999999994 correctly. It also seems to work for random numbers.

Modulo is defined as a - math.floor(a/b)*b. In this case, that boils down to x - math.floor(x).

I am not yet certain that this implementation handles all edge cases correctly, but it is promising.

We could also write this as

function math.round(x)
	local int = math.floor(x)
	local frac = x - int
	return int + ((frac >= 0.5) and 1 or 0)
end

This loses the performance edge on PUC however, due to the math.floor call. Localizing that helps, but using the operator still wins. On LuaJIT this still works well.

kromka-chleba · 2024-06-16T16:00:05Z

Tried your math.random using the % operator and its behavior is slightly different from Minetest's previous implementation.
This happens because you removed the if frac <= -0.5 then condition.
I think we should keep the old behavior just without the bug.

Your new one:

> math.round(-10.5)
-10.0

Old:

> math.round(-10.5)
-11

My new:

> math.round(-10.5)
-11

Edit: I think the results should be symmetrical relative to 0, otherwise we get different results for negative vs positive numbers. Think about flipping/rotating a vector. You can get different results depending on whether you first negate and then round, or first round then negate. These produce different results and could introduce reproducibility bugs.

appgurueu · 2024-06-16T17:46:10Z

I think we should keep the old behavior just without the bug.

I agree, thanks for pointing out this edge case. Here's one fixed version:

function math.round(x)
	if x < 0 then
		local int = math.ceil(x)
		local frac = x - int
		return int - ((frac <= -0.5) and 1 or 0)
	end
	local int = math.floor(x)
	local frac = x - int
	return int + ((frac >= 0.5) and 1 or 0)
end

Unfortunately, this loses some performance on JIT. It still seems to perform better than the modf variant however:

$ luajit round.lua
Old: 119.958 ms
New (modf): 266.447 ms
New: 179.29 ms

On PUC however, it's a bit slower:

$ lua round.lua
Old: 1165.759 ms
New (modf): 1257.703 ms
New: 1510.348 ms

sfan5 · 2024-06-16T18:12:38Z

I wanted to note two things:

We shouldn't spend too much time worrying about the performance on PUC Lua.
Since upstream Lua is not supported we could actually patch correct rounding into PUC Lua in the most optimal way.

kromka-chleba · 2024-06-16T23:14:07Z

Updated with appgurueu's version. I'll write a unit test another day.
So far I can tell it works like my slower version and the old implementation (just without the FP bug).

fixes minetest#14755

kromka-chleba · 2024-06-19T19:51:53Z

Added a bare bones unit test. Haven't tested it because was too lazy to install that busted thing.
I guess the CI is testing this now?

Otherwise I think this is ready to go.

HybridDog · 2024-06-19T20:20:33Z

I think this function shown at stackoverflow should also work for positive n:

function round(n)
  return math.floor((math.floor(n*2) + 1)/2)
end

It performs the following operations if I understand it correctly:

n * 2: This multiplication by two increases the exponent of the floating-point number by one, thus the operation is exact.
math.floor(x): This rounds down x to the closest integer and is an exact operation unless the exponent of x is very big.
x + 1: Here x is an integer encoded as float. This addition of one is an exact operation unless the exponent is very big since adding a power of two to a floating-point number is exact if least significant bits of the mantissa are zero.
x/2: This division by two decreases the exponent of the floating-point number by one, thus the operation is exact.

$ luajit -e 'x=0.49999999999999994 print(("x: %.50f\nx*2: %.50f\nmath.floor(x*2): %.50f\nx+0.5: %.50f"):format(x, x*2, math.floor(x*2), x+0.5))'
x: 0.49999999999999994448884876874217297881841659545898
x*2: 0.99999999999999988897769753748434595763683319091797
math.floor(x*2): 0.00000000000000000000000000000000000000000000000000
x+0.5: 1.00000000000000000000000000000000000000000000000000

kromka-chleba · 2024-06-19T20:32:33Z

I think this function shown at stackoverflow should also work:
 function round(n)
   return math.floor((math.floor(n*2) + 1)/2)
 end

That would do, but

it doesn't work like the previous one
it's not symmetrical around 0
I'm too lazy to work on this issue any further

> round(10.5)
11
> round(-10.5)
-10

grorp

Unittest passes and matches the old math.round behavior expect for the last two asserts.

Desour · 2024-06-20T19:05:38Z

IIRC, out math.round was designed to do exactly the same rounding as if you'd pass e.g. an unrounded vector to core.get_node (see #6774). So it should behave like out C++ doubleToInt function (with d = 1.0):

inline v3s16 doubleToInt(v3d p, double d)
{
	return v3s16(
		(p.X + (p.X > 0 ? d / 2 : -d / 2)) / d,
		(p.Y + (p.Y > 0 ? d / 2 : -d / 2)) / d,
		(p.Z + (p.Z > 0 ? d / 2 : -d / 2)) / d);
}

kromka-chleba · 2024-06-20T21:08:49Z

IIRC, out math.round was designed to do exactly the same rounding as if you'd pass e.g. an unrounded vector to core.get_node (see #6774). So it should behave like out C++ doubleToInt function (with d = 1.0):

Does that include the buggy rounding of 0.49999999999999994 to 1 ?
Also does this mean I trash this PR or maybe instead the engine needs a bugfix?

Edit: I'm too dumb to do the C++ thing

Desour · 2024-06-21T07:42:21Z

Does that include the buggy rounding of 0.49999999999999994 to 1 ?

Looks like it. ¯\_(ツ)_/¯

kromka-chleba · 2024-06-21T10:26:14Z

Well, if that's what your core.get_node rounding does then it has a bug.
xyz: -0.49999999999999994 is still in the (0,0,0) node, and not as the rounding implies in (-1, -1, -1).
Unless we assume that -0.49999999999999994 = -0.5 = -1, which is something I can accept after spending so much time on this trivial issue.
IMO the assumption that math.round should work like the buggy counterpart on the C++ side is wrong, because it isn't only used for node position rounding. So either fix the C++ one, or accept that the Lua one will have the proper behavior from now on.
And if someone cares enough, they can maintain their own buggy copy of math.round in their projects.

appgurueu · 2024-06-21T10:54:28Z

Edit: I'm too dumb to do the C++ thing

A quick grep -r doubleToInt tells me that it's only used with d = 1.0 as Desour said:

script/common/c_converter.cpp:	return doubleToInt(pf, 1.0);
script/common/c_converter.cpp:	return doubleToInt(pf, 1.0);
script/lua_api/l_env.cpp:	v3s16 pos = doubleToInt(v3d(x, y, z), 1.0);
util/numeric.h:inline v3s16 doubleToInt(v3d p, double d)

so we can basically get rid of d. After that, shouldn't it just be std::round? It seems this rounds the way we want it to:

rounding halfway cases away from zero

So we'd get

inline v3s16 doubleToInt(v3d p)
{
	return {std::round(p.X), std::round(p.Y), std::round(p.Z)};
}

HybridDog · 2024-06-21T16:42:59Z

Using math.floor(x - 0.5) + 1 and math.ceil(x + 0.5) - 1 instead of math.floor(x + 0.5) and math.ceil(x - 0.5) could be another way to fix the problem.
I think for positive x, x + 0.5 is always absolutely bigger than x whereas, if x >= 0.25, x - 0.5 is always smaller than x. If the result of an addition or subtraction of 0.5 is absolutely bigger than the input, the least significant bits of the mantissa of the input number can be lost, which should not be the case if the result is absolutely smaller.

kromka-chleba · 2024-06-21T17:08:01Z

HybridDog, appgurueu's math.random works fine. Is there any advantage of using what you just proposed?
And I'm almost sure that doing subtraction and addition of 0.5 to 0.49999999999999994 will result in the very same bug. If you care enough you can use my unit test and check the correctness and do some benchmarks.
The only thing left to do is fixing the other C++ function, though I believe that's out of scope of this PR?

TubberPupper · 2024-06-21T18:39:34Z

if x < 0 then
  local int = math.ceil(x)
  local frac = x - int
  return int - ((frac <= -0.5) and 1 or 0)
end
local int = math.floor(x)
local frac = x - int
return int + ((frac >= 0.5) and 1 or 0)

To shave more lines, why not remove the if statement in favour of:

local int = x < 0 and math.ceil(x) or math.floor(x)
local frac = x - int
return x < 0 and int - ((frac <= -0.5) and 1 or 0) or int + ((frac >= 0.5) and 1 or 0)

A bit more complicated to read, but should be completely functional and the behaviour nonetheless the same. I haven't tested the code myself yet though

Is the if statement intentional? Seems weird to define frac and int twice in one function

appgurueu · 2024-06-21T18:53:48Z

It is possible to "shorten" this, but that's not the point. I want this to be somewhat readable and performant. I think the current form achieves this well enough.

Is the if statement intentional? Seems weird to define frac and int twice in one function

These definitions do not conflict since they are in different scopes. If you prefer it, you could write it using else instead, but I personally don't like redundant branching logic (else after a return in this case).

appgurueu

I agree that fixing the C++ doesn't have to happen in this PR.

(It's also potentially a bit trickier / more work than anticipated: doubleToInt isn't the only relevant function, there's also floatToInt. And there the d parameter is unfortunately used due to our BS BS.)

grorp added Bugfix 🐛 PRs that fix a bug @ Builtin labels Jun 16, 2024

kromka-chleba force-pushed the math-round-fix branch from eb1654b to 390185b Compare June 16, 2024 23:10

kromka-chleba added 2 commits June 18, 2024 00:55

Fix math.round floating point bug

7d833d3

fixes minetest#14755

Add unit tests for math.round

d693a07

kromka-chleba force-pushed the math-round-fix branch from 390185b to d693a07 Compare June 19, 2024 19:50

grorp approved these changes Jun 20, 2024

View reviewed changes

grorp added the One approval ✅ ◻️ label Jun 20, 2024

appgurueu added >= Two approvals ✅ ✅ and removed One approval ✅ ◻️ labels Jun 22, 2024

appgurueu approved these changes Jun 22, 2024

View reviewed changes

rubenwardy merged commit 2885784 into minetest:master Jun 24, 2024
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix math.round floating point bug #14757

Fix math.round floating point bug #14757

kromka-chleba commented Jun 16, 2024

kromka-chleba commented Jun 16, 2024 •

edited

Loading

grorp commented Jun 16, 2024

kromka-chleba commented Jun 16, 2024

grorp commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

kromka-chleba commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

sfan5 commented Jun 16, 2024

kromka-chleba commented Jun 16, 2024

kromka-chleba commented Jun 19, 2024

HybridDog commented Jun 19, 2024 •

edited

Loading

kromka-chleba commented Jun 19, 2024

grorp left a comment

Desour commented Jun 20, 2024

kromka-chleba commented Jun 20, 2024 •

edited

Loading

Desour commented Jun 21, 2024

kromka-chleba commented Jun 21, 2024

appgurueu commented Jun 21, 2024

HybridDog commented Jun 21, 2024

kromka-chleba commented Jun 21, 2024

TubberPupper commented Jun 21, 2024 •

edited

Loading

appgurueu commented Jun 21, 2024

appgurueu left a comment

Fix math.round floating point bug #14757

Fix math.round floating point bug #14757

Conversation

kromka-chleba commented Jun 16, 2024

Nothing to do

How to test

kromka-chleba commented Jun 16, 2024 • edited Loading

grorp commented Jun 16, 2024

kromka-chleba commented Jun 16, 2024

grorp commented Jun 16, 2024 • edited Loading

appgurueu commented Jun 16, 2024 • edited Loading

appgurueu commented Jun 16, 2024 • edited Loading

kromka-chleba commented Jun 16, 2024 • edited Loading

appgurueu commented Jun 16, 2024 • edited Loading

sfan5 commented Jun 16, 2024

kromka-chleba commented Jun 16, 2024

kromka-chleba commented Jun 19, 2024

HybridDog commented Jun 19, 2024 • edited Loading

kromka-chleba commented Jun 19, 2024

grorp left a comment

Choose a reason for hiding this comment

Desour commented Jun 20, 2024

kromka-chleba commented Jun 20, 2024 • edited Loading

Desour commented Jun 21, 2024

kromka-chleba commented Jun 21, 2024

appgurueu commented Jun 21, 2024

HybridDog commented Jun 21, 2024

kromka-chleba commented Jun 21, 2024

TubberPupper commented Jun 21, 2024 • edited Loading

appgurueu commented Jun 21, 2024

appgurueu left a comment

Choose a reason for hiding this comment

kromka-chleba commented Jun 16, 2024 •

edited

Loading

grorp commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

kromka-chleba commented Jun 16, 2024 •

edited

Loading

appgurueu commented Jun 16, 2024 •

edited

Loading

HybridDog commented Jun 19, 2024 •

edited

Loading

kromka-chleba commented Jun 20, 2024 •

edited

Loading

TubberPupper commented Jun 21, 2024 •

edited

Loading