Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix math.round floating point bug #14757

Merged
merged 2 commits into from
Jun 24, 2024

Conversation

kromka-chleba
Copy link
Contributor

fixes #14755

Nothing to do

This PR is Ready for Review.

How to test

Try these:

math.round(0.49999999999999994)
math.round(-0.49999999999999994)

@kromka-chleba
Copy link
Contributor Author

kromka-chleba commented Jun 16, 2024

Did some tests:

Average for 10 runs of the below functions:

New: 74.313 ms
New: 75.332 ms
New: 70.944 ms

Old: 43.028 ms
Old: 52.986 ms
Old: 45.652 ms

The new one is unfortunately slightly slower, but better than the previous, buggy one?

Code:


local a = {}

for i = 1, 100000 do
    table.insert(a, math.random() + math.random(-100000000, 100000000))
end

local function test_old()
    local b = {}
    local t1 = minetest.get_us_time()
    for _, nr in pairs(a) do
        table.insert(b, old_round(nr))
    end
    return (minetest.get_us_time() - t1) / 1000
end

local function test_new()
    local c = {}
    local t1 = minetest.get_us_time()
    for _, nr in pairs(a) do
        table.insert(c, new_round(nr))
    end
    return (minetest.get_us_time() - t1) / 1000
end

local function test_avg(f, name)
    local sum = 0
    for i = 1, 10 do
        sum = sum + f()
    end
    minetest.log("error", name..": "..sum.." ms")
end

test_avg(test_old, "Old")
test_avg(test_new, "New")

@grorp
Copy link
Member

grorp commented Jun 16, 2024

This is a perfect candidate for unit testing.

https://github.com/minetest/minetest/blob/master/builtin/common/tests/misc_helpers_spec.lua

@grorp grorp added Bugfix 🐛 PRs that fix a bug @ Builtin labels Jun 16, 2024
@kromka-chleba
Copy link
Contributor Author

This is a perfect candidate for unit testing.

By this you mean the test code I wrote or something different?
All that code does is testing speed of the old and the new math.random, it doesn't check correctness.
It could be redone to test correctness too by, for example, splitting the input number into integer part and fractional part and checking if the result makes sense.

@grorp
Copy link
Member

grorp commented Jun 16, 2024

By this I mean math.round. You choose some cases, including the problematic case from the issue, and assert that math.round gives the correct result for each of them.

@appgurueu
Copy link
Contributor

appgurueu commented Jun 16, 2024

One "dumb" way to test correctness is to test against a naive rounding based on stringification (note: you need to stringify to IIRC 17 significant digits).

Then you can plug in edge cases as well as random numbers.

I agree that at least a simple unit test for the previous incorrect behavior should be added - if only to prevent a regression in the future, where someone thinks this can be simplified and optimized by using math.floor or math.ceil.

I'm fairly certain that this implementation is correct however: I believe math.modf does not lose precision. So this is an exact decomposition. The comparisons being done are also exact.

It does not suffer from the issue of the previous implementation (losing the least significant bit to floating point rounding when doing + 0.5). The only case where this implementation could "go wrong" would be the + 1 / - 1. However, this can only fail if we're at an integer with maximum exactly representable absolute value - 2^53 or -2^53. In these cases, there is no rounding to be done however: There can be no fractional places.


As for performance: I tested with the following variation of your code:

local function old_round(x)
	if x >= 0 then
		return math.floor(x + 0.5)
	end
	return math.ceil(x - 0.5)
end

local function new_round(x)
	local int, frac = math.modf(x)
	if frac >= 0.5 then
		return int + 1
	end
	if frac <= -0.5 then
		return int - 1
	end
	return int
end


local a = {}

for _ = 1, 100000 do
    table.insert(a, math.random() + math.random(-100000000, 100000000))
end

local function test_old()
    local t1 = os.clock()
    local sum = 0
    for i = 1, #a do
        sum = sum + old_round(a[i])
    end
    assert(sum ~= 123456)
    return (os.clock() - t1) * 1000
end

local function test_new()
    local t1 = os.clock()
    local sum = 0
    for i = 1, #a do
        sum = sum + new_round(a[i])
    end
    assert(sum ~= 123456)
    return (os.clock() - t1) * 1000
end

local function test_avg(f, name)
    local sum = 0
    for _ = 1, 100 do
        sum = sum + f()
    end
    print(name..": "..sum.." ms")
end

test_avg(test_old, "Old")
test_avg(test_new, "New")

This differs from your benchmark in that it aims to reduce the benchmarking overhead by using a numeric for loop and simply summing the rounded values, which should be cheaper than putting them in a table. It also uses os.clock() because we want to measure CPU time spent.

I see no need to test this in Minetest; it can be directly tested on the Lua 5.1 / LuaJIT interpreters.

On Lua 5.1, there is no notable difference on my machine.

On LuaJIT however, the latter is ~3x slower. This doesn't improve if the number of iterations is increased, suggesting this is not an unfortunate GC cycle or JIT kicking in at an unfortunate time.

@appgurueu
Copy link
Contributor

appgurueu commented Jun 16, 2024

I experimented a bit, and I think I've found a way to implement this which has comparable performance, and even performs better on PUC Lua 5.1:

function math.round(x)
	local frac = x % 1
	local int = x - frac
	return int + ((frac >= 0.5) and 1 or 0)
end

Note that since e.g. -0.7 % 1 is 0.3 in Lua, our logic can be simplified a bit. We also avoid function calls, using only operators.

This handles +/- 0.49999999999999994 correctly. It also seems to work for random numbers.

Modulo is defined as a - math.floor(a/b)*b. In this case, that boils down to x - math.floor(x).

I am not yet certain that this implementation handles all edge cases correctly, but it is promising.

We could also write this as

function math.round(x)
	local int = math.floor(x)
	local frac = x - int
	return int + ((frac >= 0.5) and 1 or 0)
end

This loses the performance edge on PUC however, due to the math.floor call. Localizing that helps, but using the operator still wins. On LuaJIT this still works well.

@kromka-chleba
Copy link
Contributor Author

kromka-chleba commented Jun 16, 2024

Tried your math.random using the % operator and its behavior is slightly different from Minetest's previous implementation.
This happens because you removed the if frac <= -0.5 then condition.
I think we should keep the old behavior just without the bug.

Your new one:

> math.round(-10.5)
-10.0

Old:

> math.round(-10.5)
-11

My new:

> math.round(-10.5)
-11

Edit: I think the results should be symmetrical relative to 0, otherwise we get different results for negative vs positive numbers. Think about flipping/rotating a vector. You can get different results depending on whether you first negate and then round, or first round then negate. These produce different results and could introduce reproducibility bugs.

@appgurueu
Copy link
Contributor

appgurueu commented Jun 16, 2024

I think we should keep the old behavior just without the bug.

I agree, thanks for pointing out this edge case. Here's one fixed version:

function math.round(x)
	if x < 0 then
		local int = math.ceil(x)
		local frac = x - int
		return int - ((frac <= -0.5) and 1 or 0)
	end
	local int = math.floor(x)
	local frac = x - int
	return int + ((frac >= 0.5) and 1 or 0)
end

Unfortunately, this loses some performance on JIT. It still seems to perform better than the modf variant however:

$ luajit round.lua
Old: 119.958 ms
New (modf): 266.447 ms
New: 179.29 ms

On PUC however, it's a bit slower:

$ lua round.lua
Old: 1165.759 ms
New (modf): 1257.703 ms
New: 1510.348 ms

@sfan5
Copy link
Member

sfan5 commented Jun 16, 2024

I wanted to note two things:

  • We shouldn't spend too much time worrying about the performance on PUC Lua.
  • Since upstream Lua is not supported we could actually patch correct rounding into PUC Lua in the most optimal way.

@kromka-chleba
Copy link
Contributor Author

Updated with appgurueu's version. I'll write a unit test another day.
So far I can tell it works like my slower version and the old implementation (just without the FP bug).

@kromka-chleba
Copy link
Contributor Author

Added a bare bones unit test. Haven't tested it because was too lazy to install that busted thing.
I guess the CI is testing this now?

Otherwise I think this is ready to go.

@HybridDog
Copy link
Contributor

HybridDog commented Jun 19, 2024

I think this function shown at stackoverflow should also work for positive n:

function round(n)
  return math.floor((math.floor(n*2) + 1)/2)
end

It performs the following operations if I understand it correctly:

  • n * 2: This multiplication by two increases the exponent of the floating-point number by one, thus the operation is exact.
  • math.floor(x): This rounds down x to the closest integer and is an exact operation unless the exponent of x is very big.
  • x + 1: Here x is an integer encoded as float. This addition of one is an exact operation unless the exponent is very big since adding a power of two to a floating-point number is exact if least significant bits of the mantissa are zero.
  • x/2: This division by two decreases the exponent of the floating-point number by one, thus the operation is exact.

$ luajit -e 'x=0.49999999999999994 print(("x: %.50f\nx*2: %.50f\nmath.floor(x*2): %.50f\nx+0.5: %.50f"):format(x, x*2, math.floor(x*2), x+0.5))'
x: 0.49999999999999994448884876874217297881841659545898
x*2: 0.99999999999999988897769753748434595763683319091797
math.floor(x*2): 0.00000000000000000000000000000000000000000000000000
x+0.5: 1.00000000000000000000000000000000000000000000000000

@kromka-chleba
Copy link
Contributor Author

I think this function shown at stackoverflow should also work:

 function round(n)
   return math.floor((math.floor(n*2) + 1)/2)
 end

That would do, but

  1. it doesn't work like the previous one
  2. it's not symmetrical around 0
  3. I'm too lazy to work on this issue any further
> round(10.5)
11
> round(-10.5)
-10

Copy link
Member

@grorp grorp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unittest passes and matches the old math.round behavior expect for the last two asserts.

@Desour
Copy link
Member

Desour commented Jun 20, 2024

IIRC, out math.round was designed to do exactly the same rounding as if you'd pass e.g. an unrounded vector to core.get_node (see #6774). So it should behave like out C++ doubleToInt function (with d = 1.0):

inline v3s16 doubleToInt(v3d p, double d)
{
	return v3s16(
		(p.X + (p.X > 0 ? d / 2 : -d / 2)) / d,
		(p.Y + (p.Y > 0 ? d / 2 : -d / 2)) / d,
		(p.Z + (p.Z > 0 ? d / 2 : -d / 2)) / d);
}

@kromka-chleba
Copy link
Contributor Author

kromka-chleba commented Jun 20, 2024

IIRC, out math.round was designed to do exactly the same rounding as if you'd pass e.g. an unrounded vector to core.get_node (see #6774). So it should behave like out C++ doubleToInt function (with d = 1.0):

Does that include the buggy rounding of 0.49999999999999994 to 1 ?
Also does this mean I trash this PR or maybe instead the engine needs a bugfix?

Edit: I'm too dumb to do the C++ thing

@Desour
Copy link
Member

Desour commented Jun 21, 2024

Does that include the buggy rounding of 0.49999999999999994 to 1 ?

Looks like it. ¯\_(ツ)_/¯

@kromka-chleba
Copy link
Contributor Author

Well, if that's what your core.get_node rounding does then it has a bug.
xyz: -0.49999999999999994 is still in the (0,0,0) node, and not as the rounding implies in (-1, -1, -1).
Unless we assume that -0.49999999999999994 = -0.5 = -1, which is something I can accept after spending so much time on this trivial issue.
IMO the assumption that math.round should work like the buggy counterpart on the C++ side is wrong, because it isn't only used for node position rounding. So either fix the C++ one, or accept that the Lua one will have the proper behavior from now on.
And if someone cares enough, they can maintain their own buggy copy of math.round in their projects.

@appgurueu
Copy link
Contributor

Edit: I'm too dumb to do the C++ thing

A quick grep -r doubleToInt tells me that it's only used with d = 1.0 as Desour said:

script/common/c_converter.cpp:	return doubleToInt(pf, 1.0);
script/common/c_converter.cpp:	return doubleToInt(pf, 1.0);
script/lua_api/l_env.cpp:	v3s16 pos = doubleToInt(v3d(x, y, z), 1.0);
util/numeric.h:inline v3s16 doubleToInt(v3d p, double d)

so we can basically get rid of d. After that, shouldn't it just be std::round? It seems this rounds the way we want it to:

rounding halfway cases away from zero

So we'd get

inline v3s16 doubleToInt(v3d p)
{
	return {std::round(p.X), std::round(p.Y), std::round(p.Z)};
}

@HybridDog
Copy link
Contributor

Using math.floor(x - 0.5) + 1 and math.ceil(x + 0.5) - 1 instead of math.floor(x + 0.5) and math.ceil(x - 0.5) could be another way to fix the problem.
I think for positive x, x + 0.5 is always absolutely bigger than x whereas, if x >= 0.25, x - 0.5 is always smaller than x. If the result of an addition or subtraction of 0.5 is absolutely bigger than the input, the least significant bits of the mantissa of the input number can be lost, which should not be the case if the result is absolutely smaller.

@kromka-chleba
Copy link
Contributor Author

HybridDog, appgurueu's math.random works fine. Is there any advantage of using what you just proposed?
And I'm almost sure that doing subtraction and addition of 0.5 to 0.49999999999999994 will result in the very same bug. If you care enough you can use my unit test and check the correctness and do some benchmarks.
The only thing left to do is fixing the other C++ function, though I believe that's out of scope of this PR?

@TubberPupper
Copy link

TubberPupper commented Jun 21, 2024

if x < 0 then
  local int = math.ceil(x)
  local frac = x - int
  return int - ((frac <= -0.5) and 1 or 0)
end
local int = math.floor(x)
local frac = x - int
return int + ((frac >= 0.5) and 1 or 0)

To shave more lines, why not remove the if statement in favour of:

local int = x < 0 and math.ceil(x) or math.floor(x)
local frac = x - int
return x < 0 and int - ((frac <= -0.5) and 1 or 0) or int + ((frac >= 0.5) and 1 or 0)

A bit more complicated to read, but should be completely functional and the behaviour nonetheless the same. I haven't tested the code myself yet though

Is the if statement intentional? Seems weird to define frac and int twice in one function

@appgurueu
Copy link
Contributor

It is possible to "shorten" this, but that's not the point. I want this to be somewhat readable and performant. I think the current form achieves this well enough.

Is the if statement intentional? Seems weird to define frac and int twice in one function

These definitions do not conflict since they are in different scopes. If you prefer it, you could write it using else instead, but I personally don't like redundant branching logic (else after a return in this case).

Copy link
Contributor

@appgurueu appgurueu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that fixing the C++ doesn't have to happen in this PR.

(It's also potentially a bit trickier / more work than anticipated: doubleToInt isn't the only relevant function, there's also floatToInt. And there the d parameter is unfortunately used due to our BS BS.)

@rubenwardy rubenwardy merged commit 2885784 into minetest:master Jun 24, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

math.round returns 1 for 0.49999999999999994
8 participants