16 May 2015

43560

## Rounding and integer math

Image:

It sometimes happens that in your program you need to divide integer values and then round the result.
Something like

`    int result = lrint((double)int_a / (double)int_b);`

But sometimes it is desirable to not use floating point math at all. The question is: Can we do the same calculation with integer math and get exactly the same result?

Let's simplify our case, by limiting to unsigned values (for signed values it is very similar, only the formulas get a bit more complex).
Let's also convert the lrint() into integer truncation:

`    #define trunc(x) ((int)(x))    unsigned int result = trunc( ((double)uint_a / (double)uint_b) + 0.5 );`

If we wanted to write this in integer math, we could do something like

`    unsigned int result = (uint_a + (uint_b/2)) / uint_b;`

But is this really the same?

Lets have a look at it in a more mathematical way:

We define 2 functions:
round(x) rounds x to the nearest natural number
trunc(x) rounds x to the next natural number towards 0

First the case where b is an even number.
Let a be a positive natural number and b be an even positive natural number
(b = 2·c)

round(ab)
= trunc((ab) + 0.5)
= trunc((a ∕ (2·c)) + 0.5)
= trunc((a ∕ (2·c)) + (0.5·(2·c) ∕ (2·c)))
= trunc((a + c) ∕ b)
= trunc((a + (b∕2)) ∕ b)

So if b is an even number, the math is right.
What about odd numbers? obviously we are truncating the b ∕ 2 inside our formula, when using intergers only.

Let a be a positive natural number and b be an odd positive natural number
b = 2·c + 1
=> b ∕ 2 = c + 0.5
=> c = b ∕ 2 - 0.5

We still have

trunc((a + (b ∕ 2)) ∕ b)
= trunc((a + ((2·c + 1) / 2)) ∕ b)
= trunc((a + c + 0.5) ∕ b)
= trunc(((a + c) ∕ b) + (0.5 ∕ b))

This is different from our integer math, which would look like this:

trunc((a + trunc(b ∕ 2)) ∕ b)
= trunc((a + trunc((2·c + 1) ∕ 2)) ∕ b)
= trunc((a + trunc(c + 0.5)) ∕ b)
= trunc((a + c) ∕ b)

But are those different?
Let's make the assumption they are different. Let us also limit the
range of a to a value between 1 and b. This can be justified, since you can always
factor out an instance of b for all a > b, with a = a0 + n·b.

Let 1 <= a < b

First some pre-investigation, that will be helpful later:

(a + c) ∕ b
= (a + (b ∕ 2 - 0.5)) ∕ b
< (a + b ∕ 2) ∕ b
= a ∕ b + 0.5

=> (a + c) ∕ b < ab + 0.5
[since a < b => a ∕ b < 1]
=> (a + c) ∕ b < 1.5

Assumption:
trunc((a + c) ∕ b) ≠ trunc((a + c) ∕ b + 0.5 ∕ b)

There are 2 cases, where this can be true:
case A: (a + c) ∕ b + 0.5 ∕ b ≥ 2
case B: (a + c) ∕ b < 1 & (a + c) ∕ b + 0.5 ∕ b > 1

case A:
(a + c) ∕ b + 0.5 ∕ b ≥ 2
[since (a + c) ∕ b < 1.5]
=> 0.5 ∕ b ≥ 0.5

case B:
(a + c) ∕ b < 1 & (a + c) ∕ b + 0.5 ∕ b > 1
=> (a + c) ∕ b < 1 & (a + c + 0.5) ∕ b > 1
=> (a + c) < b & (a + c + 0.5) > b
[since b = (2·c + 1)]
=> (a + c) < (2·c + 1) & (a + c + 0.5) > (2·c + 1)
=> a < c + 1 & a > c + 0.5

=> (c + 0.5) < a < (c + 1)
[since a is an integer]
=> (c + 1) ≤ a < (c + 1)