Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect vrIsTrailingZeros? #117

Open
abolz opened this issue Jun 1, 2019 · 14 comments
Open

Incorrect vrIsTrailingZeros? #117

abolz opened this issue Jun 1, 2019 · 14 comments

Comments

@abolz
Copy link
Contributor

abolz commented Jun 1, 2019

In d2s.c, should this call

ryu/ryu/d2s.c

Line 315 in 00ecce8

vrIsTrailingZeros = multipleOfPowerOf2(mv, q - 1);
use q instead of q-1?

As I understand it, the code uses q' = max(0, q-1) to make sure that the loop below is executed at least once, which is required to get a correct value for lastRemovedDigit which in turn is required to correctly round the result. Later, the values of vrIsTrailingZeros and lastRemovedDigit are used to determine if the exact number ends in ...5000...0 here:

ryu/ryu/d2s.c

Line 378 in 00ecce8

if (vrIsTrailingZeros && lastRemovedDigit == 5 && vr % 2 == 0) {

Now, might it be possible that vrIsTrailingZeros is missing one digit, i.e.

                missed digit?
                v
            ...5?000...0
               ^ ~~~~~~^
lastRemovedDigit       vrIsTrailingZeros

I haven't found a counter-example... so it's probably me, who is missing something...

NB: the e2 >= 0 branch is using multipleOfPowerOf5(mv, q).


Related: In f2s.c,

ryu/ryu/f2s.c

Line 195 in 00ecce8

vrIsTrailingZeros = multipleOfPowerOf5(mv, q);

Shouldn't this use q-1 instead of q? In this case vrIsTrailingZeros includes the value of the last removed digit, but shouldn't it only include the digits after the last removed digit?

lastRemovedDigit
               v
            ...5000...0
               ~~~~~~~^
                      vrIsTrailingZeros

I have replaced the assignment above with

vrIsTrailingZeros = (q == 0) || multipleOfPowerOf5(mv, q - 1);

and tested all single-precision numbers: It doesn't affect the output. So, again, I'm probably missing something here...

NB: the e2 < 0 branch is using multipleOfPowerOf2(mv, q - 1) here.

@ulfjack
Copy link
Owner

ulfjack commented Jun 1, 2019

The short answer is no.

We have three values vm, vr, and vp. For vm and vp, the code checks whether all q removed digits were 0, so it needs to check if they are multiples of 10^q. For vr, the code checks whether vr ends in 500...0, where these are the last q digits, so it only checks whether there are q-1 trailing 0s, i.e., whether vr is a multiple of 10^(q-1). Of course, it then also needs to check whether the qth digit is 5, which happens below.

The other branch should also use q - 1, although it's possible that there is a reason why it doesn't, except I can't remember it. If the current code is wrong, then it should be possible to construct a case by reasoning backwards from vr / mv to obtain a bit pattern that would cause a different outcome for this branch.

@abolz
Copy link
Contributor Author

abolz commented Jun 1, 2019

We have three values vm, vr, and vp. For vm and vp, the code checks whether all q removed digits were 0, so it needs to check if they are multiples of 10^q. For vr, the code checks whether vr ends in 500...0, where these are the last q digits, so it only checks whether there are q-1 trailing 0s, i.e., whether vr is a multiple of 10^(q-1). Of course, it then also needs to check whether the qth digit is 5, which happens below.

This seems to be the case for the single-precision implementation.

But in the double-precision case, we remove one digit less (q' = q - 1), so that the loop is executed at least once to get a correct value for lastRemovedDigit. So

vr = ? ? ? ? R X r r ... r
vm = ? ? ? ? z z z z ... z
vp = ? ? ? ? w w w w ... w
             ^ ^
             q q'

Here R will be the lastRemovedDigit and we need to know whether [X r r ... r] are all zeros, i.e. whether vr is divisible by 10^(q-1), but the code uses 10^(q'-1) = 10^(q-2) and so, if all the rs are zero, but X is non-zero, vrIsTrailingZeros is wrong?

If the current code is wrong, then it should be possible to construct a case by reasoning backwards from vr / mv to obtain a bit pattern that would cause a different outcome for this branch.

I tried to do exactly that, but didn't get very far...

@abolz
Copy link
Contributor Author

abolz commented Jun 1, 2019

Ah, I should have noted that my q' is actually the same as q in the code. It might be confusing, sorry for that.

I do have a few double-precision test cases which do not correctly round if q-1 is used in the e2 >= 0 branch, i.e.

-- vrIsTrailingZeros = multipleOfPowerOf5(mv, q);
++ vrIsTrailingZeros = (q == 0) || multipleOfPowerOf5(mv, q - 1);

and the test cases:

2.0919495182368195e+19 => 2.0919495182368194e+19
2.6760179287532483e+19 => 2.6760179287532482e+19
3.2942957306323907e+19 => 3.2942957306323906e+19
3.9702293349085635e+19 => 3.9702293349085634e+19
4.0647939013152195e+19 => 4.0647939013152194e+19

@ulfjack
Copy link
Owner

ulfjack commented Jun 1, 2019

It doesn't matter what the exact value of q is, we need to use q for vm and vp, and q-1 for vr. This is 'just' the base case, we then incrementally update the flags during the loop as we remove additional digits.

Let's say we remove n digits total. When we start, we don't know what the value of n is; however, we know that n >= q. So we remove q digits, and then compute some flags. If we then remove more digits (n > q), we update the flags in each step. At least, that's what we do in the general case - note that it's rare that the flags are true after removing q digits.

I'm sure there's a reason why we use q rather than q-1 in that case, but I can't say even after looking at the code and my own paper. :-/

Since you're bringing this up, I wonder if the code would be faster if we computed the flags after we know n rather than before. I have since also found a slightly different formulation of the check for exactly 0.5-ness which might lead to slightly simpler code.

@abolz
Copy link
Contributor Author

abolz commented Jun 1, 2019

It doesn't matter what the exact value of q is, we need to use q for vm and vp, and q-1 for vr.

and we need to know the value of the last removed digit.

But in the double-precision implementation, lastRemovedDigit is initialized to 0, which might be incorrect.

The correct value of lastRemovedDigit is then computed in the first loop. q' = q - 1 is chosen exactly because we need to have the correct value of lastRemovedDigit and vrIsTrailingZeros to correctly round the result and this choice makes sure that the loop is executed at least once.

So, for

vr = ? ? ? ? R X r r ... r
vm = ? ? ? ? z z z z ... z
vp = ? ? ? ? w w w w ... w
             ^ ^
             q q'

the initial value of lastRemovedDigit is actually wrong if X != 0 (because we only test the last q' - 1 decimal digits) and consequently the value of lastRemovedDigit might be wrong in all subsequent iterations.

That being said, I'm a bit confused right now :-) and will take a look at your reasoning again tomorrow. I'm sure you are correct here.

@ulfjack
Copy link
Owner

ulfjack commented Jun 1, 2019

If we know that n > q, and the loop will run at least once, then it's correct if we check q digits for 0, rather than q - 1, otherwise the number cannot end in 500...0, because that's n - 1 >= q zeroes.

@abolz
Copy link
Contributor Author

abolz commented Jun 2, 2019

If we know that n > q, and the loop will run at least once, then it's correct if we check q digits for 0, rather than q - 1, otherwise the number cannot end in 500...0, because that's n - 1 >= q zeroes.

I'm not sure what you mean by this.

But let me make one last attempt to explaing my reasoning here.

The current implementation of the double-precision version basically works like this: (The important part here is the initialization of lastRemovedDigit and the validity of the flags before and after the first iteration.)

if (e2 >= 0) {
    vr, vm, vp = (mv, mm, mp) * 2^e2
    q_prime = log10Pow2(e2)
    q = max(0, q_prime - 1)
    vrIsTrailingZeros = multipleOfPowerOf5(mv, q)       // vr % 10^q == 0?
    vmIsTrailingZeros = multipleOfPowerOf5(mm, q)
    vpIsTrailingZeros = multipleOfPowerOf5(mp, q)
} else {
    vr, vm, vp = (mv, mm, mp) * 5^(-e2)
    q_prime = log10Pow5(-e2)
    q = max(0, q_prime - 1)
    vrIsTrailingZeros = multipleOfPowerOf2(mv, q - 1)   // vr % 10^(q - 1) == 0?
    vmIsTrailingZeros = multipleOfPowerOf2(mm, q)
    vpIsTrailingZeros = multipleOfPowerOf2(mp, q)
}

vr /= 10^q // Remove q digits.
vm /= 10^q
vp /= 10^q

vp -= vpIsTrailingZeros

lastRemovedDigit = 0

// This loop is executed at least once if q_prime > 0
while (vm / 10 < vp / 10) {
    vmIsTrailingZeros &= vm % 10 == 0
    vrIsTrailingZeros &= lastRemovedDigit == 0
    lastRemovedDigit = vr % 10
    vr /= 10
    vm /= 10
    vp /= 10
}

[round...]

If the loop is never executed, then at [round...], lastRemovedDigit = 0 and vrIsTrailingZeros = true, which are the correct values.

Otherwise,

In the e2 < 0 branch: we check whether the last q - 1 decimal digits of vr are 0 and then remove q decimal digits from vr. The actual value of the q-th digit is then lost and it will never be checked in the loop:

lastRemovedDigit gets initialized to 0, which means that in the first iteration,

  1. vrIsTrailingZeros will be unchanged and
  2. vr % 10 is assigned to lastRemovedDigit and
  3. Remove a single decimal digit from vr,vm,vp.

Now lastRemovedDigit has the correct value, but vrIsTrailingZeros basically ignores the value of the q-th removed digit. It might therefore be invalid in all subsequent iterations and particularly at [round...], where we need the correct values to properly round the result.

(Step 1. above maybe might be interpreted as switching from vrIsTrailingZeros to vrPrevIsTrailingZeros.)

If, however, the code would use q instead of q - 1 --- as the e2 >= 0 branch does --- the q-th digit would not be ignored and vrIsTrailingZeros would have the correct value after the first loop iteration.

I hope this makes some sense now...

@ulfjack
Copy link
Owner

ulfjack commented Jun 2, 2019

Ah, got you. The tests all pass when I replace q - 1 by q in this case, so clearly we don't have cases that trigger any misbehavior.

I suspect that the last digits of the decimal representation can never be 5x0000 for x != 0. I remember looking at the last few digits for a few cases, and being surprised that there were only specific patterns possible.

On the other hand, it may also be an actual bug, but if so, it must be extremely rare. There were some people who ran a few billion tests against another implementation and did not find any differences. (I only ran a few million...)

@abolz
Copy link
Contributor Author

abolz commented Jun 3, 2019

I suspect that the last digits of the decimal representation can never be 5x0000 for x != 0. I remember looking at the last few digits for a few cases, and being surprised that there were only specific patterns possible.

Ah yes, this seems to be the case here and the code seems to be working:

In the e2 < 0 branch, vrIsTrailingZeros = (mv % 2^(q-1) == 0) can only be a false positive. For this to be true, we need

mv = N * 2^q + 2^(q-1) = 2^(q-1) * (2N + 1).

and since mv = 4 * m2, we need q >= 3. It then follows that

vr = floor( mv * 5^-e2 / 10^q ) = floor( (2N + 1) 5^-(e2 + q) / 2 ).

Now every 5^k, k >= 2, can be written as 5^k = 100 x + 25, from which it follows that the product of an odd number and a power of 5 can be written as either 100 x + 25 or 100 x + 75. This means

vr = floor((100 x + 25) / 2) = 50 x + 12   or
vr = floor((100 x + 75) / 2) = 50 x + 37

i.e., the first removed digit is either 2 or 7 and, in the rounding step, either lastRemovedDigit != 5 (in which case vrIsTrailingZeros is unused) or vrIsTrailingZeros = false.

I'm not sure if this is a complete proof, but it seems that the code is working.

@ulfjack
Copy link
Owner

ulfjack commented Jun 3, 2019

Wow! Nice work! That would seem to imply that we can remove that part, IIUC?

@abolz
Copy link
Contributor Author

abolz commented Jun 5, 2019

It means that the code is working as is for mantissas of the form (2N+1) 2^(q-1). I'm not sure which part you are referring to here.

I still think we should use q instead of q-1 in the e2 < 0 branch. At least for consistency. And it would allow us to remove the (q <= 1) special case.

BTW, I'd like to know what your "different formulation of the check for exactly 0.5-ness" looks like. Ryu is so great because its so simple and so fast. Would be nice to make it even simpler :-)

ulfjack added a commit that referenced this issue Jun 5, 2019
This is not strictly necessary, because we can show correctness even with q-1, but it's more consistent.

Progress on #117.
plokhotnyuk added a commit to plokhotnyuk/jsoniter-scala that referenced this issue Jun 5, 2019
@StephanTLavavej
Copy link
Contributor

  1. Should commit 159ce15 have updated the comment immediately above,

    ryu/ryu/d2s.c

    Lines 309 to 315 in 159ce15

    } else if (q < 63) { // TODO(ulfjack): Use a tighter bound here.
    // We need to compute min(ntz(mv), pow5Factor(mv) - e2) >= q - 1
    // <=> ntz(mv) >= q - 1 && pow5Factor(mv) - e2 >= q - 1
    // <=> ntz(mv) >= q - 1 (e2 is negative and -e2 >= q)
    // <=> (mv & ((1 << (q - 1)) - 1)) == 0
    // We also need to make sure that the left shift does not overflow.
    vrIsTrailingZeros = multipleOfPowerOf2(mv, q);
    which mentions q - 1 repeatedly?

  2. Should f2s.c's analogous code,

    ryu/ryu/f2s.c

    Lines 231 to 232 in 159ce15

    } else if (q < 31) { // TODO(ulfjack): Use a tighter bound here.
    vrIsTrailingZeros = multipleOfPowerOf2(mv, q - 1);
    be changed accordingly? (Apologies if this question has been answered above.)

@abolz
Copy link
Contributor Author

abolz commented Jun 6, 2019

  1. Yes

  2. This seems to be somewhat more complicated. I think it should use q in the case where we don't explicitly compute lastRemovedDigit (in this case the code works exactly like the double-precision implementation) and otherwise it should use q-1. And the code in the e2 >= 0 branch should use these values, too. (My initial comment above was incorrect.) - However, for all possible floats, the current implementation generates the same output as the double-conversion library.

ulfjack added a commit that referenced this issue Jun 6, 2019
Progress on #117.
@plokhotnyuk
Copy link
Contributor

@abolz I bet that using of q will reduce probability of taking the slower branch with 2 cycles of rounding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants