Why does adding 0 to the end of float literal change how it rounds (possible GCC bug)?

Question

I discovered on my x86 VM (32 bit) that the following program:

#include <stdio.h>
void foo (long double x) {
    int y = x;
    printf("(int)%Lf = %d\n", x, y);
}
int main () {
    foo(.9999999999999999999728949456878623891498136799780L);
    foo(.999999999999999999972894945687862389149813679978L);
    return 0;
}

produces the following output:

(int)1.000000 = 1
(int)1.000000 = 0

Ideone also produces this behavior.

What is the compiler doing to allow this to happen?

I found this constant as I was tracking down why the following program didn't produce 0 as I expected (using 19 9s produced the 0 I expected):

int main () {
    long double x = .99999999999999999999L; /* 20 9's */
    int y = x;
    printf("%d\n", y);
    return 0;
}

As I tried to compute the value at which the result switches from expected to unexpected, I arrived at the constant this question is about.

Quick guess: That literal is larger than the largest number strictly <1 representable in a double, so it is being rounded to the closest double (i.e. 1.0) before anything else happens.
– BoBTFish, Commented May 30, 2013 at 8:38
@H2CO3 I have read this document and I do not see where the question is answered.
– Pascal Cuoq, Commented May 30, 2013 at 8:55
@user315052: A floating point literal can either be rounded to the nearest representable value, or to the next larger or smaller representable value. It's implementation-defined which, so your implementation should document what it does.
– caf, Commented May 30, 2013 at 8:57

caf · Accepted Answer · 2013-05-30 22:14:43Z

Your problem is that long double on your platform has insufficient precision to store the exact value 0.99999999999999999999. This means that the value of that constant must be converted to a representable value (this conversion happens during translation of your program, not at runtime).

This conversion can generate either the nearest representable value, or the next greater or smaller representable value. The choice is implementation-defined, so your implementation should document which it is using. It seems that your implementation uses an x87-style 80-bit long double and is rounding to the nearest value, resulting in a value of 1.0 stored in x.

With the assumed format for long double (with 64 mantissa bits), the highest representable number less than 1.0 is, in hexadecimal:

0x0.ffffffffffffffff

The number exactly halfway between this value and the next higher representable number (1.0) is:

0x0.ffffffffffffffff8

Your very long constant 0.9999999999999999999728949456878623891498136799780 is equal to:

0x0.ffffffffffffffff7fffffffffffffffffffffffa1eb2f0b64cf31c113a8ec...

which should obviously be rounded down if rounding to nearest, but you appear to have reached some limit of the floating point representation your compiler is using, or a rounding bug.

edited May 30, 2013 at 22:14

answered May 30, 2013 at 9:41

caf

7 Comments

Johannes Schaub - litb Over a year ago

why would the trailing 0 make a difference?

Chris Dodd Over a year ago

Actually, the "next greater or smaller" rounding is only allowed by the standard if FLT_RADIX is NOT 2. If FLT_RADIX IS 2, then it must be rounded correctly (to nearest). Since FLT_RADIX is 2 for pretty much all machines these days (any that use IEEE binary fp), the trailing 0 issue would indicate a bug in the compiler (or at least a failure to conform to the standard).

Mark Ransom Over a year ago

@JohannesSchaub-litb, it's possible that the literal conversion assumes the trailing digit is a 5 rather than a 0 for rounding purposes.

caf Over a year ago

@ChrisDodd: It's only hexadecimal floating constants that must be rounded correctly when FLT_RADIX is 2, the constants in the question are all decimal (at least, that is the case in C99).

caf Over a year ago

@JohannesSchaub-litb: I wrote this answer before that longer constant had been added to the question. It does appear to be some kind of implementation bug.

SigTerm · Answer · 2013-05-30 18:40:50Z

The compiler stores floating-point numbers in binary, as most compilers do.

According to Wolfram Alpha, the binary representation of

0.99999999999999999999

looks like this:

0.11111111111111111111111111111111111111111111111111111111111111111101000011000110101111011110011011011011011110111011100101000101010111011100001011010001001110001101011001010000110000101001111011111001111110000101010111111110100110000010001001101011001101010110110010010101101111101001110001100111101100000000100110110001100110000011000100100011000011110100001000000100001000101000111011010111111101011010010000010110011111110100100110001011001110100011100001111101011110101001000000111110010000101101001001010110010011001110111111100111101111100000111010001101101011000100110001010010001000100010110000101110100101010101001010100010001001100111111111001001101100000000010010001011110100101011101001001101001111001001000101011101001100111101110111111001101110100111000001111101101101101101110100100111101000000000111101101101001000111101100010101110011101110001110010110110111101000011110110100011000110101100011111111110111000010010001111000000000101100101000100101110100001001101000010110101000100011100000110010001110101...

That's 932 bits, and that STILL isn't enough to precisely represent your number (see dots at the end).

This means that as long as your underlying platform uses base 2 to store numbers, you will not be able to store exactly 0.99999999999999999999.

Because the number cannot be stored precisely, it is rounded up or down. With 20 9s it ends up being rounded up, and with 19 9s it ends up being rounded down.

To avoid this problem, instead of doubles you'll need to use a third-party mathematics/bignum library that stores numbers internally in a decimal base (for example, two decimal digits per byte) or that uses fractions (ratios) instead of floating-point numbers. That would solve your problem.

Wolf · Answer · 2013-05-30 12:21:30Z

When a double does not have enough precision to represent a value, the value is rounded up or down to the closest representable one. In your implementation it is rounded up to 1.

answered May 30, 2013 at 12:21

Wolf

Pete Becker · Answer · 2013-05-30 17:57:48Z

There are two conversions involved here. First, and in some ways most important, is the conversion of the literal .99999999999999999999L to long double. As others have said, this conversion rounds to the nearest representable value, which seems to be 1.0L. The second conversion is from the long double value that resulted from the first conversion to an integer value. That conversion rounds toward 0, which is why a quick examination suggests that the value of y should be 0. But because the first conversion produced 1 and not a value slightly less than 1, this conversion also produces 1.

edited May 30, 2013 at 17:57

Pascal Cuoq

answered May 30, 2013 at 17:56

Pete Becker
