28

I discovered on my x86 VM (32 bit) that the following program:

#include <stdio.h>
void foo (long double x) {
    int y = x;
    printf("(int)%Lf = %d\n", x, y);
}
int main () {
    foo(.9999999999999999999728949456878623891498136799780L);
    foo(.999999999999999999972894945687862389149813679978L);
    return 0;
}

Produces the following output:

(int)1.000000 = 1
(int)1.000000 = 0

Ideone also produces this behavior.

What is the compiler doing to allow this to happen?

I found this constant as I was tracking down why the following program didn't produce 0 as I expected (using 19 9s produced the 0 I expected):

#include <stdio.h>
int main () {
    long double x = .99999999999999999999L; /* 20 9's */
    int y = x;
    printf("%d\n", y);
    return 0;
}

As I tried to compute the value at which the result switches from expected to unexpected, I arrived at the constant this question is about.

36
  • 26
    Dupe hundreds of times over. Read this. Commented May 30, 2013 at 8:37
  • 3
    A classic dupe in fact. Commented May 30, 2013 at 8:38
  • 10
    Quick guess: That literal is larger than the largest number strictly <1 representable in a double, so it is being rounded to the closest double (i.e. 1.0) before anything else happens. Commented May 30, 2013 at 8:38
  • 7
    @H2CO3 I have read this document and I do not see where the question is answered. Commented May 30, 2013 at 8:55
  • 3
    @user315052: A floating point literal can either be rounded to the nearest representable value, or to the next larger or smaller representable value. It's implementation-defined which, so your implementation should document what it does. Commented May 30, 2013 at 8:57

4 Answers

32
+50

Your problem is that long double on your platform has insufficient precision to store the exact value 0.99999999999999999999. This means that the constant must be converted to a representable value (this conversion happens during translation of your program, not at runtime).

This conversion can generate either the nearest representable value, or the next greater or smaller representable value. The choice is implementation-defined, so your implementation should document which it is using. It seems that your implementation uses an x87-style 80-bit long double and rounds to the nearest value, resulting in a value of 1.0 stored in x.


With the assumed format for long double (with 64 mantissa bits), the highest representable number less than 1.0 is, in hexadecimal:

0x0.ffffffffffffffff

The number exactly halfway between this value and the next higher representable number (1.0) is:

0x0.ffffffffffffffff8

Your very long constant 0.9999999999999999999728949456878623891498136799780 is equal to:

0x0.ffffffffffffffff7fffffffffffffffffffffffa1eb2f0b64cf31c113a8ec...

which should obviously be rounded down if rounding to nearest, but you appear to have reached some limit of the floating point representation your compiler is using, or a rounding bug.


7 Comments

why would the trailing 0 make a difference?
Actually, the "next greater or smaller" rounding is only allowed by the standard if FLT_RADIX is NOT 2. If FLT_RADIX IS 2, then it must be rounded correctly (to nearest). Since FLT_RADIX is 2 for pretty much all machines these days (any that use IEEE binary fp), the trailing 0 issue would indicate a bug in the compiler (or at least a failure to conform to the standard).
@JohannesSchaub-litb, it's possible that the literal conversion assumes the trailing digit is a 5 rather than a 0 for rounding purposes.
@ChrisDodd: It's only hexadecimal floating constants that must be rounded correctly when FLT_RADIX is 2, the constants in the question are all decimal (at least, that is the case in C99).
@JohannesSchaub-litb: I wrote this answer before that longer constant had been added to the question. It does appear to be some kind of implementation bug.
5

Your compiler stores floating-point numbers in binary, as most compilers do.

According to wolframalpha, binary representation of

0.99999999999999999999

looks like this:

0.11111111111111111111111111111111111111111111111111111111111111111101000011000110101111011110011011011011011110111011100101000101010111011100001011010001001110001101011001010000110000101001111011111001111110000101010111111110100110000010001001101011001101010110110010010101101111101001110001100111101100000000100110110001100110000011000100100011000011110100001000000100001000101000111011010111111101011010010000010110011111110100100110001011001110100011100001111101011110101001000000111110010000101101001001010110010011001110111111100111101111100000111010001101101011000100110001010010001000100010110000101110100101010101001010100010001001100111111111001001101100000000010010001011110100101011101001001101001111001001000101011101001100111101110111111001101110100111000001111101101101101101110100100111101000000000111101101101001000111101100010101110011101110001110010110110111101000011110110100011000110101100011111111110111000010010001111000000000101100101000100101110100001001101000010110101000100011100000110010001110101...

That's 932 bits, and that STILL isn't enough to precisely represent your number (see dots at the end).

This means that as long as your platform stores numbers in base 2, you will not be able to store 0.99999999999999999999 exactly.

Because the number cannot be stored precisely, it is rounded up or down. With 20 9s it ends up being rounded up, and with 19 9s it ends up being rounded down.

To avoid this problem, instead of doubles you'll need to use some kind of 3rd party mathematics/bignum library that stores numbers internally using decimal base (i.e. two decimal digits per byte or something) or uses fractions (ratios) instead of floating point numbers. That would solve your problem.


3

When a floating-point type does not have enough precision to represent a value, the value is rounded up or down to the closest representable one. In your implementation it is rounded up to 1.


2

There are two conversions involved here. First, and in some ways most important, is the conversion of the literal .99999999999999999999L to long double. As others have said, this conversion rounds to the nearest representable value, which seems to be 1.0L. The second conversion is from the long double value that resulted from the first conversion to an integer value. That conversion rounds toward 0, which is why a quick examination suggests that the value of y should be 0. But because the first conversion produced 1 and not a value slightly less than 1, this conversion also produces 1.

