1

Calculating the maximum value for a long is easy:

It is simply 2 to the power of n-1, minus 1, where n is the number of bits in the type. For a long this is defined as 64 bits. Because we must represent negative numbers as well, we use n-1 instead of n. Because 0 must be accounted for, we subtract 1. So the maximum value is:

MAX = 2^(n-1)-1
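
The formula above can be checked with a short sketch. The class name `LongMax` and the helper `maxSigned` are made up for illustration; `BigInteger` is used so that the intermediate value 2^(n-1) does not overflow a long.

```java
import java.math.BigInteger;

public class LongMax {
    // Largest value of an n-bit two's-complement integer: 2^(n-1) - 1.
    // (Hypothetical helper, not part of the JDK.)
    static BigInteger maxSigned(int bits) {
        return BigInteger.ONE.shiftLeft(bits - 1).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        BigInteger max = maxSigned(Long.SIZE); // Long.SIZE is 64
        System.out.println(max);               // 9223372036854775807
        System.out.println(max.equals(BigInteger.valueOf(Long.MAX_VALUE))); // true
    }
}
```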

What is the equivalent thought process for a double?

Double.MAX_VALUE

comes to be

1.7976931348623157E308
5
  • 6
    Google floating point representation, making special note of radix, exponent and mantissa. Those three things will define your thought process. Commented Sep 13, 2013 at 12:19
  • en.wikipedia.org/wiki/… Commented Sep 13, 2013 at 12:20
  • possible duplicate of How to Calculate Double + Float Precision Commented Sep 13, 2013 at 12:21
  • Note that 1.7976931348623157E308 is for 64 bits; for 32 bits it would be about 3.4e38. Commented Sep 13, 2013 at 12:23
  • That is by definition what a Java Double is - 64 bits. Double.MAX_VALUE. Commented Sep 13, 2013 at 12:24

4 Answers

4

The maximum finite value for a double is, in hexadecimal format, 0x1.fffffffffffffp1023, representing the product of a number just below 2 (1.ff… in hexadecimal notation) by 2^1023. When written this way, it is easy to see that it is made of the largest possible significand and the largest possible exponent, in a way very similar to the way you build the largest possible long in your question.


If you want a formula where all numbers are written in the decimal notation, here is one:

Double.MAX_VALUE = (2 - 1/2^52) * 2^1023

Or if you prefer a formula that makes it clear that Double.MAX_VALUE is an integer:

Double.MAX_VALUE = 2^1024 - 2^971
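
As a rough check of both formulas, here is a minimal sketch. The class and method names are invented for this example; `Math.scalb` multiplies by a power of two, and `new BigDecimal(double)` yields the exact value stored in the double.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class DoubleMax {
    // (2 - 2^-52) * 2^1023, evaluated in double arithmetic.
    // Both steps are exact: 2 - 2^-52 fits in a 53-bit significand.
    static double fromSignificandAndExponent() {
        double significand = 2.0 - Math.scalb(1.0, -52);
        return Math.scalb(significand, 1023);
    }

    // 2^1024 - 2^971, evaluated exactly with BigInteger.
    static BigInteger asExactInteger() {
        return BigInteger.ONE.shiftLeft(1024)
                .subtract(BigInteger.ONE.shiftLeft(971));
    }

    public static void main(String[] args) {
        System.out.println(fromSignificandAndExponent() == Double.MAX_VALUE); // true
        System.out.println(0x1.fffffffffffffp1023 == Double.MAX_VALUE);       // true
        BigInteger exact = new BigDecimal(Double.MAX_VALUE).toBigIntegerExact();
        System.out.println(exact.equals(asExactInteger()));                   // true
    }
}
```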


2 Comments

@Bill-TheButcher-Cutting I have added a link to blogs.oracle.com/darcy/entry/… where an explanation is offered.
@Bill-TheButcher-Cutting It is actually focused on binary, because the floating-point format is binary. Hexadecimal is just a way to avoid the manipulation of strings of 53 digits. The same applies to your explanation of “MAX = 2^(n-1)-1” for integers: your own explanation is focused on binary. It does not make sense in decimal, because integer types are not represented in decimal internally.
0

If we look at the representation provided by Oracle:

0x1.fffffffffffffp1023

or

(2-2^-52)·2^1023

We can see that

fffffffffffff

is 13 hexadecimal digits that can be represented as 52 binary digits ( 13 * 4 ).

If each is set to 1 as it is ( F = 1111 ), we obtain the maximum fractional part.

The fractional part is always 52 bits as defined by

http://en.wikipedia.org/wiki/Double-precision_floating-point_format

1 bit is for sign

and the remaining 11 bits make up the exponent.

Because the exponent must be both positive and negative and it must represent 0, it too can have a maximum value of:

2^10 - 1

or

1023
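
To see the three fields directly, the bits of a double can be pulled apart with `Double.doubleToLongBits`. This is a sketch with made-up helper names. Note that the 11-bit exponent field is stored with a bias of 1023 under IEEE 754, so the raw field for Double.MAX_VALUE is 2046 and the exponent is 2046 - 1023 = 1023.

```java
public class DoubleBits {
    // Top bit: sign (0 for positive values).
    static long signBit(double d) {
        return Double.doubleToLongBits(d) >>> 63;
    }

    // Next 11 bits: exponent field as stored, i.e. still biased by 1023.
    static int biasedExponent(double d) {
        return (int) ((Double.doubleToLongBits(d) >>> 52) & 0x7FFL);
    }

    // Low 52 bits: fractional part of the significand.
    static long fraction(double d) {
        return Double.doubleToLongBits(d) & 0x000FFFFFFFFFFFFFL;
    }

    public static void main(String[] args) {
        double max = Double.MAX_VALUE;
        System.out.println(signBit(max));                    // 0
        System.out.println(biasedExponent(max) - 1023);      // 1023
        System.out.println(Long.toHexString(fraction(max))); // fffffffffffff
        System.out.println(Math.getExponent(max));           // 1023
    }
}
```

The raw field value 2047 does not appear for any finite double; it is the encoding used for infinities and NaN.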

1 Comment

“Because the exponent must be both positive and negative and it must represent 0, it to can have a maximum value of” This is probably the wrong way to explain how to decode the exponent of a floating-point number. The ranges available for positive and negative values do not have to be of the same width. Two values are reserved (for subnormals and for NaN/Inf), and indeed, the values were reserved to the detriment of the negative range, which only goes down to -1022. It is a coincidence that the maximum exponent of a finite value corresponds to the maximum value of a 10-bit integer in two's complement.
0

Doubles (and floats) are represented internally as binary fractions according to the IEEE 754 standard and therefore cannot represent most decimal fractions exactly.

So there is no equivalent calculation.

2 Comments

Right, that's simply the maximum fraction, but the values stored internally aren't necessarily precise integers.
@Dan All the very large doubles, including Double.MAX_VALUE, are exact integers. Every finite double number is exactly representable as a decimal fraction. Of course, it does not work the other way round. Many short, simple decimal fractions are not exactly representable in any binary fraction format.
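
The point made in the comment above can be illustrated with a small sketch, assuming finite inputs only. `isExactInteger` is a hypothetical helper; `new BigDecimal(double)` exposes the exact value a double holds.

```java
import java.math.BigDecimal;

public class ExactValues {
    // True if the finite double d holds an exact integer value.
    // (Hypothetical helper; NaN and infinities are not handled.)
    static boolean isExactInteger(double d) {
        try {
            new BigDecimal(d).toBigIntegerExact();
            return true;
        } catch (ArithmeticException e) {
            return false; // d has a nonzero fractional part
        }
    }

    public static void main(String[] args) {
        System.out.println(isExactInteger(Double.MAX_VALUE)); // true
        System.out.println(isExactInteger(0.5));              // false

        // The double nearest to 0.1 is an exact decimal fraction,
        // but it is not equal to the decimal 0.1.
        BigDecimal stored = new BigDecimal(0.1);
        System.out.println(stored.compareTo(new BigDecimal("0.1")) == 0); // false
    }
}
```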
0

Just take a look at the documentation. Basically, the MAX_VALUE computation for Double uses a different formula because of the finite number of real numbers that can be represented on 64 bits. For an extensive justification, you can consult this article about data representation.
