Why does left+(right-left)/2 not overflow?

Question

This article, http://googleresearch.blogspot.sg/2006/06/extra-extra-read-all-about-it-nearly.html, points out that most binary search and mergesort implementations contain a bug: they compute the midpoint as (left+right)/2, which can overflow. The suggested fix is to use left+(right-left)/2 instead. The same fix is given in the question "Bug in quicksort example (K&R C book)?".

My question is: why does left+(right-left)/2 avoid overflow, and how can that be proven? Thanks in advance.

Konrad Rudolph · Accepted Answer · 2014-11-27 10:20:19Z

53

You have left <= right by definition, and since both are array indices, left >= 0.

As a consequence, 0 <= right - left <= right, and furthermore left + (right - left) = right (follows from basic algebra).

And consequently left + (right - left) / 2 <= right. So no overflow can happen since every step of the operation is bounded by the value of right.

By contrast, consider the buggy expression, (left + right) / 2. left + right >= right, and since we don’t know the values of left and right, it’s entirely possible that that value overflows.
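As an illustration of where this matters, here is a minimal sketch of a binary search in C that uses the overflow-safe midpoint. The names midpoint and binary_search are hypothetical and not taken from the article or from K&R. The sketch assumes the indices satisfy 0 <= left <= right, which the assert checks.

```c
#include <assert.h>

/* Hypothetical helper: midpoint of two array indices with 0 <= left <= right. */
int midpoint(int left, int right)
{
    assert(0 <= left && left <= right);
    /* right - left lies in [0, right], so adding half of it to left
       can never exceed right. */
    return left + (right - left) / 2;
}

/* Binary search over a sorted int array of length n.
   Returns the index of key, or -1 if key is not present. */
int binary_search(const int *a, int n, int key)
{
    int left = 0;
    int right = n - 1;
    while (left <= right) {
        int mid = midpoint(left, right);
        if (a[mid] < key)
            left = mid + 1;
        else if (a[mid] > key)
            right = mid - 1;
        else
            return mid;
    }
    return -1;
}
```

For example, with the sorted array {1, 3, 5, 7, 9}, binary_search(a, 5, 7) returns 3 and binary_search(a, 5, 4) returns -1.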

answered Nov 27, 2014 at 10:20

Konrad Rudolph


Barmar · Accepted Answer · 2014-11-27 10:22:08Z

16

Suppose (to make the example easier) the maximum integer is 100, left = 50, and right = 80. If you use the naive formula:

int mid = (left + right)/2;

the addition will result in 130, which overflows.

If you instead do:

int mid = left + (right - left)/2;

you can't overflow in (right - left) because you're subtracting a smaller non-negative number from a larger one. That always results in a number no larger than right, so it can't possibly go over the maximum. E.g. 80 - 50 = 30.

And since the result is the average of left and right, it must be between them. Since these are both no greater than the maximum integer, anything between them is also no greater than the maximum, so there's no overflow.

answered Nov 27, 2014 at 10:22

Barmar


Jongware · Accepted Answer · 2014-11-27 10:52:08Z

8

Basic logic.

1. by definition left <= MAX_INT
2. by definition right <= MAX_INT
3. left+(right-left) is equal to right, which already is <= MAX_INT per #2
4. and so left+(right-left)/2 must also be <= MAX_INT, since x/2 is never larger than x for non-negative x.

Compare to the original

1. by definition left <= MAX_INT
2. by definition right <= MAX_INT
3. therefore left+right <= MAX_INT
4. and so (left+right)/2 <= MAX_INT

where statement 3 is clearly false, since left can be MAX_INT (statement 1) and so can right (statement 2).
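Written out as a chain of inequalities, the argument might look like the following sketch. It writes l, r and M for left, right and MAX_INT, uses the floor for integer division, and assumes 0 <= l <= r <= M. The non-negativity of l is an added assumption: it holds for array indices but is not stated in the steps above.

```latex
\begin{align*}
0 \le l \le r \le M
  &\implies 0 \le r - l \le r \le M
  && \text{(the subtraction stays in range)} \\
  &\implies 0 \le \left\lfloor \tfrac{r - l}{2} \right\rfloor \le r - l
  && \text{(halving a non-negative value cannot grow it)} \\
  &\implies l \le l + \left\lfloor \tfrac{r - l}{2} \right\rfloor \le l + (r - l) = r \le M
  && \text{(the final addition stays in range)}
\end{align*}
```

Every intermediate value therefore lies between 0 and M. The sum l + r, by contrast, can be as large as 2M.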

edited Nov 27, 2014 at 10:52

answered Nov 27, 2014 at 10:20

Jongware


TripeHound · Accepted Answer · 2022-10-27 10:28:41Z

8

A simple worked example will show it. For simplicity, assume numbers overflow above 999. If we have:

left = 997
right = 999

then:

left + right = 1996

which has overflowed before we get to the /2. However:

right - left = 2
(right-left)/2 = 1
left + (right-left)/2 = 997 + 1 = 998

So we've avoided the overflow.

More generally (as others have said): if both left and right are within range (and assuming right > left), then (right-left)/2 will be within range, and so too must left + (right-left)/2, since this must be less than right (you've increased left by half the gap between it and right).

edited Oct 27, 2022 at 10:28

answered Nov 27, 2014 at 10:21

TripeHound


1 Comment

Islomkhuja Akhrarov Over a year ago

Incorrect calculation: with left = 997 and right = 999, left + right = 1996, not 1995.

Community · Accepted Answer · 2021-07-15 13:27:50Z

6

In Java (assuming that is the language in question), int is a 32-bit type, so any result that does not fit in 32 bits wraps around. Numerically, adding 1 to Integer.MAX_VALUE (2147483647) yields -2147483648.

Coming back to the question, let's assume the following:

int left = 1;
int right = Integer.MAX_VALUE;
int mid;

Case 1:

mid = (left + right)/2;
// Here left + right overflows and wraps around to -2147483648, so mid ends up as -1073741824.

Case 2:

mid = left + (right - left)/2;
//This would not have the same problem as above as the value would never exceed "right".

In theory:

Both expressions have the same mathematical value, since left + (right - left)/2 = (2*left + right - left)/2 = (left + right)/2
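The equality can also be checked numerically. The following is only a sketch in C rather than Java, with hypothetical function names, and it assumes int is 32 bits wide so that a 64-bit sum cannot overflow. It computes the naive midpoint in a wider type as a reference and compares it with the overflow-safe form.

```c
#include <assert.h>
#include <stdint.h>

/* Reference value: do the naive sum in 64 bits, where it cannot overflow
   for 32-bit inputs (assumption: int is 32 bits wide). */
int midpoint_wide(int left, int right)
{
    int64_t sum = (int64_t)left + (int64_t)right;
    return (int)(sum / 2);
}

/* Overflow-safe form that stays in int, valid for 0 <= left <= right. */
int midpoint_narrow(int left, int right)
{
    assert(0 <= left && left <= right);
    return left + (right - left) / 2;
}
```

With left = 1 and right = 2147483647, both functions return 1073741824, which is the value that Case 1 fails to produce.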

Hope this answers your question.

edited Jul 15, 2021 at 13:27

CommunityBot


answered May 15, 2019 at 4:13

Tarun Kolla


rand · Accepted Answer · 2014-11-27 10:27:36Z

2

(This is more an intuitive explanation than a proof.)

Assume your data is unsigned char, and left = 100 and right = 255 (so right is at the edge of the range). If you do left + right, you'll get 355, which does not fit the unsigned char range, so it will overflow.

However, (right-left)/2 is a quantity X such that left + X <= right <= MAX, where MAX is 255 for unsigned char. This way, you can be sure that the sum can never overflow.
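Because an 8-bit range is so small, the intuition can be checked exhaustively. The sketch below uses hypothetical function names and uint8_t in place of unsigned char. It truncates every intermediate result to 8 bits to mimic a machine with 8-bit integers, since C would otherwise promote the operands to int and hide the wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit evaluation of left + (right - left) / 2; every intermediate
   result is truncated to uint8_t. */
uint8_t midpoint_u8(uint8_t left, uint8_t right)
{
    assert(left <= right);
    uint8_t gap = (uint8_t)(right - left);
    return (uint8_t)(left + gap / 2);
}

/* 8-bit evaluation of (left + right) / 2; the sum wraps modulo 256. */
uint8_t naive_midpoint_u8(uint8_t left, uint8_t right)
{
    uint8_t sum = (uint8_t)(left + right);
    return (uint8_t)(sum / 2);
}

/* Count the pairs 0 <= left <= right <= 255 where f disagrees with the
   exact midpoint, which is computed in int and therefore cannot wrap. */
int count_mismatches(uint8_t (*f)(uint8_t, uint8_t))
{
    int bad = 0;
    for (int l = 0; l <= UINT8_MAX; l++)
        for (int r = l; r <= UINT8_MAX; r++)
            if (f((uint8_t)l, (uint8_t)r) != (l + r) / 2)
                bad++;
    return bad;
}
```

With left = 100 and right = 255, midpoint_u8 gives 177, while naive_midpoint_u8 gives 49 because 355 wraps to 99 before the division. Over all pairs, count_mismatches(midpoint_u8) is 0.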

answered Nov 27, 2014 at 10:27

rand


Birhanu · Accepted Answer · 2022-12-12 02:49:15Z

0

Why not m = (l - r) / 2, since we do not need the already traversed indexes from the start up to the current left?

answered Dec 12, 2022 at 2:49

Birhanu


1 Comment

Luciana Oliveira Over a year ago

As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.

yupe · Accepted Answer · 2024-06-24 15:20:31Z

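The earlier answers have already explained the question itself clearly. But when I tried to work out what actually happens at the machine level, I found something interesting.

The new question is: what happens when the code mid = left + right - left runs? Is the addition done first and then the subtraction? If so, does the intermediate sum overflow, and is the result affected?

The answer: whether the addition is done before the subtraction depends on the compiler. If it is, the intermediate sum does overflow, but in the tests below the final result is not affected, because the wrapped sum wraps back when left is subtracted. (Strictly speaking, signed integer overflow is undefined behavior in C, so this is what these compilers happen to do rather than a guarantee.)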

Test Code 1:

int square() {
    int mid, left = 2147483647, right = 2147483647;
    mid = left + right - left;
    return mid;
}

Compiled with x86-64 Clang 18.1.0:

square:                                 # @square
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 8], 2147483647
        mov     dword ptr [rbp - 12], 2147483647
        mov     eax, dword ptr [rbp - 8]
        add     eax, dword ptr [rbp - 12] # add first (eax = -2)
        sub     eax, dword ptr [rbp - 8]  # sub second (eax = 2147483647) 
        mov     dword ptr [rbp - 4], eax
        mov     eax, dword ptr [rbp - 4]
        pop     rbp
        ret

Compiled with x86-64 gcc 14.1:

square:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 2147483647
        mov     DWORD PTR [rbp-8], 2147483647
        mov     eax, DWORD PTR [rbp-8]
        mov     DWORD PTR [rbp-12], eax
        mov     eax, DWORD PTR [rbp-12]  # it doesn't do the arithmetic at all (folded away)
        pop     rbp
        ret

Compiled with loongarch64 gcc 14.1.0:

square:
        addi.d  $r3,$r3,-32
        st.d    $r22,$r3,24
        addi.d  $r22,$r3,32
        lu12i.w $r12,2147479552>>12                 # 0x7ffff000
        ori     $r12,$r12,4095
        st.w    $r12,$r22,-20
        lu12i.w $r12,2147479552>>12                 # 0x7ffff000
        ori     $r12,$r12,4095
        st.w    $r12,$r22,-24
        ld.w    $r12,$r22,-24   # first
        st.w    $r12,$r22,-28   # second (same as x86-64 gcc: the arithmetic is folded away)
        ldptr.w $r12,$r22,-28
        or      $r4,$r12,$r0
        ld.d    $r22,$r3,24
        addi.d  $r3,$r3,32
        jr      $r1

So the conclusion is that although an intermediate step may overflow, the final result is not affected. (Note: don't confuse this with left + right on its own, whose overflowed value is used directly and can indeed crash your program.)

Back to left+(right-left)/2: according to the assembly code produced by clang, it does (right-left) first, then the division /, and finally the addition +.

Test Code 2:

int square() {
    int mid, left = 2147483647, right = 2147483647;
    mid =  left + (right - left) / 2 ;
    return mid;
}

square:                                 # @square
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 8], 2147483647
        mov     dword ptr [rbp - 12], 2147483647
        mov     eax, dword ptr [rbp - 8]
        mov     dword ptr [rbp - 16], eax # 4-byte Spill
        mov     eax, dword ptr [rbp - 12]
        sub     eax, dword ptr [rbp - 8] # "sub first"
        mov     ecx, 2
        cdq
        idiv    ecx # "division second"
        mov     ecx, eax
        mov     eax, dword ptr [rbp - 16] # 4-byte Reload
        add     eax, ecx # "add last"
        mov     dword ptr [rbp - 4], eax
        mov     eax, dword ptr [rbp - 4]
        pop     rbp
        ret

Disclaimer: the assembly code is from Compilers, the answer is just for fun.
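Instead of reading the assembly, the overflow can also be detected directly. The sketch below relies on __builtin_add_overflow and __builtin_sub_overflow, which are GCC and Clang extensions rather than standard C, and the function names are hypothetical. Each builtin returns true when the exact result does not fit in the destination type.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if left + right does not fit in an int. */
bool naive_sum_overflows(int left, int right)
{
    int sum;
    return __builtin_add_overflow(left, right, &sum);
}

/* True if any step of left + (right - left) / 2 does not fit in an int. */
bool safe_form_overflows(int left, int right)
{
    int gap, mid;
    if (__builtin_sub_overflow(right, left, &gap))
        return true;
    return __builtin_add_overflow(left, gap / 2, &mid);
}

/* The values used in Test Code 1 and Test Code 2. */
void check_int_max_case(void)
{
    assert(naive_sum_overflows(INT_MAX, INT_MAX));
    assert(!safe_form_overflows(INT_MAX, INT_MAX));
}
```

Note that safe_form_overflows(INT_MIN, INT_MAX) is true, because right - left itself overflows there. The safe form depends on left and right being non-negative indices with left <= right.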
