2

As part of a test I was given, I am required to code the instructions below:

A function F(x) is defined over integers x >= 0 as:

F(x) = ∑ [(x | i) - (x & i)] (summation ∑ is over i),

with i satisfying the following two constraints:

a. 0 <= i <= x

and

b. [(x | i) - (x & i)] = x - i

Given an integer K, the task is to determine the number of non-negative integers x such that F(x) <= K.

As an example, for K = 6, the answer is 5. The approach is the following:

  • x=0: F(0) = 0, as (0|0 - 0&0) = 0. F(0) <= K --> increment answer += 1, ans=1
  • x=1: F(1) = (1|0 - 1&0) + (1|1 - 1&1) = 1 + 0 = 1 <= K -> ans=2
  • x=2: F(2) = (2|0 - 2&0) + (2|1 - 2&1) + (2|2 - 2&2) = 2 + [3-0 != 2-1 2nd constrain on x not respected] + 0 = 2 -> ans=3
  • x=3: F(3) = 6 -> ans=4
  • x=4: F(4) = 4 -> ans=5
  • x>4: F(x) > K so ans stays =5.

So the answer for K=6 is ans = 5: there are 5 non-negative integers that respect the condition F(x) <= K = 6.

with the constraint that K be < 10^(18).

This is what I came up with, adhering with the maths of the problem as much as possible:

#include <iostream>

uint64_t computeConstraint(uint64_t n, uint64_t i) {
  return (n | i) - (n & i);
}

uint64_t computeFunction(uint64_t n) {
  uint64_t result{};
  for (uint64_t i=0; i<=n; ++i) {
    if (computeConstraint(n, i) == n - i) {
      result += n - i;
    }
  }

  return result;
}

uint64_t countValidF(uint64_t K) {
  uint64_t counter{};
  for (uint64_t n=0; n<=K; ++n) {
    if (computeFunction(n) <= K) {
      ++counter;
    }
  }

  return counter;
}

int main() {
    uint64_t K;
    std::cin >> K;
    uint64_t ans{countValidF(K)};
    std::cout << ans << '\n';
}

Godbolt link.

Now, I am given some bugged code which is purportedly a faster alternative. My problem is that I fail to identify on what algorithm this is based, so I am unable to correct it.

The task of the test was to correct the following code, in order for it to yield an implementation that can manage large (10^18) values of K faster than the brute implementation I present in the first code snippet.

The code:

#include <iostream>

uint64_t nCr[60][60];

uint64_t count(uint64_t n, uint64_t cnt, int possible_msb_pos) {
    if (n==0) {
        if (cnt==0) {
            return 1;
        }
        return 0;
    }

    uint64_t msb_pos = 0;
    for (int i=possible_msb_pos; i>0; --i) {
        if ((n & (1LL << i))) {
            msb_pos = i;
            break;
        }
    }

    if (cnt < 0) {
        return 0;
    }
    if (msb_pos < cnt) {
        return 0;
    }
    uint64_t ans = 0;
    ans += nCr[msb_pos-1][cnt];
    ans += count(n | ((1LL << (msb_pos - 1))), cnt - 1, msb_pos);

    return ans;
}


int main() {
    for (int i=0; i<60; ++i) { 
        nCr[i][0] = 1;
    }
    for (int i=1; i<60; ++i) { 
        for (int j=1; j<60; ++j) { 
            nCr[i][j] = nCr[i-1][j] + nCr[i-1][j-1];
        }
    }

    uint64_t K;
    std::cin >> K;
    uint64_t ans = 1;
    for (int bit_count=1; bit_count<60; ++bit_count) {
        ans += count( K/(1LL << (bit_count - 1)), bit_count, 60 );
    }
    std::cout << ans << '\n';
}

Godbolt link.

This second program yields an output of 3 if fed with K=6, while the first snippet of code correctly yields 5 for the same input.

It seems that the latter program calculates a table of binomial coefficients in array nCr using Pascal's triangle. There is also a fair amount of bitshifting going on. However despite these observations, I have been unable so far to discover the algorithm behind it. Hence I do not really know where to look when trying to debug it.

My hope here is if anyone versed in bitwise computations, or who might have experience with this specific problem, may be able to explain the logic behind the second code snippet I was given.

28
  • 4
    Please don't post images of text. Post actual text instead. Images can't be searched, cause problems for blind people using screen readers, can be hard to read on small screens, etc, etc. Commented Jun 12 at 11:08
  • 6
    using li = unsigned long long int; that's an immediate red flag with respect to the (lack of) quality of the source that is teaching you C++. Anyway you should use a debugger first and confirm line by line that each line of code is actually doing what you think it is doing. Commented Jun 12 at 11:10
  • 3
    fwiw, a much better shorthand for unsigned long long int is uint64_t. Commented Jun 12 at 12:16
  • 1
    I got it. It's a condition, not a helpful information. Commented Jun 12 at 12:33
  • 1
    Computer Science SE is also a good Stack Exchange for algorithm problems. Strange. I also didn't see the answer. I had to reload the page. Commented Jun 12 at 13:08

1 Answer 1

6

This is a math problem more than a programming problem.

  • First of all you should notice that (x | i) - (x & i) is just XOR and it's equal to x - i if and only if i is a bitwise subset of x.
  • The sum of such i can be reformulated into: if the b-th bit of x is set and the number of set bits (popcount) of x is k, then 2^b is added 2^(k-1) times, once for each i where the b-th bit and any of the remaining 2^(k-1) bits are set.
  • Since the sum of 2^b over set bits of x is just x, that gives F(x) = 2^(k-1) * x. This holds for the special case x = 0 too.

This formula should make the outermost loop of the code you were given obvious. It iterates over k = bit_count and count()s integers x with exactly k set bits and x <= K / (2^(k-1)).

There's a standard algorithm for handling digit-wise (e.g. bitwise) functions over a range of integers [0, N): for each digit of N (let's denote the d-th digit by N[d]), consider separately the range of integers i where i[d] < N[d], all more significant digits are the same as in N and all less significant digits may have any values. In the bitwise case, the subproblem count() can be simplified into:

For each set bit b of N = K / (2^(k-1)) + 1, count integers in the form ((N >> (b+1)) << (b+1)) + i with i < 2^b and exactly k set bits, i.e. those where the bottom b bits may be arbitrarily chosen, the next bit is 0 and the rest is equal to N.

This can be further simplified into: find the number of ways to choose b bits so that exactly k - popcount(N >> (b+1)) are set. That's the binomial coefficient C(b, k - popcount(N >> (b+1))).

The code you were given does this recursively and in a somewhat more complicated way but in the end it computes this kind of sum of binomial coefficients.

Sign up to request clarification or add additional context in comments.

15 Comments

You jumped from a sum given in a question to a sum of i without any explanation. It is worth noting that set of all x - i for a given x is the same as set of i for the same x, so the sums are the same. Or maybe the second point can be reformulated for a sum of x - i (same explanation stands).
still reading, didn't pop in my messages you had answered this nearly one hour ago ...
Help me out: a concrete example about your second bullet point - let's take eleven 11 = 1011 in binary. It has set bits k=3, and let's pick b=1 the set index for 2. The 2^b = 2 is added 2^(k-1)=4 times, in what sense? then I lose you
I misclicked publish while typing out the answer :P. For F(11) = (11 xor 0)+(11 xor 1)+..., the terms where b=1-st bit contributes 2 to the sum are (11 xor 0), (11 xor 1), (11 xor 8) and (11 xor 9). In the other 4 terms i.e. (11 xor 2), (11 xor 3), (11 xor 10), (11 xor 11) it doesn't contribute 2. As Andrey says, sum x-i is the same as sum i in this case - thanks to the condition that XOR(x,i) = x-i. Detailed steps would be rewriting as sum_i sum_b, reordering sums and simplifying the inner sum.
Because i=4 and i=5 don't satisfy the condition you gave.
I get it now, 4 and 5 are not binary subsets of 11, they are set on the 2nd index while 11 is not
I don't get it. Should probably search for a treatment of this with all equations lined up and maybe some example. Does this algorithm for handling bitwise functions have a name, so I can go and look it up somewhere?
I don't think it has a name, it's just too simple and widespread. Write down integers smaller than N in binary and notice that they all look like (e.g. N=1011 as before) in this order: 0xxx, 100x, 1010. Copied prefix, 1 changed to 0 and remaining bits freely chosen.
let's go through the second snippet of code with a concrete example. For instance let's say K=6. Then outer loop at the end of main(), calls function count() with argument_1 = (6, 3, 1, 0, 0, 0) argument_2 = (1, 2, 3, 4, 5, 6) argument_3 = (60,60,60,60,60,60) Following the first part of your answer, argument_1 should correspond to x in the problem statement. Argument_2 is k in your explanation, number of bits set in x. Argument_3 is the starting point for a search of the most significant bit in x. Then I am lost both regarding your answer, and the code I reported
No, argument 1 corresponds to N-1 in my terminology. Think about it, a function that counts x can't take x as an argument. Anyway nothing I can do about a general "I don't understand", at this point it's basic math so it'd be better if you went over it with your classmates or instructor. Note that the code you were given has several implementation bugs near the end of count(), but it's at least trying to follow the idea I described.
a big improvement to your explanation would result, if you included a step-by-step concrete example of the algorithm on some number, K=6 or K=11, say. I understand the code until the recursion inside the count() function, which is buggy on purpose. Within that recursion line, I think n should be left as is, and recursion should be on all possible number of bits set so cnt, cnt+1, etc. n should not be shifted yet (it already gets decremented in the outer loop in main())
If it's about the code you were given, you can find that by running it with specific values so there's no point in me typing it out. If it's about the description I wrote, that's not tied to a specific implementation, but a general idea, so there are no values of variables to give. You asked about the idea behind so I explained that - but if detailed, absolutely elementary step-by-step debugging of specific code is what you need now, then you have all the tools (pen, paper, code editor, terminal) to do that work yourself. It'll be good for you.
I don't need a step-by-step debugging guide. A concrete example would benefit your answer, which left as is, is very convoluted and cryptic. You write, e.g. "For each set bit b of N = K / (2^(k-1)) + 1, count integers in the form ((N >> (b+1)) << (b+1)) + i with i < 2^b and exactly k set bits, i.e. those where the bottom b bits may be arbitrarily chosen, the next bit is 0 and the rest is equal to N." Here you don't explain why you introduce N, you just define it (as the upper boundary for x, +1 it seems?). Instead of clarifying things, this further obfuscates them.
I mention N before as a generic term for upper bound of a range to split. My answer would surely be clearer if you paid attention but that's not something I can affect.
Then you link N (one paragraph before the one I cited) to x <= K / (2^(k-1)) (two paragraphs before the one I cited) and the +1 is from the fact that you exclude N from the range (one paragraph before the one I cited). Why don't you include it? You see how confusing this gets, because of the glaring omissions in your explanation?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.