Convert decimal range to Numpy array, with each bit being an array element

Question

I have created a small function that takes as input an integer, length, and returns a numpy array of the binary representation of all 2**length integer numbers in the range [0:2**length-1].

import numpy as np

def get_bitstrings(length):
  # We need to binary-fy 2^length numbers.
  iterations = 2**length
  # Pre-allocate memory.
  the_array = np.zeros((iterations, length))
  # Go through all decimals in the range [0:iterations-1]
  for num in range(iterations):
    # Get binary representation in string format with 'length' zeroes padded
    t_string = '{f_num:0{f_width}b}'.format(f_num=num, f_width=length)
    # Convert to a Python list
    t_list   = list(t_string)
    # Convert to Numpy array and store.
    the_array[num,:] = np.array(t_list)

  return the_array

if __name__ == '__main__':
  var1 = get_bitstrings(2)
  var2 = get_bitstrings(3)
  print('var1:\n{}\n'.format(var1))
  print('var2:\n{}\n'.format(var2))

which yields:

var1:
[[ 0.  0.]
 [ 0.  1.]
 [ 1.  0.]
 [ 1.  1.]]

var2:
[[ 0.  0.  0.]
 [ 0.  0.  1.]
 [ 0.  1.  0.]
 [ 0.  1.  1.]
 [ 1.  0.  0.]
 [ 1.  0.  1.]
 [ 1.  1.  0.]
 [ 1.  1.  1.]]

The process includes getting a binary representation of each integer number as a string (with 0s padded before it such that the length is constant at length), converting the string into a Python list, and then converting the list into a numpy array.

I found that to be the only way to satisfy the requirement that each bit is an entry in the array – i.e., bitstring 1010 is a 1x4 numpy array and not simply an integer in a 1x1 array. But I am sure there are better alternatives, hence the question.

The problem, as you can imagine, is that this is inefficient. I was wondering whether I can improve this by using Python/Numpy trickery.

Edit: I used to do this in MATLAB with this snippet:

t_length = 5; dc = [0:2^t_length-1]'; bc = rem(floor(dc*pow2(-(t_length-1):0)),2);

But I am a complete noob when it comes to Python/Numpy! Maybe it'll inspire someone. :-)

Possibly you could work something out with strides but I think even bools are represented as 8 bits in numpy... — deinonychusaur
– deinonychusaur, Commented Feb 20, 2014 at 20:13
np.arange(2 ** k) will give an actual "bitstring" (i.e. np.int) with the required binary value. What do you need this for? — Fred Foo
– Fred Foo, Commented Feb 20, 2014 at 20:41
@deinonychusaur I need the elements to be integers or floatz, not booleans. — sudosensei
– sudosensei, Commented Feb 20, 2014 at 20:49
@Iarsmans I am not sure what you mean. I need the binary representation of decimal numbers in this specific format because I'm doing linear algebra with bit-strings -- it has to do with evolutionary computation. — sudosensei
– sudosensei, Commented Feb 20, 2014 at 20:52
Have a look at numpy.binary_repr. E.g. [np.binary_repr(item, width=5) for item in range(2**5)]. This will give you strings, but you can convert to an array of ints quite easily. — Joe Kington
– Joe Kington, Commented Feb 20, 2014 at 20:55

Mark Dickinson · Accepted Answer · 2014-02-20 22:05:51Z

You can use NumPy's broadcasting and vectorized operations to do this fairly efficiently:

>>> from numpy import arange, newaxis
>>> powers_of_two = 2**arange(4)[::-1]
>>> (arange(2**4)[:, newaxis] & powers_of_two) / powers_of_two
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 1, 1, 0],
       [0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 1, 1, 1]])

Brief explanation: we're taking all the integers from 0 to 15 (arange(2**4)), then reshaping that to given an array of shape (16, 1) (that's the [:, newaxis] slice part). Then we take the bitwise-and with powers of two, from highest to lowest (2**arange(4)[::-1]). The reshaping ensures that the bitwise and operation is performed as a sort of 'outer' operation: we take the bitwise and of every element of the original arange with every element of the powers_of_two array. This is NumPy's broadcasting and slicing at work. The absence of an explicit Python-level for loop should make this significantly faster than a solution based on for loops or list comprehensions.

Here's a somewhat sleeker, and as it turns out, faster, alternative along the same lines:

>>> from numpy import arange, newaxis
>>> arange(2**4)[:,newaxis] >> arange(4)[::-1] & 1
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 1, 1, 0],
       [0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 1, 1, 1]])

As always, if efficiency is a concern then you should make good use of the tools that Python provides in the form of the timeit and profile modules. Timings on my machine with length=16 seem to indicate that the second variant is significantly faster than the first:

taniyama:~ mdickinson$ python -m timeit -s "from numpy import arange, newaxis" "arange(1<<16)[:, newaxis] >> arange(16)[::-1] & 1"
100 loops, best of 3: 4.08 msec per loop
taniyama:~ mdickinson$ python -m timeit -s "from numpy import arange, newaxis" "(arange(1<<16)[:, newaxis] & 2**arange(16)[::-1]) / 2**arange(16)[::-1]"
10 loops, best of 3: 21.6 msec per loop

Wow! This is incredibly fast and elegant. Thanks a lot, Mark!

Joe Kington · Accepted Answer · 2014-02-20 21:11:50Z

1

One way is to use numpy.binary_repr. It will result in a string, but you can easily convert that to an array of ints or floats (just change the dtype argument). For example:

import numpy as np

k = 4
print np.array([list(np.binary_repr(x, k)) for x in range(2**k)], dtype=int)

This yields:

Or, if you wanted a more readable version:

def bitstrings(k):
    binary = [np.binary_repr(item, width=k) for item in range(2**k)]
    return np.array([list(item) for item in binary], dtype=int)

edited Feb 20, 2014 at 21:11

answered Feb 20, 2014 at 20:59

Joe Kington

287k73 gold badges621 silver badges474 bronze badges

2 Comments

sudosensei Over a year ago

Thanks for the help, Joe, but I would rather avoid a purely iterated solution because that is what I was doing in the first place. Mark's answer comes closer to what I was asking.

Joe Kington Over a year ago

Yeah, Mark's answer is quite slick!

Collectives™ on Stack Overflow

Convert decimal range to Numpy array, with each bit being an array element

2 Answers 2

1 Comment

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

1 Comment

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related