How to create an array of binary digits of given unsigned integer numbers with Numpy?

Question

I have an array of numbers between 0 and 3 and I want to create a 2D array of their binary digits.

In the future I may need arrays of numbers between 0 and 7, or between 0 and 15.

Currently my array is defined like this:

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

I used NumPy's unpackbits function:

b = np.unpackbits(a, axis=1)

and the result is this:

array([[0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 1, 1]], dtype=uint8)

As you can see, it created a 2D array with 8 columns, while I'm looking for a 2D array with 2 columns.

Here is my desired array:

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

Is this related to the uint8 data type?

What do you suggest?

Can't you just take a slice of b? numpy.org/doc/1.18/user/…
– wwii, Commented Mar 20, 2020 at 16:36

norok2 · Accepted Answer · 2020-03-21 12:20:33Z

One way to approach the problem is to adapt your b to the desired output with simple slicing, similar to what is suggested in @GrzegorzSkibinski's answer:

import numpy as np


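def gen_bits_by_val(values):
    # values: 2D uint8 array of shape (N, 1). Keep at least one column so an
    # all-zero input does not turn the slice into [:, -0:] (all 8 columns).
    n = max(1, int(values.max()).bit_length())
    return np.unpackbits(values, axis=1)[:, -n:].copy()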


print(gen_bits_by_val(a))
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

Alternatively, you could create a look-up table, similar to what is suggested in @WarrenWeckesser's answer, using the following:

import numpy as np


def gen_bits_by_num(n):
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


bits2 = gen_bits_by_num(2)
print(bits2)
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

which supports the kinds of use shown in that answer, e.g.:

bits4 = gen_bits_by_num(4)
print(bits4[[1, 3, 12]])
# [[0 0 0 1]
#  [0 0 1 1]
#  [1 1 0 0]]

EDIT

Following @PaulPanzer's answer, the line:

return np.unpackbits(values, axis=1)[:, -n:]

has been replaced with:

return np.unpackbits(values, axis=1)[:, -n:].copy()

which is more memory efficient, because the copy does not keep the full 8-column array alive.

It could have been replaced with:

return np.unpackbits(values << (8 - n), axis=1, count=n)

with similar effects.
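As a quick sanity check of the claim above, the sketch below compares the slice-and-copy variant with the shift-and-count variant for every n from 1 to 8. The helper names bits_by_slice and bits_by_count are made up for this illustration, and it assumes a NumPy version in which unpackbits accepts the count keyword.

```python
import numpy as np


def bits_by_slice(n):
    # Unpack all 8 bits, then keep the n least significant columns.
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


def bits_by_count(n):
    # Move the n wanted bits to the top of each byte, then unpack only n bits.
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    return np.unpackbits(values << (8 - n), axis=1, count=n)


# Both variants should agree for every width that fits in a uint8.
same = all(
    np.array_equal(bits_by_slice(n), bits_by_count(n)) for n in range(1, 9)
)
print(same)
```

The two differ only in how much memory is touched along the way: the first allocates 8 columns and copies n of them, the second allocates n columns directly.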

Paul Panzer · Accepted Answer · 2020-03-20 21:16:53Z

You can use the count keyword of unpackbits. It keeps only the first count bits along the axis, i.e. it cuts from the right, so you also have to shift the wanted bits to the left before applying unpackbits.

b = np.unpackbits(a<<6, axis=1, count=2)
b
# array([[0, 0],
#        [0, 1],
#        [1, 0],
#        [1, 1]], dtype=uint8)

This produces a "clean" array:

b.flags
#  C_CONTIGUOUS : True
#  F_CONTIGUOUS : False
#  OWNDATA : True
#  WRITEABLE : True
#  ALIGNED : True
#  WRITEBACKIFCOPY : False
#  UPDATEIFCOPY : False

In contrast, slicing the full 8-column output of unpackbits is in a sense a memory leak because the discarded columns will stay in memory as long as the slice lives.
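The difference can be made visible with a small experiment. This is only a sketch using the array a from the question; the variable names full, view and clean are chosen here for illustration.

```python
import numpy as np

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

full = np.unpackbits(a, axis=1)                  # all 8 columns, shape (4, 8)
view = full[:, -2:]                              # a slice is a view into `full`
clean = np.unpackbits(a << 6, axis=1, count=2)   # only 2 columns are produced

# The slice shares its buffer with the 8-column array and does not own data.
print(np.shares_memory(view, full))
print(view.flags.owndata)

# Bytes held by the buffer behind each result.
print(full.nbytes, clean.nbytes)
```

As long as view is referenced, the whole buffer of full stays allocated, even though only a quarter of it is reachable through view.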

edited Mar 20, 2020 at 21:16

answered Mar 20, 2020 at 19:39

Georgina Skibinski · Accepted Answer · 2020-03-20 16:56:00Z

You can truncate b to keep only the columns from the first column that contains a 1 onward:

b = b[:, int(np.argwhere(b.max(axis=0) == 1)[0, 0]):]

answered Mar 20, 2020 at 16:56

1 Comment

norok2 Over a year ago

You could also use int(max(a)).bit_length() to determine where to cut, which is probably faster.

Warren Weckesser · Accepted Answer · 2020-03-20 17:15:57Z

For such a small number of bits, you can use a lookup table.

For example, here bits2 is an array with shape (4, 2) that holds the bits of the integers 0, 1, 2, and 3. Index bits2 with the values from a to get the bits:

In [43]: bits2 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

In [44]: a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

In [45]: bits2[a[:, 0]]
Out[45]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

This works fine for 3 or 4 bits, too:

In [46]: bits4 = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 0], [
    ...: 0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 0,
    ...:  1, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]])

In [47]: bits4
Out[47]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 1, 1, 0],
       [0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 1, 1, 1]])

In [48]: x = np.array([0, 1, 5, 14, 9, 8, 15])

In [49]: bits4[x]
Out[49]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 0],
       [1, 1, 1, 1]])
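If more than 8 bits are ever needed, unpackbits no longer applies directly, because it only accepts uint8 input. One possible way to build the same kind of lookup table for any width is plain shifting and masking. Note that this is a different technique from the hand-written table above, and make_bit_table is a hypothetical name used only for this sketch.

```python
import numpy as np


def make_bit_table(n):
    # Row i holds the n binary digits of i, most significant bit first.
    values = np.arange(2 ** n).reshape(-1, 1)   # column vector, shape (2**n, 1)
    shifts = np.arange(n - 1, -1, -1)           # n-1, ..., 1, 0
    # Broadcasting gives shape (2**n, n); the mask keeps one bit per cell.
    return ((values >> shifts) & 1).astype(np.uint8)


bits4 = make_bit_table(4)
x = np.array([0, 1, 5, 14, 9, 8, 15])
print(bits4[x])

# Widths beyond 8 bits work the same way.
bits10 = make_bit_table(10)
print(bits10[513])
```

The table has 2**n rows, so this only stays practical for modest n; for large widths it may be better to apply the same shift-and-mask expression to the data directly instead of building a table.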

answered Mar 20, 2020 at 17:15

3 Comments

norok2 Over a year ago

Sure, but it's always tedious to write one and np.arange() + np.unpackbits() + slicing is a good way of building one.

Warren Weckesser Over a year ago

"... it's always tedious..." Hmm... tedious? You do it once!

norok2 Over a year ago

Call me a weirdo but I prefer list(range(10)) over [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], even if I have to type the spelled-out version only once... or, using code from this question, I prefer to define bits4 via gen_bits_by_num() from here than the way you suggest defining bits4.
