
I have an array of numbers between 0 and 3 and I want to create a 2D array of their binary digits.

In the future I may need arrays of numbers between 0 and 7, or between 0 and 15.

Currently my array is defined like this:

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

I used NumPy's unpackbits function:

b = np.unpackbits(a, axis=1)

and the result is:

array([[0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 0, 1, 1]], dtype=uint8)

As you can see, it created a 2D array with 8 columns, while I'm looking for a 2D array with 2 columns.

Here is my desired array:

array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

Is this related to the uint8 data type?

How can I get the desired result?


4 Answers


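One way of approaching the problem is to adapt your b to match the desired output with simple slicing, similar to what @GrzegorzSkibinski's answer suggests:

import numpy as np

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)


def gen_bits_by_val(values):
    # Number of bits needed for the largest value. Keep at least 1, because
    # n == 0 would turn the slice into [:, -0:], which returns all 8 columns.
    n = max(int(values.max()).bit_length(), 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


print(gen_bits_by_val(a))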
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

Alternatively, you could create a look-up table, similar to what @WarrenWeckesser's answer suggests, using the following (valid for n up to 8, since the values are stored as uint8):

import numpy as np


def gen_bits_by_num(n):
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    return np.unpackbits(values, axis=1)[:, -n:].copy()


bits2 = gen_bits_by_num(2)
print(bits2)
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]

which supports the kind of indexing shown in that answer, e.g.:

bits4 = gen_bits_by_num(4)
print(bits4[[1, 3, 12]])
# [[0 0 0 1]
#  [0 0 1 1]
#  [1 1 0 0]]

EDIT

Following @PaulPanzer's answer, the line:

return np.unpackbits(values, axis=1)[:, -n:]

has been replaced with:

return np.unpackbits(values, axis=1)[:, -n:].copy()

which is more memory efficient, because the copy does not keep the full 8-column array alive.

Alternatively, it could have been replaced with:

return np.unpackbits(values << (8 - n), axis=1, count=n)

with a similar effect.
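To make the relation between the two variants concrete, here is a minimal sketch that checks they produce the same bits for every width from 1 to 8. The function names bits_by_slice and bits_by_shift are made up for this illustration and are not part of NumPy; the explicit cast back to uint8 after the shift is a precaution, not something the original code needed.

```python
import numpy as np


def bits_by_slice(values, n):
    # Unpack all 8 bits, keep the n least significant ones, then copy.
    return np.unpackbits(values, axis=1)[:, -n:].copy()


def bits_by_shift(values, n):
    # Move the n low bits to the top of each byte, then unpack only n bits.
    shifted = np.left_shift(values, 8 - n).astype(np.uint8)
    return np.unpackbits(shifted, axis=1, count=n)


# Compare both variants on every value that fits in n bits.
for n in range(1, 9):
    values = np.arange(2 ** n, dtype=np.uint8).reshape(-1, 1)
    assert np.array_equal(bits_by_slice(values, n), bits_by_shift(values, n))
```

Both variants return an array that is independent of the 8-column intermediate; the shift variant simply never allocates it.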


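You can use the count keyword (available since NumPy 1.17). It keeps only the first count bits of each byte, i.e. the most significant ones, so you have to shift the bits you want to the left before applying unpackbits.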

b = np.unpackbits(a<<6, axis=1, count=2)
b
# array([[0, 0],
#        [0, 1],
#        [1, 0],
#        [1, 1]], dtype=uint8)

This produces a "clean" array:

b.flags
#  C_CONTIGUOUS : True
#  F_CONTIGUOUS : False
#  OWNDATA : True
#  WRITEABLE : True
#  ALIGNED : True
#  WRITEBACKIFCOPY : False
#  UPDATEIFCOPY : False

In contrast, slicing the full 8-column output of unpackbits is in a sense a memory leak because the discarded columns will stay in memory as long as the slice lives.
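The difference can be inspected directly. The following is a rough sketch using the array a from the question; the variable names view, copied and counted are chosen here for illustration only.

```python
import numpy as np

a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

view = np.unpackbits(a, axis=1)[:, -2:]            # slice of the 8-column result
copied = view.copy()                               # independent 2-column array
counted = np.unpackbits(a << 6, axis=1, count=2)   # never builds 8 columns

# The slice does not own its data: it references the full unpacked array.
print(view.flags['OWNDATA'], copied.flags['OWNDATA'])

# Bytes kept alive through the view versus bytes held by the copy.
print(view.base.nbytes, copied.nbytes)
```

All three arrays hold the same values; they differ only in how much memory stays referenced behind them.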


You can truncate b so that it starts at the first column that contains a 1:

b = b[:, int(np.argwhere(b.max(axis=0) == 1)[0, 0]):]
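The one-liner above raises an IndexError when b contains no 1 at all, since argwhere then returns an empty array. Below is a small sketch of a more defensive version; trim_leading_zero_columns and its min_width parameter are hypothetical names introduced only for this example.

```python
import numpy as np


def trim_leading_zero_columns(bits, min_width=1):
    # Hypothetical helper: drop leading columns that are zero in every row,
    # but always keep at least `min_width` columns.
    used = bits.any(axis=0)                       # True where a column holds a 1
    first = int(np.argmax(used)) if used.any() else bits.shape[1]
    first = max(0, min(first, bits.shape[1] - min_width))
    return bits[:, first:].copy()


a = np.array([[0], [1], [2], [3]], dtype=np.uint8)
print(trim_leading_zero_columns(np.unpackbits(a, axis=1)))
# [[0 0]
#  [0 1]
#  [1 0]
#  [1 1]]
```

Note that the width depends on the data actually present: an array holding only 0 and 1 would be trimmed to a single column, so passing min_width explicitly is safer when a fixed number of bits is expected.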

1 Comment

You could also use int(a.max()).bit_length() to determine where to cut, which is probably faster.

For such a small number of bits, you can use a lookup table.

For example, here bits2 is an array with shape (4, 2) that holds the bits of the integers 0, 1, 2, and 3. Index bits2 with the values from a to get the bits:

In [43]: bits2 = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

In [44]: a = np.array([[0], [1], [2], [3]], dtype=np.uint8)

In [45]: bits2[a[:, 0]]
Out[45]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1]])

This works fine for 3 or 4 bits, too:

In [46]: bits4 = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 1, 1], [0, 1, 0, 0], [
    ...: 0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [1, 0,
    ...:  1, 1], [1, 1, 0, 0], [1, 1, 0, 1], [1, 1, 1, 0], [1, 1, 1, 1]])

In [47]: bits4
Out[47]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 0],
       [0, 0, 1, 1],
       [0, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 1, 1, 0],
       [0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 1],
       [1, 0, 1, 0],
       [1, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 1, 1, 1]])

In [48]: x = np.array([0, 1, 5, 14, 9, 8, 15])

In [49]: bits4[x]
Out[49]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 1, 0, 1],
       [1, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 0, 0],
       [1, 1, 1, 1]])

3 Comments

Sure, but it's always tedious to write one and np.arange() + np.unpackbits() + slicing is a good way of building one.
"... it's always tedious..." Hmm... tedious? You do it once!
Call me a weirdo but I prefer list(range(10)) over [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], even if I have to type the spelled-out version only once... or, using code from this question, I prefer to define bits4 via gen_bits_by_num() from here than the way you suggest defining bits4.
