
I am trying to convert a numpy integer array, let's say A=[3,5,2], into a numpy binary array in least-significant-bit-first order, with a fixed number of bits per element. That is, the outcome for a length of 6 should be as follows:

A' = [1 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0]

The first 6 values are for the first element of A, the next 6 are for the second element of A and the last 6 are for the last element of A.

My current solution is as follows:

np.multiply( np.delete( np.unpackbits( np.abs(A.astype(int)).view("uint8")).reshape(-1,8)[:,::-1].reshape(-1,64), np.s_[ln::],1).astype("float64").ravel(), np.repeat(np.sign(A), ln))

where ln is the required number of bits per element (in the example, it was 6).

Is there any faster way to do this?

Thanks in advance.

EDIT: I should have pointed this out before: A can also have negative values. For instance, if A=[-11,5] and ln=6, then the returned array should be:

A'=[-1 -1 0 -1 0 0 1 0 1 0 0 0]

Note that ln=6 is just an example. It could be even 60.
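
For checking any faster candidate against this requirement, a slow but explicit reference can help. The following is only a minimal sketch: the function name signed_bits_lsb_first is made up for illustration, and it assumes every value satisfies abs(value) < 2**ln.

```python
import numpy as np

def signed_bits_lsb_first(values, ln):
    """Slow reference: ln bits per value, LSB first, each set bit carrying the value's sign."""
    out = []
    for v in values:
        v = int(v)
        sign = (v > 0) - (v < 0)      # -1, 0 or 1
        magnitude = abs(v)
        for k in range(ln):
            # bit k of the magnitude, multiplied by the sign of the value
            out.append(sign * ((magnitude >> k) & 1))
    return np.array(out, dtype=np.int8)

print(signed_bits_lsb_first(np.array([3, 5, 2]), 6))
print(signed_bits_lsb_first(np.array([-11, 5]), 6))
```

The two calls reproduce the two examples given in the question, so the function can serve as a ground truth when comparing vectorized versions.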

Sorry for missing this part of the requirement.


3 Answers

2

Here's a vectorized one -

((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')
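
To make the one-liner easier to follow, here is the same expression split into named steps. The intermediate names (masks, hits, bits2d, flat) are only illustrative and are not part of the original answer.

```python
import numpy as np

A = np.array([3, 5, 2])
ln = 6

masks = 1 << np.arange(ln)        # one mask per bit position: 1, 2, 4, 8, 16, 32
hits = A[:, None] & masks         # broadcasting: shape (len(A), ln)
bits2d = hits != 0                # True where the corresponding bit is set
flat = bits2d.ravel().view('i1')  # reinterpret the 1-byte booleans as int8, no copy

print(bits2d.astype(int))
print(flat)
```

Each row of bits2d holds the bits of one element, least significant first, so flattening row by row gives the layout asked for in the question.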

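Another with np.unpackbits (as written, the slice is tied to ln = 6, and it assumes 64-bit little-endian integers with values in 0..255) -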

np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()

Sample run -

In [197]: A
Out[197]: array([3, 5, 2])

In [198]: ln = 6

In [199]: ((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')
Out[199]: array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=int8)

In [200]: np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()
Out[200]: array([1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=uint8)

Timings on a large array -

In [201]: A = np.random.randint(0,6,1000000)

In [202]: ln = 6

In [203]: %timeit ((A[:,None] & (1 << np.arange(ln)))!=0).ravel().view('i1')
10 loops, best of 3: 32.1 ms per loop

In [204]: %timeit np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1].ravel()
100 loops, best of 3: 8.14 ms per loop

If you are okay with a 2D array output, with each row holding the binary info for one element of the input, it's much faster -

In [205]: %timeit np.unpackbits(A.view(np.uint8)[::8]).reshape(-1,8)[:,ln-7:1:-1]
1000 loops, best of 3: 1.04 ms per loop
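
A plausible reason for the gap (this is an interpretation, not something measured in the timings above): the reversed column slice is only a view into the unpacked bits, while .ravel() on that non-contiguous view has to copy every element. A small sketch to check the view/copy behaviour, with the dtype pinned to '<i8' as an assumption so the byte layout is fixed -

```python
import numpy as np

A = np.array([3, 5, 2], dtype='<i8')   # assumption: 8-byte little-endian integers
ln = 6

unpacked = np.unpackbits(A.view(np.uint8)[::8]).reshape(-1, 8)
bits2d = unpacked[:, ln-7:1:-1]        # negative-step slice: a view, no data moved
flat = bits2d.ravel()                  # non-contiguous input, so ravel makes a copy

print(np.shares_memory(unpacked, bits2d))
print(np.shares_memory(unpacked, flat))
print(bits2d.flags['C_CONTIGUOUS'])
```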

Other posted approaches -

# @aburak's soln
In [206]: %timeit np.multiply( np.delete( np.unpackbits( np.abs(A.astype(int)).view("uint8")).reshape(-1,8)[:,::-1].reshape(-1,64), np.s_[ln::],1).astype("float64").ravel(), np.repeat(np.sign(A), ln))
10 loops, best of 3: 180 ms per loop

# @Jacques Gaudin's soln
In [208]: %timeit np.array([int(c) for i in A for c in np.binary_repr(i, width=6)[::-1]])
1 loop, best of 3: 3.34 s per loop

# @Paul Panzer's soln
In [209]: %timeit np.unpackbits(A[:, None].view(np.uint8)[..., ::-1] if sys.byteorder=='little' else A[:, None].view(np.uint8), axis=-1)[..., :-ln-1:-1].reshape(-1)
10 loops, best of 3: 35.4 ms per loop

What works most in favour of the second approach from this post is that the uint8 version of the input is simply a view into the input, and hence memory efficient -

In [238]: A
Out[238]: array([3, 5, 2])

In [239]: A.view(np.uint8)[::8]
Out[239]: array([3, 5, 2], dtype=uint8)

In [240]: np.shares_memory(A,A.view(np.uint8)[::8])
Out[240]: True

So, when we use np.unpackbits, we are feeding it the same number of elements as the original array.

Also, A.view(np.uint8)[::8] seems like a good trick to view an int dtype array as a uint8 one (as long as the values fit in one byte)!
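
Note that the [::8] step hard-codes 8-byte items stored little-endian. A hedged sketch of a more general version follows; low_byte_view is a hypothetical helper name, and it assumes a 1-D contiguous integer array whose values fit in one byte.

```python
import sys
import numpy as np

def low_byte_view(A):
    """View (no copy) of the least significant byte of each item of a 1-D integer array."""
    step = A.dtype.itemsize
    order = A.dtype.byteorder          # '=' native, '<' little, '>' big, '|' not applicable
    little = (order in ('<', '|')) or (order == '=' and sys.byteorder == 'little')
    start = 0 if little else step - 1  # the low byte comes first only when little-endian
    return A.view(np.uint8)[start::step]

A64 = np.array([3, 5, 2], dtype=np.int64)
A32 = np.array([3, 5, 2], dtype=np.int32)
A_be = np.array([3, 5, 2], dtype='>i8')

print(low_byte_view(A64), low_byte_view(A32), low_byte_view(A_be))
print(np.shares_memory(A64, low_byte_view(A64)))
```

Taking the step from A.dtype.itemsize avoids silently reading the wrong bytes when the default integer type is 4 bytes wide.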


To handle the generic case (negative values and larger ln), we could extend the approaches listed earlier.

Approach #1 (for ln up to 63) :

(((np.abs(A)[:,None] & (1 << np.arange(ln)))!=0)*np.sign(A)[:,None]).ravel()

Approach #2 :

a = np.abs(A)
m = ((ln-1)//8)+1
b = a.view(np.uint8).reshape(-1,8)[:,:m]
U = np.unpackbits(b,axis=1)
out = U.reshape(-1,m,8)[...,::-1].reshape(len(A),-1)[...,:ln]
out = (out*np.sign(A)[:,None]).ravel()
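
As a sanity check, the two generic approaches can be wrapped in functions and compared on random signed input. This is a sketch under assumptions: the function names are made up, np.arange is given an explicit int64 dtype, and the astype('<i8') call is an addition of mine so that the byte layout does not depend on the platform.

```python
import numpy as np

def approach1(A, ln):
    # Bit masks built as int64 so shifts up to 62 do not overflow a 32-bit default int.
    masks = 1 << np.arange(ln, dtype=np.int64)
    return (((np.abs(A)[:, None] & masks) != 0) * np.sign(A)[:, None]).ravel()

def approach2(A, ln):
    # Force 8-byte little-endian storage so byte 0 is the least significant byte.
    a = np.abs(A).astype('<i8', copy=False)
    m = ((ln - 1) // 8) + 1                      # number of bytes needed for ln bits
    b = a.view(np.uint8).reshape(-1, 8)[:, :m]   # keep only the m low bytes
    U = np.unpackbits(b, axis=1)                 # MSB-first within each byte
    out = U.reshape(-1, m, 8)[..., ::-1].reshape(len(A), -1)[..., :ln]
    return (out * np.sign(A)[:, None]).ravel()

rng = np.random.default_rng(0)
for ln in (6, 13, 60):
    A = rng.integers(-(2 ** (ln - 1)), 2 ** (ln - 1), size=1000, dtype=np.int64)
    assert np.array_equal(approach1(A, ln), approach2(A, ln))

print(approach2(np.array([-11, 5]), 6))
```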

2 Comments

Thanks for the benchmark, I didn't dare publishing the results 😖
I tried to extend your solution to the cases where we have ln>6 such as 60. I somehow got lost among the indices. Can you point out the solution for it?
2

You can do so by using binary_repr:

arr = np.array([3,5,2])
res = np.array([int(c) for i in arr for c in np.binary_repr(i, width=6)[::-1]])

>>> print(res)
[1 1 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0]

The [::-1] is a trick to iterate through the string in reverse order: the step of the iteration is set to -1. For more details refer to the extended slices docs.

Or with a format string (it starts to look like code golf though):

res = np.array([int(c) for i in arr for c in f'{i:06b}'[::-1]])

f'{i:06b}' is a string representing i in binary with 6 digits and leading zeros.
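
A short illustration of what the two string-based forms produce, including a variable width. The variable names are only for illustration. As far as I can tell, neither form yields the signed bits requested in the question's edit for negative inputs, so treat this as a sketch for non-negative values only.

```python
import numpy as np

i, ln = 5, 6

as_binary = f'{i:0{ln}b}'        # '000101' (nested field sets the width from ln)
reversed_bits = as_binary[::-1]  # '101000', least significant bit first
print(as_binary, reversed_bits, np.binary_repr(i, width=ln))

# Negative inputs: the format string keeps a minus sign,
# while np.binary_repr with a width switches to two's complement.
print(f'{-11:06b}')                  # '-01011'
print(np.binary_repr(-11, width=6))  # '110101'
```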

Speed-wise, this is very slow... Sorry I didn't get that bit of the question!


1

Maybe I'm ignorant of your solution's full power, but it seems to have a few nonessential ingredients.

Here is a streamlined version. It checks for endianness and should be good for up to 64 bits on typical platforms.

import sys
import numpy as np

A = np.arange(-2, 3)*((2**40)-1)
ln = 60

np.unpackbits(np.abs(A[..., None]).view(np.uint8)[..., ::-1] if sys.byteorder=='little' else np.abs(A[..., None]).view(np.uint8), axis=-1)[..., :-ln-1:-1].view(np.int8) * np.sign(A[:, None]).astype(np.int8)

Output

array([[ 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1, -1, -1, -1, -1,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]], dtype=int8)
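
On NumPy 1.17 or later, np.unpackbits also accepts count and bitorder arguments, which may make the explicit byte reversal and the endianness branch unnecessary. The following is a sketch under that version assumption; signed_bits is a hypothetical name, and the astype('<u8') call is what pins the byte order here.

```python
import numpy as np

def signed_bits(A, ln):
    """Rows of ln signed bits, LSB first; assumes abs(A) < 2**ln and ln <= 64."""
    A = np.asarray(A)
    mag = np.abs(A).astype('<u8')                 # magnitudes as little-endian uint64
    as_bytes = mag.view(np.uint8).reshape(-1, 8)  # 8 bytes per value, low byte first
    # bitorder='little' unpacks each byte LSB first; count keeps the first ln bits per row
    bits = np.unpackbits(as_bytes, axis=1, count=ln, bitorder='little')
    return bits.astype(np.int8) * np.sign(A)[:, None].astype(np.int8)

A = np.arange(-2, 3, dtype=np.int64) * ((2**40) - 1)
print(signed_bits(A, 60))
print(signed_bits([-11, 5], 6).ravel())
```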

1 Comment

That was my mistake. I should have said that there could be negative values in the array.
