5

Using the following code, I am trying to convert a list of numbers into binary numbers, but I am getting an error:

import numpy as np

lis=np.array([1,2,3,4,5,6,7,8,9])
a=np.binary_repr(lis,width=32)

The error after running the program is:

Traceback (most recent call last):

File "", line 4, in a=np.binary_repr(lis,width=32)

File "C:\Users.......", in binary_repr if num == 0:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Is there any way to fix this?

4
  • np.binary_repr works on a single value, not on an array-like object. Commented Sep 7, 2019 at 19:45
  • I know that, but is there any way to make it work on an array? Commented Sep 7, 2019 at 19:46
  • Or is there any code in Python whereby I can convert my whole array or list into 32-bit binary? Commented Sep 7, 2019 at 19:48
  • Are you looking to get a string output or something else? Commented Sep 7, 2019 at 20:04

5 Answers 5

4

You can use np.vectorize to overcome this issue.

>>> import numpy as np
>>> lis=np.array([1,2,3,4,5,6,7,8,9])
>>> binary_repr_vec = np.vectorize(np.binary_repr)
>>> binary_repr_vec(lis, width=32)
array(['00000000000000000000000000000001',
       '00000000000000000000000000000010',
       '00000000000000000000000000000011',
       '00000000000000000000000000000100',
       '00000000000000000000000000000101',
       '00000000000000000000000000000110',
       '00000000000000000000000000000111',
       '00000000000000000000000000001000',
       '00000000000000000000000000001001'], dtype='<U32')
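
Because np.vectorize applies the wrapped function elementwise, the same wrapper should also handle multi-dimensional input and keep its shape. A small sketch (the 2-D array and the width of 8 are chosen only for illustration):

```python
import numpy as np

binary_repr_vec = np.vectorize(np.binary_repr)

# Hypothetical 2-D input; the output keeps the (2, 2) shape.
grid = np.array([[1, 2], [3, 4]])
result = binary_repr_vec(grid, width=8)

print(result.shape)  # (2, 2)
print(result[1, 0])  # 00000011
```

Note that np.vectorize is provided for convenience rather than speed; internally it still calls np.binary_repr once per element.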

3

As the documentation on binary_repr says:

num : int

    Only an integer decimal number can be used.

You can, however, vectorize this operation:

np.vectorize(np.binary_repr)(lis, 32)

this then gives us:

>>> np.vectorize(np.binary_repr)(lis, 32)
array(['00000000000000000000000000000001',
       '00000000000000000000000000000010',
       '00000000000000000000000000000011',
       '00000000000000000000000000000100',
       '00000000000000000000000000000101',
       '00000000000000000000000000000110',
       '00000000000000000000000000000111',
       '00000000000000000000000000001000',
       '00000000000000000000000000001001'], dtype='<U32')

Or, if you need this often, you can store the vectorized variant in a variable:

binary_repr_vector = np.vectorize(np.binary_repr)
binary_repr_vector(lis, 32)

Which of course gives the same result:

>>> binary_repr_vector = np.vectorize(np.binary_repr)
>>> binary_repr_vector(lis, 32)
array(['00000000000000000000000000000001',
       '00000000000000000000000000000010',
       '00000000000000000000000000000011',
       '00000000000000000000000000000100',
       '00000000000000000000000000000101',
       '00000000000000000000000000000110',
       '00000000000000000000000000000111',
       '00000000000000000000000000001000',
       '00000000000000000000000000001001'], dtype='<U32')
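
One way to sanity-check the vectorized result is to parse each string back as a base-2 integer and compare with the input. This is only a sketch; the names `strings` and `round_trip` are made up for illustration:

```python
import numpy as np

lis = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
binary_repr_vector = np.vectorize(np.binary_repr)
strings = binary_repr_vector(lis, 32)

# Parse every 32-character string as a base-2 integer again.
round_trip = np.array([int(s, 2) for s in strings])

print(np.array_equal(round_trip, lis))  # True
```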

3

Approach #1

Here's a vectorized approach for an array of numbers that leverages broadcasting -

def binary_repr_ar(A, W):
    p = (((A[:,None] & (1 << np.arange(W-1,-1,-1)))!=0)).view('u1')
    return p.astype('S1').view('S'+str(W)).ravel()

Sample run -

In [67]: A
Out[67]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [68]: binary_repr_ar(A,32)
Out[68]: 
array(['00000000000000000000000000000001',
       '00000000000000000000000000000010',
       '00000000000000000000000000000011',
       '00000000000000000000000000000100',
       '00000000000000000000000000000101',
       '00000000000000000000000000000110',
       '00000000000000000000000000000111',
       '00000000000000000000000000001000',
       '00000000000000000000000000001001'], dtype='|S32')
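
To see what the broadcasting step does, here is a small walk-through with a width of 4. It is a sketch, not the function above: the intermediate names are made up, np.where is swapped in to build the ASCII digits, and the final astype decodes the byte strings to str (which is what you would likely want on Python 3, where the 'S' dtype holds bytes).

```python
import numpy as np

A = np.array([1, 2, 5])
W = 4

# One bit weight per output column, most significant first: [8, 4, 2, 1]
weights = 1 << np.arange(W - 1, -1, -1)

# Broadcasting (3, 1) against (4,) gives a (3, 4) table of "is this bit set?"
mask = (A[:, None] & weights) != 0

# Turn True/False into the ASCII codes of '1' and '0'
digits = np.where(mask, ord("1"), ord("0")).astype(np.uint8)

# Reinterpret each row of W bytes as one W-character byte string
as_bytes = digits.view("S" + str(W)).ravel()

# Decode to unicode strings
as_str = as_bytes.astype("U" + str(W))

print(weights.tolist())  # [8, 4, 2, 1]
print(as_str.tolist())   # ['0001', '0010', '0101']
```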

Approach #2

Another vectorized one with array-assignment -

def binary_repr_ar_v2(A, W):
    mask = (((A[:,None] & (1 << np.arange(W-1,-1,-1)))!=0))
    out = np.full((len(A),W),48, dtype=np.uint8)
    out[mask] = 49
    return out.view('S'+str(W)).ravel()

Alternatively, use the mask directly to get the string array -

def binary_repr_ar_v3(A, W):
    mask = (((A[:,None] & (1 << np.arange(W-1,-1,-1)))!=0))
    return (mask+np.array([48],dtype=np.uint8)).view('S'+str(W)).ravel()

Note that the final output is a view into one of the intermediate arrays. So, if you need it to have its own memory space, simply append .copy().
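
The view-versus-copy point can be checked with the OWNDATA flag. A short sketch using the second approach (the three-element input and width of 8 are arbitrary):

```python
import numpy as np

def binary_repr_ar_v2(A, W):
    mask = (A[:, None] & (1 << np.arange(W - 1, -1, -1))) != 0
    out = np.full((len(A), W), 48, dtype=np.uint8)
    out[mask] = 49
    return out.view('S' + str(W)).ravel()

res = binary_repr_ar_v2(np.array([1, 2, 3]), 8)
owned = res.copy()

print(res.flags['OWNDATA'])    # False: still backed by the intermediate buffer
print(owned.flags['OWNDATA'])  # True: independent memory
```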


Timings on a large input array -

In [49]: np.random.seed(0)
    ...: A = np.random.randint(1,1000,(100000))
    ...: W = 32

In [50]: %timeit binary_repr_ar(A, W)
    ...: %timeit binary_repr_ar_v2(A, W)
    ...: %timeit binary_repr_ar_v3(A, W)
1 loop, best of 3: 854 ms per loop
100 loops, best of 3: 14.5 ms per loop
100 loops, best of 3: 7.33 ms per loop

From other posted solutions -

In [22]: %timeit [np.binary_repr(i, width=32) for i in A]
10 loops, best of 3: 97.2 ms per loop

In [23]: %timeit np.frompyfunc(np.binary_repr,2,1)(A,32).astype('U32')
10 loops, best of 3: 80 ms per loop

In [24]: %timeit np.vectorize(np.binary_repr)(A, 32)
10 loops, best of 3: 69.8 ms per loop

On @Paul Panzer's solutions -

In [5]: %timeit bin_rep(A,32)
548 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [6]: %timeit bin_rep(A,31)
2.2 ms ± 5.55 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
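
Before comparing speeds, it is worth confirming that the fast version agrees with np.binary_repr. A minimal check, assuming non-negative inputs that fit in W bits (the array is smaller than the one used for the timings, just to keep the check quick):

```python
import numpy as np

def binary_repr_ar_v3(A, W):
    mask = (A[:, None] & (1 << np.arange(W - 1, -1, -1))) != 0
    return (mask + np.array([48], dtype=np.uint8)).view('S' + str(W)).ravel()

np.random.seed(0)
A = np.random.randint(1, 1000, 1000)
W = 32

# Reference result, one element at a time
expected = [np.binary_repr(int(i), width=W) for i in A]

# Vectorized result, decoded from bytes to str for comparison
got = binary_repr_ar_v3(A, W).astype('U' + str(W)).tolist()

print(got == expected)  # True
```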

1 Comment

Thx for the timings.
2

Here is a fast method using np.unpackbits:

(np.unpackbits(lis.astype('>u4').view(np.uint8))+ord('0')).view('S32')
# array([b'00000000000000000000000000000001',
#        b'00000000000000000000000000000010',
#        b'00000000000000000000000000000011',
#        b'00000000000000000000000000000100',
#        b'00000000000000000000000000000101',
#        b'00000000000000000000000000000110',
#        b'00000000000000000000000000000111',
#        b'00000000000000000000000000001000',
#        b'00000000000000000000000000001001'], dtype='|S32')
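
The one-liner can be unpacked into steps. This is a sketch for a single value; the intermediate names are made up, and np.uint8(ord('0')) is used so the addition clearly stays in uint8:

```python
import numpy as np

lis = np.array([9])

big_endian = lis.astype('>u4')        # 4 bytes per value, most significant byte first
as_bytes = big_endian.view(np.uint8)  # reinterpret the same memory as single bytes
bits = np.unpackbits(as_bytes)        # 32 values of 0 or 1, most significant bit first
chars = bits + np.uint8(ord('0'))     # ASCII codes 48 ('0') and 49 ('1')
result = chars.view('S32')            # every 32 bytes become one byte string

print(as_bytes.tolist())  # [0, 0, 0, 9]
print(result[0])          # b'00000000000000000000000000001001'
```

The big-endian cast matters: with the native little-endian layout on most machines, the bytes would come out in reverse order.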

More general:

def bin_rep(A,n):
    if n in (8,16,32,64):
        return (np.unpackbits(A.astype(f'>u{n>>3}').view(np.uint8))+ord('0')).view(f'S{n}')
    nb = max((n-1).bit_length()-3,0)
    return (np.unpackbits(A.astype(f'>u{1<<nb}')[...,None].view(np.uint8),axis=1)[...,-n:]+ord('0')).ravel().view(f'S{n}')
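
A usage sketch for a non-aligned width. The function is the one above, lightly reformatted; the input values and n = 5 are arbitrary. As far as I can tell, values that do not fit in n bits are silently truncated to their low n bits, so treat that as the caller's responsibility.

```python
import numpy as np

def bin_rep(A, n):
    if n in (8, 16, 32, 64):
        packed = A.astype(f'>u{n >> 3}').view(np.uint8)
        return (np.unpackbits(packed) + ord('0')).view(f'S{n}')
    # Smallest power-of-two byte count that can hold n bits
    nb = max((n - 1).bit_length() - 3, 0)
    packed = A.astype(f'>u{1 << nb}')[..., None].view(np.uint8)
    bits = np.unpackbits(packed, axis=1)[..., -n:]
    return (bits + ord('0')).ravel().view(f'S{n}')

out = bin_rep(np.array([1, 2, 9]), 5)
print(out.tolist())  # [b'00001', b'00010', b'01001']
```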

Note: special casing n = 8,16,32,64 is absolutely worth it since it gives a severalfold speedup for these numbers.

Also note that this method only handles values below 2^64; larger ints require a different approach.
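
For values beyond 64 bits, one possible fallback is plain Python, which has arbitrary-precision ints. This is a sketch of a different technique, not part of the unpackbits method; `bin_rep_big` is a made-up name and it assumes non-negative values:

```python
def bin_rep_big(values, n):
    """Zero-padded n-character binary strings for non-negative Python ints."""
    return [format(int(v), "0{}b".format(n)) for v in values]

big = [2**70 + 5, 1]
out = bin_rep_big(big, 80)

print(len(out[0]))             # 80
print(int(out[0], 2) == big[0])  # True
```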

4 Comments

Is it generalizable to generic window lengths? Are there restrictions to the extents of the number values?
@Divakar Well, it is with some work ;-) And obviously beyond uint64 it is getting hairy.
Looks promising and obviously a smart one keeping those aside.
@Divakar I added a general version. Didn't try to handle ints >= 2^64, though. It is quite a bit slower for non-aligned sizes but still pretty fast.
1
In [193]: alist = [1,2,3,4,5,6,7,8,9]                                                                        

np.vectorize is convenient, but not fast:

In [194]: np.vectorize(np.binary_repr)(alist, 32)                                                            
Out[194]: 
array(['00000000000000000000000000000001',
       '00000000000000000000000000000010',
       '00000000000000000000000000000011',
        ....
       '00000000000000000000000000001001'], dtype='<U32')
In [195]: timeit np.vectorize(np.binary_repr)(alist, 32)                                                     
71.8 µs ± 1.88 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

A plain old list comprehension is better:

In [196]: [np.binary_repr(i, width=32) for i in alist]                                                       
Out[196]: 
['00000000000000000000000000000001',
 '00000000000000000000000000000010',
 '00000000000000000000000000000011',
...
 '00000000000000000000000000001001']
In [197]: timeit [np.binary_repr(i, width=32) for i in alist]                                                
11.5 µs ± 181 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Another iterator:

In [200]: timeit np.frompyfunc(np.binary_repr,2,1)(alist,32).astype('U32')                                   
30.1 µs ± 1.79 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

1 Comment

For small arrays that is indeed the case. For larger arrays, like lis = np.arange(1000), the list comprehension usually takes about twice as much time, and frompyfunc takes approximately 10% more time. This is not that strange: numpy is not a good idea for small amounts of data and only pays off when you process data "in bulk".
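
To see how the comparison shifts with input size on your own machine, something like the following could be used. It is a rough sketch: `time_methods` is a made-up helper, the sizes and repeat count are arbitrary, the vectorized wrapper is built once outside the timed call, and the numbers will differ between machines and NumPy versions.

```python
import timeit
import numpy as np

def time_methods(n, repeats=20):
    """Return total seconds over `repeats` runs for each approach."""
    data = np.arange(n)
    vec = np.vectorize(np.binary_repr)
    return {
        "list comprehension": timeit.timeit(
            lambda: [np.binary_repr(i, width=32) for i in data], number=repeats),
        "np.vectorize": timeit.timeit(
            lambda: vec(data, 32), number=repeats),
    }

for n in (9, 1000):
    print(n, time_methods(n))
```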
