4

I have a value of type bytes that needs to be converted to a BIT STRING.

bytes_val = (b'\x80\x00', 14)

The bytes at index zero need to be converted to a bit string whose length is given by the second element (14 in this case), formatted in groups of 8 bits as shown below.

expected output => '10000000 000000'B

Another example

bytes_val2 = (b'\xff\xff\xff\xff\xf0\x00', 45) #=> '11111111 11111111 11111111 11111111 11110000 00000'B
6
  • …and the 14…? Commented Mar 7, 2020 at 15:40
  • Does this answer your question? Whats a simple way to turn bytes into a binary string? Commented Mar 7, 2020 at 15:42
  • Does this answer your question? Convert string to binary in python Commented Mar 7, 2020 at 15:50
  • @Błotosmętek - 14 is length of the expected bit string !! Commented Mar 8, 2020 at 6:51
  • Should (b'\x80\x01', 14) also produce '10000000 000000'B? Commented Apr 6, 2020 at 8:39

7 Answers

6
+50

What about some combination of formatting (below with f-string but can be done otherwise), and slicing:

def bytes2binstr(b, n=None):
    s = ' '.join(f'{x:08b}' for x in b)
    return s if n is None else s[:n + n // 8 + (0 if n % 8 else -1)]

If I understood correctly (I am not sure what the B at the end is supposed to mean), it passes your tests and a couple more:

func = bytes2binstr
args = (
    (b'\x80\x00', None),
    (b'\x80\x00', 14),
    (b'\x0f\x00', 14),
    (b'\xff\xff\xff\xff\xf0\x00', 16),
    (b'\xff\xff\xff\xff\xf0\x00', 22),
    (b'\x0f\xff\xff\xff\xf0\x00', 45),
    (b'\xff\xff\xff\xff\xf0\x00', 45),
)
for arg in args:
    print(arg)
    print(repr(func(*arg)))
# (b'\x80\x00', None)
# '10000000 00000000'
# (b'\x80\x00', 14)
# '10000000 000000'
# (b'\x0f\x00', 14)
# '00001111 000000'
# (b'\xff\xff\xff\xff\xf0\x00', 16)
# '11111111 11111111'
# (b'\xff\xff\xff\xff\xf0\x00', 22)
# '11111111 11111111 111111'
# (b'\x0f\xff\xff\xff\xf0\x00', 45)
# '00001111 11111111 11111111 11111111 11110000 00000'
# (b'\xff\xff\xff\xff\xf0\x00', 45)
# '11111111 11111111 11111111 11111111 11110000 00000'

Explanation

  • we start from a bytes object
  • iterating through it gives us a single byte as a number
  • each byte is 8 bits, so formatting byte by byte already gives us the correct separation
  • each byte is formatted using the b binary specifier, with some additional formatting: 0 zero fill, 8 minimum length
  • we join (concatenate) the result of the formatting using ' ' as "separator"
  • finally the result is returned as is if a maximum number of bits n was not specified (set to None), otherwise the result is cropped to n + the number of spaces that were added in-between the 8-character groups.
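
The steps listed above can be traced one at a time. This is a minimal sketch using the sample input from the question; the variable names are only illustrative, and `(n - 1) // 8` is an equivalent way (for n >= 1) of counting the spaces that the slicing expression in `bytes2binstr` accounts for.

```python
data = b'\x80\x00'   # sample input from the question
n = 14               # number of bits to keep

# Iterating over a bytes object yields integers in range(256).
values = list(data)                       # [128, 0]

# Format each integer as binary, zero-filled to a width of 8.
groups = [f'{x:08b}' for x in values]     # ['10000000', '00000000']

# Join the 8-character groups with a single space.
joined = ' '.join(groups)                 # '10000000 00000000'

# Keep n bits plus one space per complete group before the last bit.
spaces = (n - 1) // 8
result = joined[:n + spaces]
print(result)                             # 10000000 000000
```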

In the solution above, the group size 8 is hard-coded. If you want it to be a parameter, you may want to look into (possibly a variation of) @kederrac's first answer using int.from_bytes(). This could look something like:

def bytes2binstr_frombytes(b, n=None, k=8):
    s = '{x:0{m}b}'.format(m=len(b) * 8, x=int.from_bytes(b, byteorder='big'))[:n]
    return ' '.join([s[i:i + k] for i in range(0, len(s), k)])

which gives the same output as above.
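
As a quick illustration of the `k` parameter, the sketch below repeats the function so that it runs on its own and groups the same 14 bits in fours instead of eights. The choice of `k=4` is just an example value.

```python
def bytes2binstr_frombytes(b, n=None, k=8):
    # Same function as above, repeated so the example is self-contained.
    s = '{x:0{m}b}'.format(m=len(b) * 8, x=int.from_bytes(b, byteorder='big'))[:n]
    return ' '.join([s[i:i + k] for i in range(0, len(s), k)])

print(bytes2binstr_frombytes(b'\x80\x00', 14))       # 10000000 000000
print(bytes2binstr_frombytes(b'\x80\x00', 14, k=4))  # 1000 0000 0000 00
```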

Speedwise, the int.from_bytes()-based solution is also faster:

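import random

funcs = bytes2binstr, bytes2binstr_frombytes

# %timeit is an IPython magic, so run this in IPython or Jupyter
for i in range(2, 7):
    n = 10 ** i
    print(n)
    b = b''.join([random.randint(0, 2 ** 8 - 1).to_bytes(1, 'big') for _ in range(n)])
    for func in funcs:
        print(func.__name__, funcs[0](b, n * 7) == func(b, n * 7))
        %timeit func(b, n * 7)
    print()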
# 100
# bytes2binstr True
# 10000 loops, best of 3: 33.9 µs per loop
# bytes2binstr_frombytes True
# 100000 loops, best of 3: 15.1 µs per loop

# 1000
# bytes2binstr True
# 1000 loops, best of 3: 332 µs per loop
# bytes2binstr_frombytes True
# 10000 loops, best of 3: 134 µs per loop

# 10000
# bytes2binstr True
# 100 loops, best of 3: 3.29 ms per loop
# bytes2binstr_frombytes True
# 1000 loops, best of 3: 1.33 ms per loop

# 100000
# bytes2binstr True
# 10 loops, best of 3: 37.7 ms per loop
# bytes2binstr_frombytes True
# 100 loops, best of 3: 16.7 ms per loop

# 1000000
# bytes2binstr True
# 1 loop, best of 3: 400 ms per loop
# bytes2binstr_frombytes True
# 10 loops, best of 3: 190 ms per loop
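
The benchmark above relies on the IPython `%timeit` magic. A rough equivalent for a plain Python interpreter can be sketched with the standard `timeit` module; the input size of 1000 bytes and the repeat counts below are arbitrary choices, and absolute timings will differ from machine to machine.

```python
import random
import timeit

def bytes2binstr(b, n=None):
    s = ' '.join(f'{x:08b}' for x in b)
    return s if n is None else s[:n + n // 8 + (0 if n % 8 else -1)]

def bytes2binstr_frombytes(b, n=None, k=8):
    s = '{x:0{m}b}'.format(m=len(b) * 8, x=int.from_bytes(b, byteorder='big'))[:n]
    return ' '.join([s[i:i + k] for i in range(0, len(s), k)])

funcs = bytes2binstr, bytes2binstr_frombytes

size = 1000                       # number of random bytes (arbitrary choice)
data = bytes(random.randrange(256) for _ in range(size))
bits = size * 7                   # same bit count as in the benchmark above

for func in funcs:
    same = funcs[0](data, bits) == func(data, bits)
    # best of 3 repeats of 100 calls each, reported per call
    best = min(timeit.repeat(lambda: func(data, bits), number=100, repeat=3)) / 100
    print(f'{func.__name__}: same output={same}, {best * 1e6:.1f} µs per call')
```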

3 Comments

String formatting has a perfectly serviceable binary formatting option for integers, one that doesn’t require slicing of a prefix. Why not use that?
@MartijnPieters I was assuming that > and 0 specifier would not have worked for b, but I was wrong. Thanks for the improvement!
You don't need > at all, it's the default alignment for numbers (each x in the bytes object b is an integer in the interval [0, 256)).
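
The point made in these comments about the format spec can be checked directly. A small sketch, using the arbitrary value 5:

```python
x = 5

# Zero fill with a minimum width of 8; no explicit alignment is needed.
print(f'{x:08b}')    # 00000101

# An explicit '>' alignment with '0' as the fill character gives the same result.
print(f'{x:0>8b}')   # 00000101

# Without the fill character, numbers are still right-aligned by default.
print(repr(f'{x:8b}'))
```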
2

You can use:

def bytest_to_bit(by, n):
    bi = "{:0{l}b}".format(int.from_bytes(by, byteorder='big'), l=len(by) * 8)[:n]
    return ' '.join([bi[i:i + 8] for i in range(0, len(bi), 8)])

bytest_to_bit(b'\xff\xff\xff\xff\xf0\x00', 45)

output:

'11111111 11111111 11111111 11111111 11110000 00000'

steps:

  1. transform your bytes to an integer using int.from_bytes

  2. str.format method can take a binary format spec.
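
The two steps can be traced on a small input. The sketch below uses b'\x0f\x00' (an input whose first bits are zero) to show why the format spec needs a zero-filled width of 8 bits per byte:

```python
by = b'\x0f\x00'

# Step 1: interpret the bytes as one big-endian integer.
number = int.from_bytes(by, byteorder='big')
print(number)                                    # 3840

# Step 2: format as binary. Without a width, leading zeros are lost.
print('{:b}'.format(number))                     # 111100000000

# With a zero-filled width of 8 bits per byte, every bit is kept.
print('{:0{l}b}'.format(number, l=len(by) * 8))  # 0000111100000000
```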


Alternatively, you can use a more compact form in which each byte is formatted separately:

def bytest_to_bit(by, n):
    bi = ' '.join(map('{:08b}'.format, by))
    # keep n bits plus the one space that follows each complete group of 8
    return bi[:n + max(n - 1, 0) // 8]

bytest_to_bit(b'\xff\xff\xff\xff\xf0\x00', 45)

2 Comments

You may eventually need some extra formatting to handle corner cases for inputs whose binary representation does not start with a 1. This is not specified in the question, so I am not 100% sure, but e.g. (b'\x0f\x00', 14) would not play out well if the required output must have leading 0s.
Without a minimum width leading zeros will be dropped from your output, so (b'\x00\x00', 16) produces '0', not '0000000000000000'. You would want to use "{:0{l}b}".format(int.from_bytes(by, byteorder='big'), l=len(by) * 8)[:n].
0
test_data = [
    (b'\x80\x00', 14),
    (b'\xff\xff\xff\xff\xf0\x00', 45),
]


def get_bit_string(bytes_, length) -> str:
    output_chars = []
    for byte in bytes_:
        for _ in range(8):
            if length <= 0:
                # rstrip() drops the separator left after a complete byte
                return ''.join(output_chars).rstrip()
            output_chars.append(str(byte >> 7 & 1))
            byte <<= 1
            length -= 1
        output_chars.append(' ')
    return ''.join(output_chars).rstrip()


for data in test_data:
    print(get_bit_string(*data))

output:

10000000 000000
11111111 11111111 11111111 11111111 11110000 00000

explanation:

  • length: Starts at the target length and decreases to 0.
  • if length <= 0: return ...: Once the target length is reached, stop and return.
  • ''.join(output_chars): Builds a string from the list.
  • str(byte >> 7 & 1)
    • byte >> 7: Shifts 7 bits to the right, which moves the MSB of the current 8-bit value into the lowest position.
    • MSB means Most Significant Bit.
    • (...) & 1: Bitwise AND. It keeps only the LSB (Least Significant Bit) and discards any higher bits.
  • byte <<= 1: Shifts byte 1 bit to the left.
  • length -= 1: Decrements length.
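
The shift-and-mask loop described above can be traced on a single byte. A minimal sketch, using the arbitrary value 0b10110000:

```python
byte = 0b10110000
bits = []
for _ in range(8):
    # After i left shifts, bit 7 holds the original bit (7 - i);
    # `& 1` discards anything that has moved above bit 7.
    bits.append(str(byte >> 7 & 1))
    byte <<= 1

print(''.join(bits))  # 10110000
```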

2 Comments

would you please add some descriptions on what the different lines do? - Thanks !!
I appended explanation :)
0

This is a lazy version.
It neither loads nor processes the entire bytes object.
It halts regardless of input size, which the other solutions may not.

I use collections.deque to build the bit string.

from collections import deque
from itertools import chain, repeat, starmap
import os  

def bit_lenght_list(n):
    eights, rem = divmod(n, 8)
    # omit the trailing partial group when n is a multiple of 8
    return chain(repeat(8, eights), (rem,) if rem else ())


def build_bitstring(byte, bit_length):
    d = deque("0" * 8, 8)
    d.extend(bin(byte)[2:])
    return "".join(d)[:bit_length]


def bytes_to_bits(byte_string, bits):
    return "{!r}B".format(
        " ".join(starmap(build_bitstring, zip(byte_string, bit_lenght_list(bits))))
    )

Test:

In [1]: bytes_ = os.urandom(int(1e9)) 

In [2]: timeit bytes_to_bits(bytes_, 0)                                                                                                                   
4.21 µs ± 27.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [3]: timeit bytes_to_bits(os.urandom(1), int(1e9))                                                                                                 
6.8 µs ± 51 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [4]: bytes_ = os.urandom(6)                                                                                                                        

In [5]: bytes_                                                                                                                                       
Out[5]: b'\xbf\xd5\x08\xbe$\x01'

In [6]: timeit bytes_to_bits(bytes_, 45)  #'10111111 11010101 00001000 10111110 00100100 00000'B                                                                                                        
12.3 µs ± 85 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [7]:  bytes_to_bits(bytes_, 14)                                                                                                                   
Out[7]: "'10111111 110101'B"
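
The early stop comes from zip(), which ends as soon as one of its inputs is exhausted. The sketch below shows how many bytes are actually read from a large input; `counting` and `stats` are hypothetical helpers added only for this illustration, and the length helper is repeated (under a corrected spelling) so the example runs on its own.

```python
from itertools import chain, repeat

stats = {'read': 0}

def counting(data):
    # Hypothetical helper: yields the bytes while counting how many were read.
    for value in data:
        stats['read'] += 1
        yield value

def bit_length_list(n):
    eights, rem = divmod(n, 8)
    return chain(repeat(8, eights), (rem,))

# Same argument order as in bytes_to_bits: bytes first, then group lengths.
pairs = list(zip(counting(bytes(1000)), bit_length_list(14)))
print(pairs)          # [(0, 8), (0, 6)]

# zip() pulls one extra byte before it notices the lengths have run out.
print(stats['read'])  # 3
```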


-1

When you say BIT, do you mean binary? I would try:

bytes_val = b'\x80\x00'

for byte in bytes_val:
    value_in_binary = bin(byte)
    print(value_in_binary)

1 Comment

BIT String - string representation of binary.
-1

This gives the answer without the 0b prefix of Python's binary representation:

bit_str = ' '.join(bin(i).replace('0b', '') for i in bytes_val)

3 Comments

getting TypeError: 'bytes' object cannot be interpreted as an integer
Is that Python 2.x or 3.x?
>>> sys.version_info
sys.version_info(major=3, minor=6, micro=8, releaselevel='final', serial=0)
>>> bytes_val = (b'\x80\x00', 14)
>>> ' '.join(bin(i).replace('0b', '') for i in bytes_val)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
TypeError: 'bytes' object cannot be interpreted as an integer
-1

This works in Python 3.x:

def to_bin(l):
    val, length = l
    bit_str = ''.join(bin(i).replace('0b', '') for i in val)
    if len(bit_str) < length:
        # pad with zeros
        return '0'*(length-len(bit_str)) + bit_str
    else:
        # cut to size
        return bit_str[:length]

bytes_val = [b'\x80\x00',14]
print(to_bin(bytes_val))

and this works in 2.x:

def to_bin(l):
    val, length = l
    bit_str = ''.join(bin(ord(i)).replace('0b', '') for i in val)
    if len(bit_str) < length:
        # pad with zeros
        return '0'*(length-len(bit_str)) + bit_str
    else:
        # cut to size
        return bit_str[:length]

bytes_val = [b'\x80\x00',14]
print(to_bin(bytes_val))

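Both produce the result 00000100000000. Note that bin() drops each byte's leading zeros, so this is not the '10000000 000000' that the question expects.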
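
For comparison, here is a minimal sketch of a variant that zero-fills each byte to 8 bits so that every bit keeps its position. The name `to_bin_padded` is hypothetical and not part of the answer above, and like that answer it does not insert spaces between groups.

```python
def to_bin_padded(pair):
    val, length = pair
    # format(i, '08b') keeps the leading zeros that bin() drops
    bit_str = ''.join(format(i, '08b') for i in val)
    # cut to the requested number of bits
    return bit_str[:length]

print(to_bin_padded([b'\x80\x00', 14]))  # 10000000000000
```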
