I have a number of Python bytes objects stored in a text file, in the form Python prints them, e.g. "b'\x80\x03}q\x00.'". How do I convert each of these back into a bytes object?

In other words, I'm trying to find a function convert such that convert("b'\x80\x03}q\x00.'") == b'\x80\x03}q\x00.'.

I feel like this should be trivial, but none of these obvious approaches worked:

>>> s = "b'\x80\x03}q\x00.'"
>>> bytes(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: string argument without an encoding
>>> bytes(s.encode())
b"b'\xc2\x80\x03}q\x00.'"
>>> bytes(s[2:-1].encode())
b'\xc2\x80\x03}q\x00.'
>>> bytes(s[2:-1].encode('utf8'))
b'\xc2\x80\x03}q\x00.'
>>> eval(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: source code string cannot contain null bytes
>>> exec(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: source code string cannot contain null bytes

  • You have string representations of bytes literals, not bytes objects. How did this file get created in the first place? Commented Jan 16, 2020 at 21:02
  • Actually, you don't quite have a bytes literal. With s = "...", \x00, for example, is replaced with an actual null byte, rather than remaining the 4 characters that represent a null byte in a literal. If you write s = r"...", then ast.literal_eval(s) returns the bytes object you want. Commented Jan 16, 2020 at 21:09
  • That's specific to how you set s in this example; if s is read from a file, like e.g. s = f.readline(), this isn't an issue. Commented Jan 16, 2020 at 21:10
  • If s = "b'\x80\x03}q\x00.'" is a string, then str.encode(s) should work. Commented Jan 16, 2020 at 21:12
  • bytes(s[2:-1].encode())[1:] # b'\x80\x03}q\x00.' Commented Jan 16, 2020 at 21:20

1 Answer

This doesn't really apply to the case where the value of s is read from a file, but in your example, the regular string literal expands the escape sequences:

>>> s = "b'\x80\x03}q\x00.'"
>>> list(s)
['b', "'", '\x80', '\x03', '}', 'q', '\x00', '.', "'"]

Note that s doesn't contain the escape sequence for a null byte; it contains an actual null byte.

You can avoid this using a raw string literal:

>>> s = r"b'\x80\x03}q\x00.'"
>>> list(s)
['b', "'", '\\', 'x', '8', '0', '\\', 'x', '0', '3', '}', 'q', '\\', 'x', '0', '0', '.', "'"]

in which case ast.literal_eval is the function you are looking for:

>>> import ast
>>> ast.literal_eval(s)
b'\x80\x03}q\x00.'
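
As a minimal sketch, the convert function asked for in the question could wrap ast.literal_eval as follows. The name convert comes from the question; the type check is an added assumption, included because ast.literal_eval also accepts other literals such as strings, numbers, and lists.

```python
import ast

def convert(text: str) -> bytes:
    # Parse the printed form of a bytes object back into a bytes object.
    # literal_eval only evaluates Python literals, so unlike eval it does
    # not run arbitrary code found in the input.
    value = ast.literal_eval(text.strip())
    # Assumption: anything other than a bytes literal is treated as an error.
    if not isinstance(value, bytes):
        raise ValueError(f"expected a bytes literal, got {type(value).__name__}")
    return value

# The raw string stands in for text read from a file: the escape
# sequences are still backslash-x pairs, not the bytes they denote.
s = r"b'\x80\x03}q\x00.'"
assert convert(s) == b'\x80\x03}q\x00.'
```

The input has to contain the escape sequences as text, as it would when read from a file; a regular string literal that has already expanded \x00 into a null byte cannot be parsed this way.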

The raw string literal should produce the value you would read from a file:

import ast

b = b'\x80\x03}q\x00.'

with open("tmp.txt", "w") as f:
    print(str(b), file=f)

with open("tmp.txt") as f:
    s = f.readline().strip()

assert ast.literal_eval(s) == b
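
Since the question mentions a number of bytes objects in one file, the same round trip can be extended to several lines. This is only a sketch: it assumes one printed bytes object per line, and the temporary file and the sample values are made up for illustration.

```python
import ast
import os
import tempfile

# Hypothetical sample data, including a value that contains a quote.
originals = [b'\x80\x03}q\x00.', b"it's", b'plain']

# Create a scratch file so the example does not touch existing files.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Write each object the way Python prints it, one per line.
    with open(path, "w") as f:
        for item in originals:
            print(repr(item), file=f)

    # Read the lines back and parse each one, skipping blank lines.
    with open(path) as f:
        restored = [ast.literal_eval(line.strip()) for line in f if line.strip()]
finally:
    os.remove(path)

assert restored == originals
```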