I have a Python string of bytes data. An example string looks like this:

string = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"

It is a string, not bytes, and I wish to convert it to bytes. Normal approaches (like encode) yield this:

b'\\xabVJ-K\\xcd+Q\\xb2R*.M*N.\\xcaLJU\\xd2QJ\\xceH\\xcc\\xcbK\\xcd\\x01\\x89\\x16\\xe4\\x97\\xe8\\x97d&g\\xa7\\x16Y\\x85\\x06\\xbb8\\xeb\\x02\\t\\xa5Z\\x00'

which leads to issues (note all the extra backslashes).

I've looked through 10+ potential answers to this question on SO and only one of them works, and it's a solution I'd prefer not to use, for obvious reasons:

this_works = eval(string)

Is there any way to get this to work without eval? Other potential solutions I've tried, that failed:

Option 1 Option 2 Option 3

  • The third option should work once you remove the extraneous 'b' and quotes: s[2:-1].encode('latin') Commented Apr 5, 2021 at 14:40
  • I still get the same error I mentioned above: the addition of extra backslashes. Commented Apr 5, 2021 at 14:54

1 Answer

I assume you have a Python-style string representation of a bytes object in the variable s:

s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"

Yes, if you eval this you get a real Python bytes object. But you can instead parse it with the ast module, which builds a syntax tree without executing anything:

import ast
s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"
tree = ast.parse(s)  # parse only; nothing is executed
value = tree.body[0].value.value  # Expr -> Constant -> bytes payload (Python 3.8+)
print(type(value), value)

This will output your bytes object:

<class 'bytes'> b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'
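
A shorter variant of the same idea, as a sketch: ast.literal_eval from the standard library parses the string and accepts only Python literals, so no code is run. The helper name parse_bytes_literal is hypothetical, and the type check is an added assumption about how you would want non-bytes input handled.

```python
import ast

# Raw literal, so the backslashes stay literal characters, as they would
# when the text is read from a file.
s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"


def parse_bytes_literal(text: str) -> bytes:
    """Hypothetical helper: parse a textual bytes literal without running code."""
    value = ast.literal_eval(text)  # accepts literals only, never calls anything
    if not isinstance(value, bytes):
        # Assumption: anything other than a bytes literal is treated as an error.
        raise ValueError(f"expected a bytes literal, got {type(value).__name__}")
    return value


data = parse_bytes_literal(s)
print(type(data), data)
```

If the file can contain malformed literals, ast.literal_eval raises ValueError or SyntaxError, so it may be worth catching both around the call.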

4 Comments

Yeah, I don't want to use eval or the like. I'm reading this data from a file, and it's a mixed-format file (normal text alongside binary/bytes data). Just evaluating it like that seems like an issue.
Actually, I stand corrected: it looks like ast.parse will not evaluate anything. Let me give this a try.
It does work. I wish there were a better way, but that's OK. I wonder if there is a way to do it with struct. I tried a few things, but couldn't get any of them to work.
I suppose there is no simple solution because the representation is very Python-specific (non-ASCII bytes in \xNN format and the special b'...' wrapping). Python's struct module only works with raw bytes objects.
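
If the text is always the repr of a bytes object, one alternative sketch skips the parser: strip the b'...' wrapper and decode the escape sequences with the unicode_escape codec, round-tripping through latin-1 so each \xNN escape maps back to a single byte. The helper name unescape_bytes_repr is hypothetical, and it assumes the wrapper is exactly a leading b, one quote character, and a matching trailing quote. The first lines also show where the "extra" backslashes come from: a plain encode keeps each backslash as a literal byte, which repr then displays doubled.

```python
s = r"b'\xabVJ-K\xcd+Q\xb2R*.M*N.\xcaLJU\xd2QJ\xceH\xcc\xcbK\xcd\x01\x89\x16\xe4\x97\xe8\x97d&g\xa7\x16Y\x85\x06\xbb8\xeb\x02\t\xa5Z\x00'"

# A plain encode does not interpret escapes: the first four bytes are the
# characters backslash, x, a, b, which repr() prints as b'\\xab'.
naive = s[2:-1].encode("latin-1")
print(naive[:4], len(naive[:4]))


def unescape_bytes_repr(text: str) -> bytes:
    """Hypothetical helper: turn the text "b'...'" into the bytes it describes."""
    if len(text) < 3 or text[0] != "b" or text[1] not in "'\"" or text[-1] != text[1]:
        raise ValueError("expected text of the form b'...'")
    body = text[2:-1]  # drop the leading b' and the trailing quote
    # latin-1 maps code points 0-255 one-to-one onto bytes, so the round trip
    # through unicode_escape turns each \xNN escape into a single byte.
    return body.encode("latin-1").decode("unicode_escape").encode("latin-1")


data = unescape_bytes_repr(s)
print(type(data), data)
```

The ast-based approach is still the more robust choice, since it follows Python's own literal syntax rather than slicing by position.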
