Python beginner here. I have a large set of data that started off as a string of 16-bit ints, "1,2,3,4,5", and eventually needs to be turned into a byte-aligned binary file.

Currently I have it working with the following:

#helper function: converts each entry in place to a 4-digit uppercase hex string
def unintlist2hex(list_input):
    for current in range(len(list_input)):
        list_input[current] = "%04X"%(int(list_input[current]))
    return list_input

#where helper gets called in main code
for rows in dataset:
    row_list = rows.text.split(",")
    f_out.write(binascii.unhexlify("".join(unintlist2hex(row_list))))

but this runs quite slowly, even for my limited test data size (about 300,000 ints). How could I go about speeding it up? I profiled the code, and almost all of the cycles are spent in unintlist2hex().

Note that I struggled to use hex() and bin() because they drop leading zeros.

  • I don't think you understand how data works. You are creating strings with the characters for "0" and "1" in them. That is not the same thing as setting 0 and 1 bits in a byte. Commented Aug 5, 2014 at 4:56
  • @KarlKnechtel I'm trying to really reflect on what you are saying here, but I'm missing your point. Isn't setting a 0x0 equivalent to creating a 0000 byte? Is your comment directed at the "%04X" hex conversion? Commented Aug 5, 2014 at 5:35

1 Answer

The struct module is probably the best tool for this:

>>> import struct
>>> struct.pack("5I", *(int(x) for x in "1,2,3,4,5".split(",")))
'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00'

You can use > or < to set the endianness (big-endian or little-endian, respectively):

>>> struct.pack(">5I", *(int(x) for x in "1,2,3,4,5".split(",")))
'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05'
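
One caveat: the I format code packs each value as a 32-bit unsigned int. The question mentions 16-bit values, and its "%04X" route emits two big-endian bytes per value, so the H code (unsigned 16-bit) combined with > is probably the closer match. A minimal sketch, assuming every value fits in the range 0 to 65535 (the variable names here are just for illustration):

```python
import binascii
import struct

values = [int(x) for x in "1,2,3,4,5".split(",")]

# ">" selects big-endian, "H" is an unsigned 16-bit int: two bytes per value
packed = struct.pack(">%dH" % len(values), *values)

# The hex-string route from the question, kept only for comparison
via_hex = binascii.unhexlify("".join("%04X" % v for v in values))

# Both routes should yield the same ten bytes
print(packed == via_hex)
```

If the file has to be little-endian instead, swapping > for < should be the only change needed.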

For example:

for rows in dataset:
    row_list = [int(x) for x in rows.text.split(",")]
    f_out.write(struct.pack("{}I".format(len(row_list)), *row_list))
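
For completeness, here is a self-contained sketch of the same loop that can be run without the original dataset. It assumes 16-bit big-endian output (the H code rather than I, to mirror the "%04X" output in the question), and it uses a hypothetical list of strings plus io.BytesIO as stand-ins for dataset and f_out; write_rows is a name made up for this example.

```python
import io
import struct


def write_rows(lines, out):
    """Write each comma-separated line of ints as big-endian unsigned 16-bit values."""
    for line in lines:
        values = [int(x) for x in line.split(",")]
        # One pack call per row; the repeat count sizes the format to the row
        out.write(struct.pack(">%dH" % len(values), *values))


# Hypothetical sample rows; io.BytesIO stands in for a file opened with "wb"
sample = ["1,2,3", "256,65535"]
buf = io.BytesIO()
write_rows(sample, buf)
result = buf.getvalue()
```

With a real file, open it in binary mode ("wb") so the bytes are written unmodified. Values outside 0 to 65535 make struct.pack raise struct.error, which at least surfaces bad input instead of silently truncating it.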

2 Comments

Thanks! Ran the code and got about a 2x speedup, so that's pretty interesting. Edit for posterity's sake: row_list = [int(x) for x in rows.text.split(",")]
Ah, yes, it should be a list comprehension, of course :)
