I need to do the following in Python 3.x:
- Interpret an array of bytes as an array of single-precision floats.
- Then group every four consecutive floats into a subarray, i.e. transform [a,b,c,d,e,f,g,h] into [[a,b,c,d], [e,f,g,h]]. The subarrays are called pixels, and the array of pixels forms an image.
- Flip the image vertically.
Here is what I have now:
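import array

floats = array.array('f')
floats.frombytes(tile_data)  # fromstring() is deprecated and was removed in Python 3.9
pix = []
for y in range(tile_h - 1, -1, -1):  # walk the rows from bottom to top
    stride = tile_w * 4  # number of floats per row
    start_index = y * stride
    end_index = start_index + stride
    pix.extend(floats[i:i + 4] for i in range(start_index, end_index, 4))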
Here tile_data is the input array of raw bytes, tile_w and tile_h are the width and height of the image respectively, and pix is the final, vertically flipped image.
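To make the intended result concrete, here is a small worked example of the same steps on a 2x2 image. The input values 0..15 are made up purely for illustration, and the snippet uses frombytes, the current name for the array method that loads raw bytes.

```python
import array
import struct

# Illustrative 2x2 image: 4 pixels of 4 floats each, holding the values 0..15.
tile_w, tile_h = 2, 2
tile_data = struct.pack("16f", *range(16))

# Step 1: interpret the bytes as single-precision floats.
floats = array.array('f')
floats.frombytes(tile_data)

# Steps 2 and 3: group floats into pixels, emitting rows from bottom to top.
pix = []
stride = tile_w * 4  # number of floats per row
for y in range(tile_h - 1, -1, -1):
    start_index = y * stride
    end_index = start_index + stride
    pix.extend(floats[i:i + 4] for i in range(start_index, end_index, 4))

# Convert each pixel to a plain list for easy inspection.
rows = [list(p) for p in pix]
print(rows)
```

The second row of the input (values 8 to 15) comes out first, followed by the first row (values 0 to 7), which is the vertical flip described above.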
While this code works correctly, it takes around 50 ms to complete on my machine for a 256x256 image.
Is there anything obviously slow in this code? Would numpy be a potentially good avenue for optimization?
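For comparison, the numpy route might look roughly like the sketch below. It assumes numpy is installed and that the bytes hold native-endian 32-bit floats; flip_tile is a hypothetical helper name, and no timing is claimed for it. Note that it returns a 2-D array of shape (tile_w * tile_h, 4) rather than a list of array.array slices.

```python
import struct

import numpy as np


def flip_tile(tile_data, tile_w, tile_h):
    """Hypothetical helper: raw bytes -> vertically flipped array of pixels."""
    # Reinterpret the bytes as float32 values without copying them.
    floats = np.frombuffer(tile_data, dtype=np.float32)
    # View the flat data as rows x columns x 4 channels.
    image = floats.reshape(tile_h, tile_w, 4)
    # Reverse the row axis; this is still only a view on the input.
    flipped = image[::-1]
    # Flatten back to a list of pixels; numpy copies here because the
    # reversed view is not contiguous in memory.
    return flipped.reshape(tile_h * tile_w, 4)


# Usage on an illustrative 2x2 image holding the values 0..15.
example = flip_tile(struct.pack("16f", *range(16)), 2, 2)
print(example.tolist())
```

The design idea is that the reinterpretation, the grouping and the flip are all expressed as views on the original buffer, so the only per-element work is the single copy made by the final reshape.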
Edit: here is a standalone program to run the code and measure performance:
import array
import random
import struct
import time
# Size of the problem.
tile_w = 256
tile_h = 256
# Generate input data.
tile_data = []
for f in (random.uniform(0.0, 1.0) for _ in range(tile_w * tile_h * 4)):
    tile_data.extend(struct.pack("f", f))
tile_data = bytes(tile_data)
start_time = time.time()
# Code of interest.
floats = array.array('f')
floats.frombytes(tile_data)  # fromstring() is deprecated and was removed in Python 3.9
pix = []
for y in range(tile_h - 1, -1, -1):  # walk the rows from bottom to top
    stride = tile_w * 4  # number of floats per row
    start_index = y * stride
    end_index = start_index + stride
    pix.extend(floats[i:i + 4] for i in range(start_index, end_index, 4))
print("runtime: {0} ms".format((time.time() - start_time) * 1000))