How to store CSV values from a file in a NumPy array in Python?

Question

I've written a Python script that reads a b&w bitmap image and stores the value of each pixel as a hex value from 0x00 to 0xFF in a .txt file. The values are stored as a continuous 1D array separated by commas and, to keep the file from becoming very wide, each line holds at most 16 elements, e.g.:

v01, v02, ... , v15, v16,
v17, v18, ... , v31, v32,
...

0x00, 0x00, ... , 0x00, 0x00,
0x00, 0x00, ... , 0x00, 0x00,
...

Notice that the last element of each line is also followed by a comma.
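
For illustration, the layout described above could be produced by something like the sketch below. The writer script itself is not shown in the question, so the function name `format_hex_lines` and its `per_line` parameter are hypothetical; only the layout (hex tokens, at most 16 per line, a comma after every element) comes from the description.

```python
def format_hex_lines(values, per_line=16):
    """Format byte values as 0xNN tokens, at most per_line tokens per row."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        # Every element, including the last one on the row, is followed by a comma.
        lines.append(", ".join(f"0x{v:02X}" for v in chunk) + ",")
    return "\n".join(lines) + "\n"


# 20 values give one full row of 16 and a shorter row of 4.
text = format_hex_lines(list(range(20)))
print(text)
```

Note that a pixel count that is not a multiple of 16 leaves the last row shorter than the others, which matters for any reader that expects a fixed number of columns.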

Of course the .txt file doesn't keep the original dimensions of the bitmap, but it is not an issue because it will be later used in a micro-controller firmware, which knows the original dimensions and takes care of properly reading the 1D array.

Now, in order to verify that the conversion is done properly, I need to write a script that reads the file and stores the values in a numpy array that is used to display the image later with "matplotlib". I've tried the following code:

from numpy import genfromtxt

my_data = genfromtxt('file.txt', delimiter=',')
print(my_data)

The issue with that, apart from the wrong dimensions, is that the hex values are not read as numbers and that the element after the last comma of each row is also read (the line break character, I guess). I get something like:

[nan, nan, ... , nan, nan," "
...]
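
A small standard-library sketch of why this probably happens: genfromtxt parses fields as floats by default, a float parser does not accept the 0x prefix, and the trailing comma adds one extra field per row. The sample line is made up, and the code mimics per-field parsing rather than calling genfromtxt itself.

```python
line = "0x00, 0x7F, 0xFF,\n"

fields = line.split(",")
# The trailing comma produces one extra field holding only the line break.
print(fields)  # ['0x00', ' 0x7F', ' 0xFF', '\n']

# float() does not understand the 0x prefix ...
try:
    float(fields[0])
    parsed_as_float = True
except ValueError:
    parsed_as_float = False

# ... while int() with base 16 does, and it tolerates surrounding whitespace.
values = [int(f, 16) for f in fields if f.strip()]
print(parsed_as_float, values)  # False [0, 127, 255]
```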

I need a way of reading the .txt file, converting the values from a "0x00" format to a numeric value and storing them in an m x n numpy array (m & n are known parameters, the original bitmap size):

[[0, 0, ... , 0, 0]
 [0, 0, ... , 0, 0]
 ...]

Any suggestions on how to do so?

Update

While writing the question I was only working with images whose width was a multiple of 16 pixels, which guaranteed that every row of my CSV output had exactly 16 elements. After some testing I came across a picture whose size left the last row of the CSV with fewer than 16 elements. In that case I was not able to use the solution provided by @taras, although the answer was correct for my initial question.

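Finally I ended up with the following code, maybe not as elegant but it does the trick:

import numpy as np

# filename, h and w (the bitmap height and width) are defined elsewhere
with open(filename, "r") as f:
    # One list of tokens per line of the file
    pixels = [x.split(',') for x in f.readlines()]
# Drop the token after each line's trailing comma (the line break)
for p in pixels:
    del p[-1]
# Flatten the rows and convert each hex string to an integer
pixels = [int(p, 16) for row in pixels for p in row]
pixels = np.asarray(pixels, dtype=np.uint8).reshape(h, w)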

I'm keeping both answers in case somebody finds them useful.

taras · Accepted Answer · 2020-04-01 17:04:35Z

1

Since you have a fixed number of columns, you can take advantage of that to read only the first 16 columns (this lets you skip the field after the trailing comma) and convert each column from hex using the converters dict with int(x, 16):

import numpy as np

fname = 'file.txt'
num_cols = 16
np.loadtxt(fname, usecols=range(num_cols), dtype=np.uint8, delimiter=',',
           converters={k: lambda x: int(x, 16) for k in range(num_cols)})
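
As a rough picture of what the converters argument does: it maps a column index to a function that receives that column's text and returns the parsed value. The sketch below applies such a dict by hand to one made-up row, without calling np.loadtxt, so it only approximates what the library does internally; num_cols = 4 is an assumption to keep the demo short.

```python
num_cols = 4  # small hypothetical width to keep the demo short

# Same shape as the dict passed to np.loadtxt: column index -> callable.
converters = {k: (lambda x: int(x, 16)) for k in range(num_cols)}

row = "0x00, 0x10, 0xA0, 0xFF,"
fields = row.split(",")  # 5 fields; the last one is empty
# Using only the first num_cols fields mirrors usecols=range(num_cols).
parsed = [converters[k](fields[k]) for k in range(num_cols)]
print(parsed)  # [0, 16, 160, 255]
```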

Edit:
If the number of elements in the file is not a multiple of 16, you can use plain Python code to preprocess the data and then convert it into a numpy array:

import numpy as np

fname = 'file.txt'
with open(fname) as fp:
    data = fp.read().replace('\n', '')
# Skip empty tokens, such as the one after the final trailing comma
np.array([int(x, 16) for x in data.split(',') if x.strip()])
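
Putting the pieces together, a minimal sketch that handles a shorter last row and restores the image shape might look like this. The function name `parse_hex_text`, the sample data and the 2 x 10 size are made up for the example; h and w stand for the known bitmap height and width.

```python
import numpy as np


def parse_hex_text(text, h, w):
    """Parse comma-separated hex tokens into an (h, w) uint8 array."""
    tokens = text.replace("\n", "").split(",")
    # Empty tokens come from the trailing comma at the end of each line.
    values = [int(t, 16) for t in tokens if t.strip()]
    if len(values) != h * w:
        raise ValueError(f"expected {h * w} values, got {len(values)}")
    return np.array(values, dtype=np.uint8).reshape(h, w)


# A 2 x 10 image stored 16 values per line: the last line holds only 4 values.
sample = ("0x00, " * 16).rstrip() + "\n" + ("0xFF, " * 4).rstrip() + "\n"
img = parse_hex_text(sample, h=2, w=10)
print(img.shape)  # (2, 10)
```

Checking the element count before reshaping gives a clearer error than the one reshape would raise when the file and the expected dimensions disagree.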

edited Apr 1, 2020 at 17:04

answered Mar 29, 2020 at 16:07

taras


5 Comments

Nedo Over a year ago

You sir are a magician, worked like a charm, thank you. Only one thing, can you elaborate a bit more the "converters={...}" part? Or tell me where to find more information about that? I'm quite new to Python.

Nedo Over a year ago

Just one thing that I realized, to completely answer the question I need to reshape the np.loadtxt output with np.reshape(output, (m,n))

taras Over a year ago

@Nedo, you can read about converters here docs.scipy.org/doc/numpy/reference/generated/… and see more elaborated examples here docs.scipy.org/doc/numpy/user/…

taras Over a year ago

@Nedo, you can actually reshape the return value of np.loadtxt directly: np.loadtxt(fname, ...).reshape((m, n))

Nedo Over a year ago

I was getting some successful results but then I found a pitfall. My bad for not specifying this, but the rows are at most 16 elements long, not exactly 16. It was working fine because I was using images with widths that are multiples of 16, however the script failed for images of a different width (the last row of the csv had fewer than 16 elements). I was checking the links you shared but they say that the documents must have a fixed width, any ideas on this?
