Read Columns of different types in a Numpy Array

Question

I am trying to read a .csv file using Numpy. The .csv file has this format:

U118,V078,3
U106,V091,2
U042,V057,5

I used the numpy.genfromtxt function, specifying the data types in the dtype argument:

data = np.genfromtxt('DATASET.csv', delimiter=",",names=['usuario','videojuego','puntuacion'],
                     dtype='str,str,int')

But what I actually get is only the int (third) column; the two string columns come back empty:

> [('', '', 3) ('', '', 2) ('', '', 5) ('', '', 0) ('', '', 3) ('', '', 5)

Does someone know what I am missing?

np.genfromtxt expects a list of dtypes to be assigned to attributes. So use dtype = ['str', 'int', 'int']
– DOOM, Commented Mar 21, 2020 at 11:20
If you look at the dtype you'll see it's 'U0', a string type with space for 0 characters. In some places it's OK to use 'str' as the dtype, but in others, such as this one, you need to specify the length.
– hpaulj, Commented Mar 21, 2020 at 15:18
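
To illustrate the point about string length, here is a minimal sketch (the variable names are only illustrative) comparing an unsized string dtype with a sized one, and showing one of the places where NumPy can work out the length by itself:

```python
import numpy as np

# The builtin str maps to a Unicode dtype with no length set,
# so each element has room for 0 characters.
unsized = np.dtype(str)
print(unsized, unsized.itemsize)

# A sized Unicode dtype reserves 4 bytes per character.
sized = np.dtype("U4")
print(sized, sized.itemsize)

# np.array is one of the places where str is fine:
# the length is inferred from the data itself.
inferred = np.array(["U118", "V078"], dtype=str)
print(inferred.dtype)
```

With genfromtxt and an explicit dtype there is no such inference, which is consistent with the empty strings in the question's output.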

Jay · Accepted Answer · 2020-03-21 11:19:12Z

Are you using the correct NumPy nomenclature inside dtype? See here.

If you're using a string to pass all dtypes, then perhaps something like

dtype = "S4,S4,i8"
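
As a rough, self-contained sketch of this suggestion, the snippet below reads the three sample rows from the question out of an in-memory buffer instead of DATASET.csv. Two things are assumptions on my part rather than part of the original post: it uses the Unicode variant "U4,U4,i8" instead of "S4,S4,i8" so that the fields come back as Python 3 str, and it passes encoding="utf-8" explicitly so the behaviour does not depend on the NumPy version's default.

```python
import io
import numpy as np

# Sample rows copied from the question; io.StringIO stands in for the file.
csv_text = "U118,V078,3\nU106,V091,2\nU042,V057,5\n"

data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=["usuario", "videojuego", "puntuacion"],
    dtype="U4,U4,i8",   # sized strings, so 4 characters are kept per field
    encoding="utf-8",   # assumption: set explicitly to get str values
)

# Columns of the structured array are accessed by field name.
print(data["usuario"])
print(data["puntuacion"])
```

Identifiers longer than four characters would be truncated by "U4", so the width should match the longest value expected in the file.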

1 Comment

Or in Py3, I'd use 'U4, U4, i8', since U (Unicode) is the default string type there.
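
A small sketch of the difference between the two string codes, assuming Python 3 (the variable names are illustrative only): 'S4' stores bytes, while 'U4' stores text that compares equal to an ordinary str.

```python
import numpy as np

# 'S4' holds 4-byte byte strings; elements come back as bytes.
as_bytes = np.array(["U118"], dtype="S4")

# 'U4' holds 4-character Unicode strings; elements come back as str.
as_text = np.array(["U118"], dtype="U4")

print(type(as_bytes[0]), as_bytes[0])
print(type(as_text[0]), as_text[0])
```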