I am importing a csv file

data = np.genfromtxt('na.csv', delimiter=",", dtype=[('latitude', 'f8'), ('longitude', 'f8'), ('location_id','i4'), ('location_name', 'S60'), ('location_group_id', 'i4'), ('location_group_name', 'S32')])

and considering rows by location_group_ids, one by one.

l_g_id_set = set()
l_g_id_set.update(data['location_group_id'])

for lgid in l_g_id_set:
    # rows with location group id == lgid
    group = data[data['location_group_id']==lgid]

So far, I only included latitude and longitude, which are two float values in the 0th and 1st position of the structured array from the csv file.

    # structured array of latitude-longitude
    latlon = group[list(group.dtype.names[:2])]

    # convert the structured array into numpy array of floats
    llarray = latlon.view((float, len(latlon.dtype.names)))

Now I want to include location_id, which is an integer value in the 2nd position of the array, in latlon and llarray. Rather than making this another structured array, I want llarray to be a 2D float array with 3 columns for ease of calculation.

However, when I try the following, changing only 2 to 3,

    # structured array of latitude-longitude
    latlon = group[list(group.dtype.names[:3])]

    # convert the structured array into numpy array of floats
    llarray = latlon.view((float, len(latlon.dtype.names)))

it fails, throwing the following error.

    llarray = latlon.view((float, len(latlon.dtype.names)))
ValueError: new type not compatible with array.

How can I fix this, and why is my attempt failing?

2
  • The problem is, I think, with trying to 'view' the int data as float - without copying or overwriting. np.ones((3,),dtype=int).view(float) produces the same error. Commented Mar 7, 2014 at 21:36
  • Then, how could I fix it? Commented Mar 7, 2014 at 21:49
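
To make the comment above concrete, here is a minimal sketch of the size mismatch behind the error. The three-row `rec` array is a hypothetical stand-in for the first three CSV columns, not the asker's real data, and the variable names are illustrative only. A view reinterprets the existing bytes without converting anything, so the record size has to line up with the new type.

```python
import numpy as np

# Hypothetical sample: same field layout as the first three CSV columns.
dt = np.dtype([('latitude', 'f8'), ('longitude', 'f8'), ('location_id', 'i4')])
rec = np.array([(1.2, 2.3, 100), (1.2, 2.3, 200), (1.2, 2.3343, 300)], dtype=dt)

record_bytes = rec.dtype.itemsize             # 8 + 8 + 4 bytes per record
target_bytes = np.dtype((float, 3)).itemsize  # 3 * 8 bytes per row of floats

# A view never converts values; it only relabels the bytes already in memory.
# With mismatched sizes the bytes cannot be regrouped, so NumPy refuses.
try:
    rec.view((float, 3))
    view_failed = False
except ValueError:
    view_failed = True

print(record_bytes, target_bytes, view_failed)
```

Even where the byte counts happened to line up, the integer bytes would be reinterpreted rather than converted, so a view is not the right tool for a mixed int/float record.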

2 Answers

1

This transformation works

dtype1=[('latitude', 'f8'), ('longitude', 'f8'), ('location_id', 'f4')]
data1=data[list(data.dtype.names[:3])].astype(dtype1)

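But data1.view(float) still gives the error: with location_id stored as 'f4', each record is 20 bytes, which is not a multiple of the 8-byte float. Casting all three fields to 'f8' fixes that:

dtype2=[('latitude', 'f8'), ('longitude', 'f8'), ('location_id', 'f8')]
data2=data[list(data.dtype.names[:3])].astype(dtype2)
data2.view(float).reshape(-1,3)
data2.view((float,3))   # equivalent view

Both views are ok.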
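
For readers without the CSV file, the same steps can be sketched end to end on a small in-memory array. The `rec` sample and the names `rec_f8` and `llarray_demo` are assumptions made for illustration; only the `astype` then `view` pattern comes from the answer.

```python
import numpy as np

# Hypothetical sample standing in for data[list(data.dtype.names[:3])].
dt = np.dtype([('latitude', 'f8'), ('longitude', 'f8'), ('location_id', 'i4')])
rec = np.array([(1.2, 2.3, 100), (1.2, 2.3, 200), (1.2, 2.3343, 300)], dtype=dt)

# astype copies the data and converts each field, so the int ids become floats.
dtype2 = [(name, 'f8') for name in rec.dtype.names]
rec_f8 = rec.astype(dtype2)

# Every record is now exactly three float64 values, so a plain view is valid.
llarray_demo = rec_f8.view(float).reshape(-1, 3)

print(llarray_demo)
```

The copy made by `astype` is what does the actual int-to-float conversion; the `view` afterwards only reshapes how the already-converted bytes are presented.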

Sample data:

In [211]: data[:3]
Out[211]: 
array([(1.2, 2.3, 100, 'testing', 45, 'another'),
       (1.2, 2.3, 200, 'testings', 45, 'xxx'),
       (1.2, 2.3343, 300, 'testings', 45, 'xxx')], 
      dtype=[('latitude', '<f8'), ('longitude', '<f8'), ('location_id', '<i4'), ('location_name', 'S60'), ('location_group_id', '<i4'), ('location_group_name', 'S32')])

In [212]: data2[:3].view(float).reshape(-1,3)
Out[212]: 
array([[   1.2   ,    2.3   ,  100.    ],
       [   1.2   ,    2.3   ,  200.    ],
       [   1.2   ,    2.3343,  300.    ]])

In [230]: data2.view(float).reshape(-1,3).max(axis=0)
Out[230]: array([   1.2   ,    2.3343,  300.    ])
In [234]: data2['longitude'].max()
Out[234]: 2.3342999999999998
In [236]: data2.view(float).reshape(-1,3)[:,1].max()
Out[236]: 2.3342999999999998
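
As an alternative to the `astype` plus `view` approach, recent NumPy releases (1.16 and later, as far as I know) ship a helper, `numpy.lib.recfunctions.structured_to_unstructured`, that does the conversion in one call. This is a different technique from the one in the answer, shown here as a sketch on a hypothetical three-row sample; it also shows per-column minima and maxima taken on the resulting 2D array.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical sample: same field layout as the first three CSV columns.
dt = np.dtype([('latitude', 'f8'), ('longitude', 'f8'), ('location_id', 'i4')])
rec = np.array([(1.2, 2.3, 100), (1.2, 2.3, 200), (1.2, 2.3343, 300)], dtype=dt)

# Convert the mixed float/int fields into one plain 2D float array.
ll = rfn.structured_to_unstructured(rec, dtype=float)

# Column-wise reductions work on the 2D array, not on the structured one.
col_min = ll.min(axis=0)
col_max = ll.max(axis=0)

print(ll.shape, col_min, col_max)
```

Indexing such as `ll[:, 0]` only works after this conversion; the structured array itself is 1D and is indexed by field name instead.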

3 Comments

Thank you. A quick question: how do you get the minimum of the 0th and 1st columns (latitude and longitude) in data2? min(data2[:,0]) and min(data2[:,1]) do not work, strangely.
yeah. By min I meant np.min
Are you taking the min after the view and reshape? I've added several examples of taking max values.
0

Hmm. Maybe you will have luck with this.

f_latlon = latlon.astype(np.float)

