35

As part of a broader program I am working on, I ended up with object arrays in which strings, 3D coordinates, etc. are all mixed. I know object arrays may not be favored compared to structured arrays, but I am hoping to get around this without changing a lot of code.

Let's assume every row of my array obj_array (with N rows) has the format

Single entry/object of obj_array:  ['NAME',[10.0,20.0,30.0],....] 

Now, I am trying to load this object array and slice out the 3D coordinate chunk. Up to here, everything works fine by simply asking for, say:

obj_array[:,[1,2,3]]

However, the result is also an object array, which is a problem because I want to form a 2D array of floats with:

size [N,3] of N rows and 3 entries of X,Y,Z coordinates

For now, I am looping over the rows and assigning each one to a row of a destination 2D float array to get around the problem. I am wondering if there is a better way using NumPy's array conversion tools? I tried a few things and could not get around it.

Centers   = np.zeros([N,3])

for row in range(obj_array.shape[0]):
    Centers[row,:] = obj_array[row,1]

Thanks

1
  • Can you show a simple example code - what the original data looks like, and what your conversion code looks like? It will make it easier for someone to give you appropriate advice. Commented Oct 18, 2013 at 21:07

8 Answers

28

Nasty little problem... I have been fooling around with this toy example:

>>> arr = np.array([['one', [1, 2, 3]],['two', [4, 5, 6]]], dtype=object)
>>> arr
array([['one', [1, 2, 3]],
       ['two', [4, 5, 6]]], dtype=object)

My first guess was:

>>> np.array(arr[:, 1])
array([[1, 2, 3], [4, 5, 6]], dtype=object)

But that keeps the object dtype, so perhaps then:

>>> np.array(arr[:, 1], dtype=float)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.

You can normally work around this doing the following:

>>> np.array(arr[:, 1], dtype=[('', float)]*3).view(float).reshape(-1, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a readable buffer object

Not here though, which was kind of puzzling. Apparently it is the fact that the objects in your array are lists that throws this off, as replacing the lists with tuples works:

>>> np.array([tuple(j) for j in arr[:, 1]],
...          dtype=[('', float)]*3).view(float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

Since there doesn't seem to be any entirely satisfactory solution, the easiest is probably to go with:

>>> np.array(list(arr[:, 1]), dtype=float)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

That will not be very efficient, though, so it is probably better to go with something like:

>>> np.fromiter((tuple(j) for j in arr[:, 1]), dtype=[('', float)]*3,
...             count=len(arr)).view(float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
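
The difference between the failing direct cast and the working list conversion can be reproduced in a few lines. This is only a minimal sketch that reuses the toy array from this answer; the names coords and direct_cast_failed are illustrative and not part of the original post.

```python
import numpy as np

# Toy object array from the answer: a name plus a 3-element list per row.
arr = np.array([['one', [1, 2, 3]], ['two', [4, 5, 6]]], dtype=object)

# arr[:, 1] is a 1-D object array whose items are lists, so a direct
# cast asks NumPy to turn each whole list into a single float.
try:
    np.array(arr[:, 1], dtype=float)
    direct_cast_failed = False
except (ValueError, TypeError):
    direct_cast_failed = True

# Converting to a plain list of lists first lets NumPy inspect the
# nesting and build a regular 2-D float array.
coords = np.array(list(arr[:, 1]), dtype=float)
print(direct_cast_failed, coords.shape, coords.dtype)
```

The list() call is what exposes the nested structure: NumPy treats the items of an object array as opaque, but it does look inside nested Python lists.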

3 Comments

Am I the only one who does not understand why np.array(arr[:, 1], dtype=np.float) would not work?
@Syzygy np.float is deprecated. dtype=float will work no problem
Nope... arr[:, 1] is a 1D array that has 3-element lists as items. NumPy does not know how to convert a list to either a float or an np.float, hence the error. You have to convert it to a list of lists, i.e. list(arr[:, 1]); then NumPy will look at the structure of the nested lists and convert it into a 2D array.
13

Based on Jaime's toy example I think you can do this very simply using np.vstack():

arr = np.array([['one', [1, 2, 3]],['two', [4, 5, 6]]], dtype=object)
float_arr = np.vstack(arr[:, 1]).astype(float)

This will work regardless of whether the 'numeric' elements in your object array are 1D numpy arrays, lists or tuples.
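
A quick way to check that claim is to mix all three container types in one column. The following is a sketch with made-up rows (the rows list and its values are assumptions for illustration); the object array is filled element by element so that each entry is stored exactly as given.

```python
import numpy as np

# Hypothetical rows: the coordinate entry is a list, a tuple, or an ndarray.
rows = [('one', [1, 2, 3]),
        ('two', (4, 5, 6)),
        ('three', np.array([7, 8, 9]))]

# Fill an object array one element at a time so every entry is kept as-is.
arr = np.empty((len(rows), 2), dtype=object)
for i, (name, xyz) in enumerate(rows):
    arr[i, 0] = name
    arr[i, 1] = xyz

# vstack turns each entry into a row and stacks them into a 2-D array.
float_arr = np.vstack(arr[:, 1]).astype(float)
print(float_arr.shape, float_arr.dtype)
```

Note that this only works when every entry has the same length; rows of differing length cannot be stacked into a rectangular array.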


6

This works great for converting your array arr from an object array to an array of floats. Number processing is extremely easy afterwards. Thanks for that last post! I just modified it to handle any DataFrame size:

float_arr = np.vstack(arr[:, :]).astype(float)

1 Comment

This is more of a comment than an answer. Does it relate to ali_m's answer?
3

It is way faster to just convert your object array to a NumPy float array: arr = np.array(arr, dtype=[('O', float)]).astype(float). From there, no looping is needed; index it just like you normally would a NumPy array. You would have to do it in chunks, though, for your different datatypes: arr[:, 1], arr[:, 2], etc. I had the same issue with a NumPy tuple object returned from a C++ DLL function; conversion of 17M elements takes less than 2 s.

2 Comments

If anyone actually tried this solution above and profiled it it would not be downvoted.
I get "ValueError: setting an array element with a sequence." error with this approach.. Example: aa = [['5236', [1,2,0.3]], ['63734', [6,1.5,0.0]]] bb = np.array(aa, dtype='object') arr=np.array(bb[:,1], dtype=[('O', np.float)]).astype(np.float)
1

You may want to use a structured array, so that you can easily access the names and the values independently when you need to. In this example, there are two data points:

import numpy as np
x = np.zeros(2, dtype=[('name','S10'), ('value','f4',(3,))])
x[0][0]='item1'
x[1][0]='item2'
y1=x['name']
y2=x['value']

the result:

>>> y1
array(['item1', 'item2'], 
      dtype='|S10')
>>> y2
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]], dtype=float32)

See more details: http://docs.scipy.org/doc/numpy/user/basics.rec.html
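
If the data already exists as rows like those in the question, a structured array can also be filled in one step. This is a sketch under assumptions: the field names, the 'U10' text width, the float64 coordinates, and the sample rows are all chosen for illustration.

```python
import numpy as np

# Assumed layout: a short text name plus three float64 coordinates per record.
record_dtype = np.dtype([('name', 'U10'), ('value', 'f8', (3,))])

# Hypothetical input rows in the same shape as the question's object array.
rows = [['item1', [10.0, 20.0, 30.0]],
        ['item2', [40.0, 50.0, 60.0]]]

# Structured arrays expect one tuple per record.
x = np.array([(name, xyz) for name, xyz in rows], dtype=record_dtype)

names = x['name']      # 1-D array of strings
centers = x['value']   # 2-D float array, one row per record
print(names, centers.shape, centers.dtype)
```

Here x['value'] is already the [N,3] float array the question asks for, so no separate conversion step is needed.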


1

This problem usually happens when you have a dataset with mixed types, typically dates in the first column or so.

What I usually do is store the date column in a separate variable and take the rest of the "X matrix of features" into X. So I have dates and X, for instance.

Then I apply the conversion to the X matrix as:

X = np.array(list(X[:,:]), dtype=float)

Hope this helps!
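
The split-then-convert idea might look like the sketch below. The table contents and the name data are invented for illustration, and it assumes every remaining cell holds a single number rather than a nested list.

```python
import numpy as np

# Hypothetical mixed table: a date string in column 0, numbers elsewhere.
data = np.array([['2013-10-18', 1, 2.5],
                 ['2013-10-19', 3, 4.5]], dtype=object)

# Keep the non-numeric column in its own variable.
dates = data[:, 0]

# Convert the remaining columns to a regular float array.
X = np.array(list(data[:, 1:]), dtype=float)
print(dates, X.shape, X.dtype)
```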


1

For structured arrays, use:

from numpy.lib.recfunctions import structured_to_unstructured
structured_to_unstructured(arr).astype(float)

See: https://numpy.org/doc/stable/user/basics.rec.html#numpy.lib.recfunctions.structured_to_unstructured
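
As a small illustration, here is a sketch with a made-up structured array; the field names x, y, z and the sample values are assumptions, not part of the question's data.

```python
import numpy as np
from numpy.lib.recfunctions import structured_to_unstructured

# Hypothetical structured array with one named field per coordinate.
points = np.array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
                  dtype=[('x', 'f8'), ('y', 'f8'), ('z', 'f8')])

# Collapse the named fields into an ordinary trailing axis.
xyz = structured_to_unstructured(points)
print(xyz.shape, xyz.dtype)
```

This applies to structured arrays only; it does not help with a plain object array like the one in the question.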


1

np.array(list(arr), dtype=float) would work to convert all the elements in the array to float at once.
