I want to get the values of a DataFrame column whose cells are NumPy arrays. Using .values, the returned array has dtype object, but what I want is float64.

import numpy as np
import pandas as pd

a = pd.DataFrame({'vector': [np.array([1.1, 2, 3]), np.array([2.1, 3, 4])]})
print(a)

b=a['vector'].values
print(b.dtype)
print(b.shape)

c=np.array([i for i in  a['vector']])
print(c.dtype)
print(c.shape)

>>>             vector
>>> 0  [1.1, 2.0, 3.0]
>>> 1  [2.1, 3.0, 4.0]
>>> object
>>> (2,)
>>> float64
>>> (2, 3)

Why do b and c have different dtypes?

c is what I want, but is there a better way to get the same result?

2 Answers

Convert the Series to a list and then pass it to np.array, i.e.

np.array(a['vector'].tolist())

array([[ 1.1,  2. ,  3. ],
       [ 2.1,  3. ,  4. ]])
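
As a quick check of this approach, here is a minimal sketch that rebuilds the DataFrame from the question and confirms the dtype and shape. The np.stack call is not part of the original answer; it is shown only as an alternative that raises a ValueError when the rows have different lengths.

```python
import numpy as np
import pandas as pd

# Same DataFrame as in the question
a = pd.DataFrame({'vector': [np.array([1.1, 2, 3]), np.array([2.1, 3, 4])]})

# tolist() gives a plain Python list of 1-D arrays; np.array stacks them
c = np.array(a['vector'].tolist())
print(c.dtype, c.shape)  # float64 (2, 3)

# Alternative (an addition, not from the answer): np.stack joins the
# row arrays along a new first axis and requires equal shapes
s = np.stack(a['vector'].tolist())
print(np.array_equal(c, s))  # True
```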

According to https://stackoverflow.com/a/33718947/2251785, numpy.concatenate should work too.

d=np.concatenate(a['vector'].values).reshape(len(a),-1)

I am still confused about why .values returns the arrays with dtype object, though.
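
One way to look at the dtype difference, as a minimal sketch using the DataFrame from the question: each cell of the column holds a whole ndarray, so pandas stores the column as a 1-D array of Python objects, and .values hands that array back unchanged. The float64 data lives inside each element. np.array, by contrast, inspects the nested sequences in a list and builds a new 2-D array from them.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'vector': [np.array([1.1, 2, 3]), np.array([2.1, 3, 4])]})
b = a['vector'].values

# The column is a 1-D object array with one element per row
print(b.dtype, b.shape)        # object (2,)

# Each element is itself a float64 ndarray
print(type(b[0]), b[0].dtype)  # <class 'numpy.ndarray'> float64

# Building a new array from a list of the rows yields a 2-D float64 array
rebuilt = np.array(list(b))
print(rebuilt.dtype, rebuilt.shape)  # float64 (2, 3)
```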
