I'm trying to set a DataFrame column like this: df['Tokens'] = tokens, where tokens is a 2-d np.array. I expected to get a column in which each element is a 1-d np.array, but found that each element received only the first element of the corresponding 1-d array. Is there a way to store arrays in a DataFrame's elements?

1 Answer

Is that what you want?

In [26]: df = pd.DataFrame(np.random.rand(5,2), columns=list('ab'))

In [27]: df
Out[27]:
          a         b
0  0.513723  0.886019
1  0.197956  0.172094
2  0.131495  0.476552
3  0.678821  0.106523
4  0.440118  0.802589

In [28]: arr = df.values

In [29]: arr
Out[29]:
array([[ 0.51372311,  0.88601887],
       [ 0.19795635,  0.17209383],
       [ 0.13149478,  0.47655197],
       [ 0.67882124,  0.10652332],
       [ 0.44011802,  0.80258924]])

In [30]: df['c'] = arr.tolist()

In [31]: df
Out[31]:
          a         b                                           c
0  0.513723  0.886019    [0.5137231110962795, 0.8860188692834928]
1  0.197956  0.172094  [0.19795634688449892, 0.17209383434042336]
2  0.131495  0.476552  [0.13149477867656167, 0.47655196508193576]
3  0.678821  0.106523   [0.6788212365523125, 0.10652331756477551]
4  0.440118  0.802589   [0.44011802077658635, 0.8025892383754725]

Timing for a 5M-row DataFrame:

In [36]: big = pd.concat([df] * 10**6, ignore_index=True)

In [38]: big.shape
Out[38]: (5000000, 2)

In [39]: arr = big.values

In [40]: %timeit arr.tolist()
1 loop, best of 3: 2.27 s per loop

In [41]: %timeit list(arr)
1 loop, best of 3: 3.62 s per loop
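
If each cell needs to hold an actual 1-d np.array rather than a Python list, passing list(arr) instead of arr.tolist() should do it. Here is a minimal sketch, assuming a small made-up `tokens` array and an `id` column (both hypothetical stand-ins, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 2-d `tokens` array from the question.
tokens = np.arange(6).reshape(3, 2)
df = pd.DataFrame({"id": [10, 11, 12]})

# list(tokens) splits the 2-d array along its first axis into 1-d arrays,
# so each cell holds an np.ndarray and the column has dtype object.
df["Tokens"] = list(tokens)

# tokens.tolist() converts all the way down to nested Python lists,
# so each cell holds a plain list instead.
df["TokensAsLists"] = tokens.tolist()

print(type(df.loc[0, "Tokens"]))
print(type(df.loc[0, "TokensAsLists"]))
```

Either way the column ends up with dtype object, so vectorized numeric operations on it will not behave as they do on an ordinary float column.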

4 Comments

Perhaps df.values.tolist() would be better: stackoverflow.com/a/40593934/3765319
@Kartik, it's a good point, thank you!! I've corrected my answer
Out of curiosity, would df.as_matrix()[:, :2] in a lambda function be worthwhile, or would the list approach work best?
@anshanno, it's an interesting idea! Could you please add it as your own answer?
