I have a 33620x160 pandas DataFrame with one column that contains lists of numbers; each list in that column has 30 elements.
df['dlrs_col']
0 [0.048142470608688, 0.047021138711858, 0.04573...
1 [0.048142470608688, 0.047021138711858, 0.04573...
2 [0.048142470608688, 0.047021138711858, 0.04573...
3 [0.048142470608688, 0.047021138711858, 0.04573...
4 [0.048142470608688, 0.047021138711858, 0.04573...
5 [0.048142470608688, 0.047021138711858, 0.04573...
6 [0.048142470608688, 0.047021138711858, 0.04573...
7 [0.048142470608688, 0.047021138711858, 0.04573...
8 [0.048142470608688, 0.047021138711858, 0.04573...
9 [0.048142470608688, 0.047021138711858, 0.04573...
10 [0.048142470608688, 0.047021138711858, 0.04573...
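For reference, a column with this shape can be reproduced with something like the following (the values here are made up, since the real data comes from elsewhere):

import numpy as np
import pandas as pd

# Hypothetical reproduction: 33620 rows, each cell holding a Python list of 30 floats
n_rows, n_vals = 33620, 30
df = pd.DataFrame({'dlrs_col': [list(np.random.rand(n_vals)) for _ in range(n_rows)]})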
I'm building a 33620x30 array whose rows are the unpacked values from that single DataFrame column. I'm currently doing this as:
np.array(df['dlrs_col'].tolist(), dtype='float64')
This works just fine, but it takes a significant amount of time, especially since I do the same conversion for 6 additional columns of lists. Any ideas on how I can speed this up?
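For what it's worth, this is roughly how I'm timing the single-column conversion (just a sketch; the other 6 list columns are converted the same way in the real script):

import timeit

# Time the list-column -> 2D float-array conversion for one column
n_runs = 10
total = timeit.timeit(lambda: np.array(df['dlrs_col'].tolist(), dtype='float64'), number=n_runs)
print('%.3f s per conversion' % (total / n_runs))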