How to convert a Pandas DataFrame into a 3D NumPy array?

Question

I have a table where the columns are ['datetime', 'sensorid', 'sms-in', 'sms-out', 'call-in', 'call-out'], and there are 10,000 sensors in total. Ideally, there will be 10,000 rows for each timestamp. However, there may be missing rows of sensors for some timestamps (e.g., only 9998 rows).

The table may look like this:

                                sms-in   sms-out   call-in  call-out  
datetime            sensorid                                           
2013-10-31 23:00:00 1         0.223227  0.156787  0.160938  0.052275   
                    2         0.222201  0.147617  0.164946  0.054712   
                    3         0.221109  0.137855  0.169213  0.057306   
                    4         0.226198  0.183349  0.149327  0.045216   
                    5         0.205065  0.175393  0.139139  0.043455   
...                                ...       ...       ...       ...   
2013-11-01 22:50:00 9996      0.695404  0.440369  0.087566  0.310581   
                    9997      0.687958  0.429974  0.085995  0.243143   
                    9998      0.687958  0.429974  0.085995  0.256862   
                    9999      0.894907  0.518741  0.085995  0.230476   
                    10000     1.212911  0.638219  0.085995  0.090769   

[1439982 rows x 4 columns]

Let the last 4 columns ['sms-in', 'sms-out', 'call-in', 'call-out'] be the features of a sensor. Let T and N represent the lengths of the timestamp and sensorid axes, respectively.

How do I convert the DataFrame into a numpy array with the shape (T, N, 4)? I tried a trivial approach that collects the rows iteratively, but it is very inefficient. Is there a Pandas API or a concise way to do this?

Please read the notice on minimal reproducible examples and provide the complete code without any omissions. If it's a MultiIndex, please provide the code that can generate it.
– Panda Kim, Commented Mar 17, 2024 at 11:53
Also check this post: how-to-convert-a-pandas-multiindex-dataframe-into-a-3d-array
– Panda Kim, Commented Mar 17, 2024 at 12:06

patoba · Accepted Answer · 2024-03-17 23:16:52Z

Assuming your DataFrame is called df, its rows are sorted by (datetime, sensorid), and no rows are missing (so it has exactly T × N rows), you can do the following:

array = df.values.reshape(T, N, 4)
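
A plain reshape fails, or silently misaligns sensors, when some (datetime, sensorid) rows are missing, which the question says can happen. One possible workaround, shown here as a minimal sketch and not as part of the original answer, is to reindex onto the full timestamp × sensor grid first so that missing rows become NaN. The toy data, the variable names (time_axis, sensor_axis, grid, dense), and the choice of NaN as the fill value are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 timestamps and 4 sensors stand in for T and N.
features = ["sms-in", "sms-out", "call-in", "call-out"]
times = pd.date_range("2013-10-31 23:00:00", periods=3, freq="10min")
sensors = np.arange(1, 5)
full_index = pd.MultiIndex.from_product(
    [times, sensors], names=["datetime", "sensorid"]
)
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((len(full_index), len(features))),
    index=full_index,
    columns=features,
)

# Drop two rows to mimic sensors that are missing at some timestamps:
# position 1 is (first timestamp, sensor 2), position 11 is (last timestamp, sensor 4).
keep = np.ones(len(df), dtype=bool)
keep[[1, 11]] = False
df = df[keep]

# Build the complete (datetime, sensorid) grid from the values present in the data.
time_axis = df.index.get_level_values("datetime").unique().sort_values()
sensor_axis = df.index.get_level_values("sensorid").unique().sort_values()
grid = pd.MultiIndex.from_product(
    [time_axis, sensor_axis], names=["datetime", "sensorid"]
)

# Reindexing inserts all-NaN rows for the missing pairs and sorts the rest,
# so the row-major reshape lines up with (timestamp, sensor, feature).
dense = df.reindex(grid)
array = dense.to_numpy().reshape(len(time_axis), len(sensor_axis), len(features))

print(array.shape)
```

With this layout, array[t, n] holds the four features of the n-th sensor at the t-th timestamp, and the missing pairs show up as rows of NaN that can be filled or masked afterwards. A sensor that never appears at any timestamp would not be part of sensor_axis; passing an explicit list of all sensor IDs to from_product would cover that case.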
