1

I have a Dataframe of raw data:

df
Out: 
              Date_time   10a   10b   10c   40a   40b   40c  100a  100b  100c
120 2019-02-04 16:00:00  26.7  26.9   NaN  26.7   NaN   NaN  24.9   NaN   NaN
121 2019-02-04 17:00:00  23.4  24.0  23.5  24.3  24.1  24.0  25.1  24.8  25.1
122 2019-02-04 18:00:00  23.1  24.0  23.3  24.3  24.1  24.0  25.1  24.8  25.1
123 2019-02-04 19:00:00  22.8  23.8  22.9  24.3  24.1  24.0  25.1  24.8  25.1
124 2019-02-04 20:00:00   NaN  23.5  22.6  24.3  24.1  24.0  25.1  24.8  25.1

I wish to create a DataFrame containing the 'Date_time' column and several columns of data means. In this instance there will be 3 means for each row, one each for 10, 40, and 100, calculating the mean values for a, b, and c for each of these numbered intervals.

means
Out: 
              Date_time       10cm       40cm  100cm
120 2019-02-04 16:00:00  26.800000  26.700000   24.9
121 2019-02-04 17:00:00  23.633333  24.133333   25.0
122 2019-02-04 18:00:00  23.466667  24.133333   25.0
123 2019-02-04 19:00:00  23.166667  24.133333   25.0
124 2019-02-04 20:00:00  23.050000  24.133333   25.0

I have tried the following (taken from this answer):

means = df['Date_time'].copy()

means['10cm'] = df.loc[:, '10a':'10c'].mean(axis=1)

But this results in all the mean values being clumped together in one cell at the bottom of the 'Date_time' column with '10cm' being given as the cell's index.

means
Out: 
120                                   2019-02-04 16:00:00
121                                   2019-02-04 17:00:00
122                                   2019-02-04 18:00:00
123                                   2019-02-04 19:00:00
124                                   2019-02-04 20:00:00
10cm    120    26.800000
121    23.633333
122    23.46...
Name: Date_time, dtype: object

I believe that this is something to do with means being a Series object rather that a DataFrame object when I copy across the 'Date_time' column, but I'm not sure. Any pointers would be greatly appreciated!

enter image description here

1 Answer 1

2

It was the Series issue. Turns out writing out the question helped me realise the issue! My solution was altering the initial creation of means using to_frame():

means = df['Date_time'].copy().to_frame()

I'll leave the question up in case anyone else is having a similar issue, to save them having to spend time writing it all up!

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.