Calculate mean values from pandas dataframe

Question

I am trying to find a good way to calculate mean values from a dataframe. It contains measured data from an experiment and is imported from an Excel sheet. The columns contain the elapsed time, the electric current and the corresponding voltage.

The current is changed in steps and then held for some time (the current values vary a little, so they are not exactly the same within a step). Now I want to calculate the mean voltage for each current step. Since the voltage takes some time to stabilize after a step, I also want to leave out the first few voltage values after each step.

Currently I am doing this with loops, but I was wondering whether there is a nicer way using the groupby function (or maybe others).

Just say if you need more details or clarification.

Example of data:

          s        [A]       [V]
0       6.0  -0.001420  0.780122
1      12.0  -0.002484  0.783297
2      18.0  -0.001478  0.785870
3      24.0  -0.001256  0.793559
4      30.0  -0.001167  0.806086
5      36.0  -0.000982  0.815364
6      42.0  -0.003038  0.825018
7      48.0  -0.001174  0.831739
8      54.0   0.000478  0.838861
9      60.0  -0.001330  0.846086
10     66.0  -0.001456  0.851556
11     72.0   0.000764  0.855950
12     78.0  -0.000916  0.859778
13     84.0  -0.000916  0.859778
14     90.0  -0.001445  0.863569
15     96.0  -0.000287  0.864303
16    102.0   0.000056  0.865080
17    108.0  -0.001119  0.865642
18    114.0  -0.000843  0.866434
19    120.0  -0.000997  0.866809
20    126.0  -0.001243  0.866964
21    132.0  -0.002238  0.867180
22    138.0  -0.001015  0.867177
23    144.0  -0.000604  0.867505
24    150.0   0.000507  0.867571
25    156.0  -0.001569  0.867525
26    162.0  -0.001569  0.867525
27    168.0  -0.001131  0.866756
28    174.0  -0.001567  0.866884
29    180.0  -0.002645  0.867240
..      ...        ...       ...
242  1708.0  24.703866  0.288902
243  1714.0  26.469208  0.219226
244  1720.0  26.468838  0.250437
245  1726.0  26.468681  0.254972
246  1732.0  26.468173  0.271525
247  1738.0  26.468260  0.247282
248  1744.0  26.467666  0.296894
249  1750.0  26.468085  0.247300
250  1756.0  26.468085  0.247300
251  1762.0  26.467808  0.261096
252  1768.0  26.467958  0.259615
253  1774.0  26.467828  0.260871
254  1780.0  28.232325  0.185291
255  1786.0  28.231697  0.197642
256  1792.0  28.231170  0.172802
257  1798.0  28.231103  0.170685
258  1804.0  28.229453  0.184009
259  1810.0  28.230816  0.181833
260  1816.0  28.230913  0.188348
261  1822.0  28.230609  0.178440
262  1828.0  28.231144  0.168507
263  1834.0  28.231144  0.168507
264  1840.0   8.813723  0.641954
265  1846.0   8.814301  0.652373
266  1852.0   8.818517  0.651234
267  1858.0   8.820255  0.637536
268  1864.0   8.821443  0.628136
269  1870.0   8.823643  0.636616
270  1876.0   8.823297  0.635422
271  1882.0   8.823575  0.622253

Output:

              s        [A]       [V]
0    303.000000  -0.000982  0.857416
1    636.000000   0.879220  0.792504
2    699.000000   1.759356  0.752446
3    759.000000   3.519479  0.707161
4    816.000000   5.278372  0.669020
5    876.000000   7.064800  0.637848
6    939.000000   8.828799  0.611196
7    999.000000  10.593054  0.584402
8   1115.333333  12.357359  0.556127
9   1352.000000  14.117167  0.528826
10  1382.000000  15.882287  0.498577
11  1439.000000  17.646748  0.468379
12  1502.000000  19.410817  0.437342
13  1562.666667  21.175572  0.402381
14  1621.000000  22.939826  0.365724
15  1681.000000  24.704600  0.317134
16  1744.000000  26.468235  0.256047
17  1807.000000  28.231037  0.179606
18  1861.000000   8.819844  0.638190

The current approach:

import math
import pandas as pd

df = df[['s', '[A]', '[V]']]

# Loop over the rows and split the data wherever the current changes.
# Note: index is used as a position in iloc, so this relies on the default 0..n-1 index.
b = df['[A]'].iloc[0]
start = 0
steps = []  # renamed from "list" so the built-in list is not shadowed
for index, row in df.iterrows():
    if not math.isclose(row['[A]'], b, abs_tol=1e-02):
        b = row['[A]']
        steps.append(df.iloc[start:index])
        start = index
steps.append(df.iloc[start:])

# Drop the first few points after each current change
list_b = []
for l in steps:
    list_b.append(l.iloc[3:])

# Calculate the mean values for each current step
list_c = []
for l in list_b:
    list_c.append(l.mean())

result = pd.DataFrame(list_c)

Possible duplicate of Calculating means for multiple columns, in different rows in pandas
– georgeawg, Commented Aug 7, 2018 at 15:32

Biomage · Accepted Answer · 2018-05-16 09:23:42Z

Does this help?

df.groupby(['Columnname', 'Columnname2']).mean()

You may need to create intermediate dataframes for each step. Can you provide an example of the output you want?

3 Comments

clel Over a year ago

I have added examples of input and output. I think it might be problematic that the current values of each step are not exactly the same, but vary a bit.

Biomage Over a year ago

Sorry if I am misunderstanding this, but you want a final column with a mean of voltage for each row, correct?

clel Over a year ago

No, I want a new dataset containing the (mean) current of a step and the corresponding mean voltage of each step. Also I don't know what a mean voltage of a row would be. There is only one voltage value per row.
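
For reference, here is a minimal sketch of how the loop-based approach from the question could be expressed with groupby. It is not taken from the answer above. It assumes the column names 's', '[A]' and '[V]' from the example data, a default 0..n-1 index, and the same tolerance (1e-2) and number of skipped rows (3) as the loop. The function name step_means and its parameters are hypothetical. One difference to be aware of: the loop compares each row with the first row of the current step, while this sketch compares each row with the previous row, so a slow drift in the current would be treated differently. Steps with no rows left after skipping are dropped here, whereas the loop would give a row of NaN for them.

```python
import pandas as pd


def step_means(df, current_col="[A]", tol=1e-2, skip=3):
    """Mean of every column per current step, ignoring the first `skip` rows of each step.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # A new step starts where the current jumps by more than `tol`
    # compared with the previous row (the first row has a NaN diff, which compares False).
    new_step = df[current_col].diff().abs() > tol

    # Running count of jumps gives each row a step number: 0, 1, 2, ...
    step_id = new_step.cumsum().rename("step")

    # Position of each row inside its step: 0, 1, 2, ...
    pos = df.groupby(step_id).cumcount()

    # Keep only the rows after the settling period, then average per step.
    settled = df.assign(step=step_id)[pos >= skip]
    return settled.groupby("step").mean().reset_index(drop=True)


# Usage, assuming df holds the columns 's', '[A]' and '[V]':
# result = step_means(df[["s", "[A]", "[V]"]])
```

The result has one row per step with the mean time, mean current and mean voltage, which is the shape of the output shown in the question.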
