Suppose I have a large dataframe like this:

A                     B      C
27/6/2017 4:00:00   928.04  4.83
27/6/2017 4:20:00   927.71  4.61
27/6/2017 4:40:00   928.22  4.49
27/6/2017 5:00:00   898.74  3.81
27/6/2017 5:20:00   895.16  3.55
27/6/2017 5:40:00   895.05  3.4
27/6/2017 6:00:00   895.68  3.3
27/6/2017 16:20:00  662.45  1.52
27/6/2017 16:40:00  639.98  1.48
27/6/2017 17:40:00  732.02  1.79
27/6/2017 18:00:00  722.63  1.98
27/6/2017 18:20:00  713.26  1.79
27/6/2017 18:40:00  705.8   1.54
27/6/2017 19:00:00  652.1   1.51
27/6/2017 19:20:00  638.58  1.68
27/6/2017 19:40:00  633.14  1.66
27/6/2017 20:00:00  654.66  1.45

I want to split the dataframe based on the time difference between consecutive rows, i.e. whenever the gap between two consecutive timestamps is more than 4 hours, a new dataframe should start. Then I want to split each of those dataframes into subgroups based on the range of values in B. Finally, I want to store every group and subgroup in its own CSV file.

Desired output:

Group1:

A                     B      C
27/6/2017 4:00:00   928.04  4.83
27/6/2017 4:20:00   927.71  4.61
27/6/2017 4:40:00   928.22  4.49
27/6/2017 5:00:00   898.74  3.81
27/6/2017 5:20:00   895.16  3.55
27/6/2017 5:40:00   895.05  3.4
27/6/2017 6:00:00   895.68  3.3

Group2:

A                     B      C
27/6/2017 16:20:00  662.45  1.52
27/6/2017 16:40:00  639.98  1.48
27/6/2017 17:40:00  732.02  1.79
27/6/2017 18:00:00  722.63  1.98
27/6/2017 18:20:00  713.26  1.79
27/6/2017 18:40:00  705.8   1.54
27/6/2017 19:00:00  652.1   1.51
27/6/2017 19:20:00  638.58  1.68
27/6/2017 19:40:00  633.14  1.66
27/6/2017 20:00:00  654.66  1.45

Zones:

Group1 Zone1:

A                     B      C
27/6/2017 4:00:00   928.04  4.83
27/6/2017 4:20:00   927.71  4.61
27/6/2017 4:40:00   928.22  4.49

Group1 Zone2:

A                     B      C
27/6/2017 5:00:00   898.74  3.81
27/6/2017 5:20:00   895.16  3.55
27/6/2017 5:40:00   895.05  3.4
27/6/2017 6:00:00   895.68  3.3

And so on for the remaining groups.

I have tried some logic to achieve this, but I couldn't get it to work.

Code:

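from pandas import DataFrame, Series, concat
from scipy.stats import linregress

# df is the full dataframe; its "Time" column must already be datetime
dfs = df
time_diff = dfs["Time"].diff()

zones = []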

zone = (dfs["Time"] >= (dfs["Time"].shift() + time_diff[1]*12)).cumsum()
zone_grp = dfs.groupby(zone)

xyz = []
for k,g in zone_grp:
    if len(g) >= 30:
        zones.append(g)
    else:
        pass
for m in range(len(zones)):
    zone_df = DataFrame(zones[m])
    x = range(len(zone_df))
    y = zone_df["T401FN1VT4000"]

    # linregress returns a result object; keep only the slope and intercept
    fit = linregress(x, y)
    abc = DataFrame({"Slope": [fit.slope], "Intercept": [fit.intercept]})
    xyz.append(abc)
    zone_df.to_csv("Zone_%s.csv" %m, index = False)

xyz = concat(xyz).reset_index()
del xyz["index"]
xyz["Zone"] = xyz.index
xyz = xyz.set_index("Zone")
xyz.to_csv("Coefficients.csv", index = True)

Please help me split the dataframe based on the time difference in a better way, and store the groups and subgroups in CSV files with different names.

Any help would be appreciated.

1 Answer

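You could use diff and pd.Timedelta for the first-level grouping, and df.B // x * x to bin B into ranges of width x (x = 100 below). This assumes A has already been parsed as datetime, e.g. with pd.to_datetime(df.A, dayfirst=True).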

# Key 1: running count of gaps longer than 4 hours; key 2: B floored to a multiple of 100
grps = [(df.A.diff() > pd.Timedelta(hours=4)).cumsum(), df.B // 100 * 100]
for i, g in df.groupby(grps):
    g.to_csv('{}_{}.csv'.format(*i))
    print(g)

                    A       B     C
3 2017-06-27 05:00:00  898.74  3.81
4 2017-06-27 05:20:00  895.16  3.55
5 2017-06-27 05:40:00  895.05  3.40
6 2017-06-27 06:00:00  895.68  3.30 

                    A       B     C
0 2017-06-27 04:00:00  928.04  4.83
1 2017-06-27 04:20:00  927.71  4.61
2 2017-06-27 04:40:00  928.22  4.49 

                     A       B     C
7  2017-06-27 16:20:00  662.45  1.52
8  2017-06-27 16:40:00  639.98  1.48
13 2017-06-27 19:00:00  652.10  1.51
14 2017-06-27 19:20:00  638.58  1.68
15 2017-06-27 19:40:00  633.14  1.66
16 2017-06-27 20:00:00  654.66  1.45 

                     A       B     C
9  2017-06-27 17:40:00  732.02  1.79
10 2017-06-27 18:00:00  722.63  1.98
11 2017-06-27 18:20:00  713.26  1.79
12 2017-06-27 18:40:00  705.80  1.54 
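
If the files should follow the Group/Zone naming from the question, a minimal sketch along the same lines might look like the following. The sample rows are re-created inline so the snippet runs on its own. The names bin_width, out_dir and written, and the GroupN_ZoneM.csv file naming, are my own assumptions and not something given in the question. Files go to a temporary directory here only to keep the example free of side effects.

```python
import os
import tempfile

import pandas as pd

# Sample rows copied from the question.
rows = [
    ("27/6/2017 4:00:00", 928.04, 4.83),
    ("27/6/2017 4:20:00", 927.71, 4.61),
    ("27/6/2017 4:40:00", 928.22, 4.49),
    ("27/6/2017 5:00:00", 898.74, 3.81),
    ("27/6/2017 5:20:00", 895.16, 3.55),
    ("27/6/2017 5:40:00", 895.05, 3.4),
    ("27/6/2017 6:00:00", 895.68, 3.3),
    ("27/6/2017 16:20:00", 662.45, 1.52),
    ("27/6/2017 16:40:00", 639.98, 1.48),
    ("27/6/2017 17:40:00", 732.02, 1.79),
    ("27/6/2017 18:00:00", 722.63, 1.98),
    ("27/6/2017 18:20:00", 713.26, 1.79),
    ("27/6/2017 18:40:00", 705.8, 1.54),
    ("27/6/2017 19:00:00", 652.1, 1.51),
    ("27/6/2017 19:20:00", 638.58, 1.68),
    ("27/6/2017 19:40:00", 633.14, 1.66),
    ("27/6/2017 20:00:00", 654.66, 1.45),
]
df = pd.DataFrame(rows, columns=["A", "B", "C"])
# The dates are day-first, so spell the format out explicitly.
df["A"] = pd.to_datetime(df["A"], format="%d/%m/%Y %H:%M:%S")

bin_width = 100              # assumed width of each B range
out_dir = tempfile.mkdtemp() # assumed output location
written = {}                 # file stem -> number of rows written

# A new group starts whenever the gap to the previous row exceeds 4 hours.
group_id = (df["A"].diff() > pd.Timedelta(hours=4)).cumsum() + 1

for gid, group in df.groupby(group_id):
    group_name = "Group{}".format(gid)
    group.to_csv(os.path.join(out_dir, group_name + ".csv"), index=False)
    written[group_name] = len(group)

    # Floor B to a multiple of bin_width; sort=False numbers the zones
    # in order of first appearance within the group.
    bins = group["B"] // bin_width * bin_width
    for zid, (_, zone) in enumerate(group.groupby(bins, sort=False), start=1):
        zone_name = "{}_Zone{}".format(group_name, zid)
        zone.to_csv(os.path.join(out_dir, zone_name + ".csv"), index=False)
        written[zone_name] = len(zone)

print(written)
```

On the sample data this writes Group1.csv and Group2.csv plus two zone files for each group. Note that a zone here is a B range, not a contiguous run of rows: in Group2 the rows at 16:20 and 16:40 land in the same zone as the rows from 19:00 onward, because all of them fall in the 600 to 700 range.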