0

How could I join the rows based on the resume column?
I mean, go joining the rows until in resume column there is a 1.
For joined rows I want to use an aggregate function for each column, something like that:

{ 'tunein': 'first', 'tuneout': 'last', 'watching(minute)': 'sum' }

Example: from this dataframe

  account_id    asset_id     tunein      tuneout resume watching(minute)
1   61745         2967      12:42:00    13:01:00    0   19.0
2   61745         2967      15:48:00    15:51:00    1   3.0
3   61745         2967      15:52:00    15:56:00    1   4.0
4   61745         2967      15:57:00    16:23:00    0   26.0
5   61745         2967      21:06:00    21:10:00    0   4.0
6   61745         2967      22:17:00    22:37:00    1   20.0
7   61745         2967      23:55:00    23:58:00    1   3.0

get this:

  account_id    asset_id     tunein      tuneout resume watching(minute)
1   61745         2967      12:42:00    15:56:00    0   26.0
2   61745         2967      15:57:00    16:23:00    0   26.0
3   61745         2967      21:06:00    23:58:00    0   27.0

1 Answer 1

1

Let's create groups with cumsum based on where there are 0 in the resume column:

df = (
    df.groupby(df['resume'].eq(0).cumsum())
        .agg({'account_id': 'first', 'asset_id': 'first',
              'tunein': 'first', 'tuneout': 'last', 'resume': 'first',
              'watching(minute)': 'sum'})
        .rename_axis(None)
)

The dictionary can be programmatically created as well:

d = {c: 'first' for c in df.columns}  # default 'first'
d['tuneout'] = 'last'  # set specific columns
d['watching(minute)'] = 'sum'
df = df.groupby(df['resume'].eq(0).cumsum()).agg(d).rename_axis(None)

Both options produce df:

   account_id  asset_id    tunein   tuneout  resume  watching(minute)
1       61745      2967  12:42:00  15:56:00       0              26.0
2       61745      2967  15:57:00  16:23:00       0              26.0
3       61745      2967  21:06:00  23:58:00       0              27.0

How groups are created:

df['resume'].eq(0).cumsum()
0    1
1    1
2    1
3    2
4    3
5    3
6    3
Name: resume, dtype: int32
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.