I have a DataFrame like this:

    id             Date
    546451991   2018-07-31 00:00:00
    546451991   2018-08-02 00:00:00
    5441440119  2018-08-13 00:00:00
    5441440119  2018-08-13 00:00:00
    5441440119  2018-08-14 00:00:00
    5344265358  2018-07-13 00:00:00
    5344265358  2018-07-15 00:00:00
    5441438884  2018-07-19 00:00:00

I want to group by 'id', sort each group by date, and then add a column containing the date of the next row.

For example, I want output like this:

    id          Date                 Date1
    546451991   2018-07-31 00:00:00  2018-08-02 00:00:00
    546451991   2018-08-02 00:00:00  NULL
    5441440119  2018-08-13 00:00:00  2018-08-14 00:00:00
    5441440119  2018-08-14 00:00:00  2018-08-15 00:00:00
    5441440119  2018-08-15 00:00:00  NULL
    5344265358  2018-07-13 00:00:00  2018-07-15 00:00:00
    5344265358  2018-07-15 00:00:00  NULL
    5441438884  2018-07-19 00:00:00  NULL

I have tried df.groupby('id')['Date'].sort_values(), but it does not work.

  • I think you mean "next row" not "next column"? Commented Sep 8, 2018 at 10:02
  • You're not clear about what kind of sorting you would like to do in your df, is it ascending or descending? Commented Sep 8, 2018 at 10:10
  • Can you please post a pre-made DF? I'm faffing around now for quite a while trying to get this into a testable format from the clipboard, when I should be trying to answer the question Commented Sep 8, 2018 at 10:13

2 Answers

df['Date1'] = df.groupby('id')['Date'].apply(lambda x: x.sort_values().shift(-1))

Out:

                      Date          id                Date1
    0  2018-07-31 00:00:00   546451991  2018-08-02 00:00:00
    1  2018-08-02 00:00:00   546451991                  NaN
    2  2018-08-13 00:00:00  5441440119  2018-08-13 00:00:00
    3  2018-08-13 00:00:00  5441440119  2018-08-14 00:00:00
    4  2018-08-14 00:00:00  5441440119                  NaN
    5  2018-07-13 00:00:00  5344265358  2018-07-15 00:00:00
    6  2018-07-15 00:00:00  5344265358                  NaN
    7  2018-07-19 00:00:00  5441438884                  NaN

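Edit, following Sandeep's suggestion: if the rows are already sorted by date within each id, a grouped shift on its own is enough.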

df['Date1'] = df.groupby('id')['Date'].shift(-1)
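When the rows are not already in date order, the sort has to happen before the grouped shift. Below is a minimal sketch on the question's sample data. It assumes Date may be stored as strings, so it converts the column with pd.to_datetime first. Sorting by both id and Date also reorders the groups by id, which may or may not be wanted.

```python
import pandas as pd

# Sample data copied from the question (time part omitted for brevity).
df = pd.DataFrame({
    "id": [546451991, 546451991, 5441440119, 5441440119, 5441440119,
           5344265358, 5344265358, 5441438884],
    "Date": ["2018-07-31", "2018-08-02", "2018-08-13", "2018-08-13",
             "2018-08-14", "2018-07-13", "2018-07-15", "2018-07-19"],
})

# Assumption: Date arrives as text, so parse it to get a real date ordering.
df["Date"] = pd.to_datetime(df["Date"])

# Put the rows in date order inside each id.
df = df.sort_values(["id", "Date"])

# shift(-1) pulls the next row's date up within each id group;
# the last row of every group has no successor and becomes NaT.
df["Date1"] = df.groupby("id")["Date"].shift(-1)
print(df)
```

The missing values show up as NaT rather than NULL or NaN because the column holds datetimes.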

5 Comments

I tried your code, but it gives me the following error: TypeError: incompatible index of inserted column with frame index
I checked the code again and it gives the same results. Are you using id as the index?
No, but my index values are not contiguous; some are missing, e.g. 1, 2, 4, 7, 8, 9, 10.
Please reset the index: df.reset_index(drop=True, inplace=True)
Yes @SandeepKadapa, I didn't think of that ;-)
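
The index point raised in these comments can be sketched as follows. The frame and its gapped index are hypothetical, made up to mirror the description above. The grouped shift returns a Series with the same index labels as the input, so the assignment lines up even when index values are missing. Resetting the index, as suggested, is an alternative way to get clean labels. Whether the reported TypeError came from the apply variant is not confirmed in the thread.

```python
import pandas as pd

# Hypothetical frame whose index has gaps, as described in the comments.
df = pd.DataFrame(
    {
        "id": [1, 1, 2, 2, 2],
        "Date": pd.to_datetime(
            ["2018-07-31", "2018-08-02", "2018-08-13", "2018-08-14", "2018-08-15"]
        ),
    },
    index=[1, 2, 4, 7, 8],
)

shifted = df.groupby("id")["Date"].shift(-1)

# The result keeps the original labels, so assignment aligns row by row.
assert shifted.index.equals(df.index)
df["Date1"] = shifted

# Alternatively, rebuild a clean 0..n-1 index as suggested in the comments.
df = df.reset_index(drop=True)
print(df)
```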

This is probably what you're looking for. While @Naga Kiran's answer does it in a one-liner, I'm breaking it down step by step.

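import pandas as pd

# Sample data with repeated ids so that the grouping has a visible effect
df = pd.DataFrame({"id": [1, 1, 2, 2], "Date": ["2018-07-01", "2018-08-01", "2018-09-02", "2018-10-03"]})
df["Date"] = pd.to_datetime(df["Date"])
# Sort by id, then by Date (ascending), so the following row holds the next date
newdf = df.sort_values(["id", "Date"])
# Shift within each id; the last row of every group gets NaT
newdf["Date1"] = newdf.groupby("id")["Date"].shift(-1)
newdf

I first converted Date to datetime and sorted the dataframe by id and Date, then added Date1 using groupby("id") with shift(-1), which moves each group's Date values up by one row, so the last row of every id has no next date.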

Hope this helps.
