2

Using Pandas I've invoked groupby on my dataframe and obtained the following:

>>>grouped = df.groupby(['cid'])
for key, gr in grouped:
        print(key)
        print(gr)
Out: cid  price
     121  12
     121  10
     121  9

I want to have each group pivoted like:

cid price1 price2 price3
121     12     10      9

What is the correct way to do this with Pandas?

6
  • Are you after one large DataFrame of cid with columns of price1-priceN or just the prices per each group? Commented Jun 10, 2015 at 13:16
  • The original dataframe has two columns: cids and prices Commented Jun 10, 2015 at 13:34
  • I get that part. Do you want the prices per group, or per the whole frame? eg... if the longest group has 1000 prices, do you want a dataframe of 1000 columns with NaNs for the cids that only had 5/6/whatever prices, or are you just wanting it on a group by group basis... Commented Jun 10, 2015 at 13:37
  • 1
    It's good to have small sample datasets but this is actually too small. Would be good to have at least 2 different values for 'cid'. Regardless, you'll likely want to use either stack() or pivot(). It would be nice to see what you have already tried, if anything. Commented Jun 10, 2015 at 14:36
  • 1
    Agreeing with and extending JohnE's comment above - you need to show raw output for what is in df for us to help you here. I can't see exactly how that Out can come from the code you post - it should print the value of key (i.e. 121) first. Make a dataframe with two cid values, show the output of print(df) for that and show exactly what output you want for it. Commented Jun 10, 2015 at 15:09

2 Answers 2

5

Assuming you have a frame looking like

>>> df = pd.DataFrame({"cid": np.arange(64)//8, "price": np.arange(64)})
>>> df.head()
   cid  price
0    0      0
1    0      1
2    0      2
3    0      3
4    0      4

Then I think you can get what you want by combining groupby and pivot:

df = pd.DataFrame({"cid": np.arange(64)//8, "price": np.arange(64)})
df["num"] = df.groupby("cid")["price"].cumcount() + 1
pivoted = df.pivot(index="cid", columns="num", values="price")
pivoted.columns = "price" + pivoted.columns.astype(str)
pivoted = pivoted.reset_index()

which gives

>>> pivoted
   cid  price1  price2  price3  price4  price5  price6  price7  price8
0    0       0       1       2       3       4       5       6       7
1    1       8       9      10      11      12      13      14      15
2    2      16      17      18      19      20      21      22      23
3    3      24      25      26      27      28      29      30      31
4    4      32      33      34      35      36      37      38      39
5    5      40      41      42      43      44      45      46      47
6    6      48      49      50      51      52      53      54      55
7    7      56      57      58      59      60      61      62      63

Aside: sticking numbers after the end of strings, e.g. "price5", is usually not a good idea. You can't really work with them, they don't sort the way you'd expect, etc.


First, we create a column showing what index something is in the price:

>>> df["num"] = df.groupby("cid")["price"].cumcount() + 1
>>> df.head(10)
   cid  price  num
0    0      0    1
1    0      1    2
2    0      2    3
[etc.]
7    0      7    8
8    1      8    1
9    1      9    2

Then we pivot:

>>> pivoted = df.pivot(index="cid", columns="num", values="price")
>>> pivoted
num   1   2   3   4   5   6   7   8
cid                                
0     0   1   2   3   4   5   6   7
1     8   9  10  11  12  13  14  15
2    16  17  18  19  20  21  22  23
3    24  25  26  27  28  29  30  31
4    32  33  34  35  36  37  38  39
5    40  41  42  43  44  45  46  47
6    48  49  50  51  52  53  54  55
7    56  57  58  59  60  61  62  63

Then we fix the columns:

>>> pivoted.columns = "price" + pivoted.columns.astype(str)
>>> pivoted
     price1  price2  price3  price4  price5  price6  price7  price8
cid                                                                
0         0       1       2       3       4       5       6       7
1         8       9      10      11      12      13      14      15
2        16      17      18      19      20      21      22      23
3        24      25      26      27      28      29      30      31
4        32      33      34      35      36      37      38      39
5        40      41      42      43      44      45      46      47
6        48      49      50      51      52      53      54      55
7        56      57      58      59      60      61      62      63

And finally we reset the index:

>>> pivoted = pivoted.reset_index()
>>> pivoted
   cid  price1  price2  price3  price4  price5  price6  price7  price8
0    0       0       1       2       3       4       5       6       7
1    1       8       9      10      11      12      13      14      15
2    2      16      17      18      19      20      21      22      23
3    3      24      25      26      27      28      29      30      31
4    4      32      33      34      35      36      37      38      39
5    5      40      41      42      43      44      45      46      47
6    6      48      49      50      51      52      53      54      55
7    7      56      57      58      59      60      61      62      63
Sign up to request clarification or add additional context in comments.

Comments

2

Here's a quick variation on @DSM's approach, using unstack(). I'll borrow @DSM's sample data, to keep it easy to compare results from pivot() vs unstack():

>>> df = pd.DataFrame({"cid": np.arange(64)//8, "price": np.arange(64)})
>>> df['num'] = df.groupby('cid').cumcount()
>>> df.set_index(['cid','num']).unstack()

    price                            
num     0   1   2   3   4   5   6   7
cid                                  
0       0   1   2   3   4   5   6   7
1       8   9  10  11  12  13  14  15
2      16  17  18  19  20  21  22  23
3      24  25  26  27  28  29  30  31
4      32  33  34  35  36  37  38  39
5      40  41  42  43  44  45  46  47
6      48  49  50  51  52  53  54  55
7      56  57  58  59  60  61  62  63

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.