Given the following toy dataset, which has duplicated price and quantity rows for each city:

  city      item value
0   bj     price    12
1   bj  quantity    15
2   bj     price    12
3   bj  quantity    15
4   bj     level     a
5   sh     price    45
6   sh  quantity    13
7   sh     price    56
8   sh  quantity     7
9   sh     level     b

I want to reshape it into the following dataframe, adding the prefix sell_ to the first price/quantity pair of each city and buy_ to the second pair:

  city  sell_price  sell_quantity  buy_price  buy_quantity level
0   bj          12             15         13            16     a
1   sh          45             13         56             7     b

I have tried with df.set_index(['city', 'item']).unstack().reset_index(), but it raises an error: ValueError: Index contains duplicate entries, cannot reshape.
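For reference, a minimal sketch that rebuilds the toy data and reproduces the error. The dataframe construction is an assumption based on the table above, and the names err and dupes are only illustrative:

```python
import pandas as pd

# Toy data typed in from the table above; 'value' mixes numbers and letters.
df = pd.DataFrame({
    "city": ["bj"] * 5 + ["sh"] * 5,
    "item": ["price", "quantity", "price", "quantity", "level"] * 2,
    "value": [12, 15, 12, 15, "a", 45, 13, 56, 7, "b"],
})

err = None
try:
    # unstack needs each (city, item) pair to appear only once
    df.set_index(["city", "item"]).unstack()
except ValueError as exc:
    err = exc
    print(exc)

# Rows whose (city, item) pair occurs more than once
dupes = df[df.duplicated(["city", "item"], keep=False)]
print(dupes)
```

The second print lists the price and quantity rows, which are the ones that prevent the reshape.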

How could I get the desired output as above? Thanks.

  • So, what do you want to do with the duplicates? Additionally, your output does not match your input dataframe: how does level_a for bj get the value of 13? Commented Aug 5, 2020 at 10:11
  • By adding a prefix or suffix to the column names after reshaping. Commented Aug 5, 2020 at 10:12
  • Maybe we need to use pivot_table or unstack? Commented Aug 5, 2020 at 10:13
  • Sorry, I don't get it. For non-duplicated ones, just keep them; for bj and level, the value is a. Commented Aug 5, 2020 at 10:15

1 Answer

You can prefix the second occurrence of each duplicated (city, item) pair with buy_ and the first occurrence with sell_, changing the values in item before applying your solution:

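import numpy as np

# True for the second occurrence of each (city, item) pair
m1 = df.duplicated(['city', 'item'])
# True for every row whose (city, item) pair occurs more than once
m2 = df.duplicated(['city', 'item'], keep=False)

# second occurrence -> 'buy_', first occurrence -> 'sell_', unique rows -> no prefix
df['item'] = np.where(m1, 'buy_', np.where(m2, 'sell_', '')) + df['item']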

df = (df.set_index(['city', 'item'])['value']
        .unstack()
        .reset_index()
        .rename_axis(None, axis=1))

# reorder the columns to match the desired output
df = df[['city','sell_price','sell_quantity','buy_price','buy_quantity','level']]
print(df)
  city sell_price sell_quantity buy_price buy_quantity level
0   bj         12            15        12           15     a
1   sh         45            13        56            7     b
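
As a self-contained sketch of the same steps, the snippet below rebuilds the toy data from the question and applies the prefixes without modifying df in place. The dataframe construction and the names prefix and wide are assumptions added for illustration. The prefix array is wrapped in a Series so the string concatenation is aligned on the index:

```python
import numpy as np
import pandas as pd

# Toy data typed in from the question; 'value' mixes numbers and letters.
df = pd.DataFrame({
    "city": ["bj"] * 5 + ["sh"] * 5,
    "item": ["price", "quantity", "price", "quantity", "level"] * 2,
    "value": [12, 15, 12, 15, "a", 45, 13, 56, 7, "b"],
})

m1 = df.duplicated(["city", "item"])              # later occurrences only
m2 = df.duplicated(["city", "item"], keep=False)  # every row of a repeated pair

# 'buy_' for the repeat, 'sell_' for the first of a repeated pair, '' otherwise
prefix = pd.Series(np.where(m1, "buy_", np.where(m2, "sell_", "")), index=df.index)

wide = (
    df.assign(item=prefix + df["item"])
      .set_index(["city", "item"])["value"]
      .unstack()
      .reset_index()
      .rename_axis(None, axis=1)
)
wide = wide[["city", "sell_price", "sell_quantity", "buy_price", "buy_quantity", "level"]]
print(wide)
```

Note that m1 is True for every occurrence after the first, so a third row with the same city and item would also be labelled buy_ and unstack would fail again. This approach assumes each pair appears at most twice.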

2 Comments

Maybe it needs to drop item from your final df; is it the index?
Yep, it is the index name. Give me a sec.
