Pandas/Python: list values of one column based on string value of another column

Question

I have a .csv file like this:

Receipt ID	Name	Quantity	Category Type
135135	Croissant	1.0	Food
135135	Cappucino	1.0	Drink
143143	Salad	1.0	Food
154134	Americano	1.0	Drink
178781	Cappucino	1.0	Drink
169071	Muffin	1.0	Food
169071	Latte	1.0	Drink
169071	Brownie	1.0	Food

I want to get the Receipt IDs where the Category Type is "Food".

I've tried a few methods but none of them work. For example,

df1 = df.query('Category Type == Food')['Receipt ID'].unique()

I've also tried setting Category Type as index:

df1 = df.set_index('Category Type').eq('Food')
print (df1.index[df1['Receipt ID']].tolist())

which gave me an empty list.

The Receipt IDs are not necessarily unique in the file, but I want the output to be unique. My final goal is to find the Receipt IDs that contain both "Food" and "Drink". Could anyone please help? Thank you!

MoRe · Accepted Answer · 2022-07-28 23:32:57Z

5

df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().values.tolist()

if you want unique:

df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().unique().astype(int).tolist()

or

df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()
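The `query` attempt from the question most likely fails because the column name contains a space and the string `Food` is not quoted. A minimal sketch of a working version, assuming a pandas release that supports backtick-quoted column names in `query` (0.25 or later); the DataFrame below is rebuilt by hand from the sample data in the question:

```python
import pandas as pd

# Sample data copied from the question (rebuilt by hand for this sketch)
df = pd.DataFrame({
    "Receipt ID": [135135, 135135, 143143, 154134, 178781, 169071, 169071, 169071],
    "Name": ["Croissant", "Cappucino", "Salad", "Americano",
             "Cappucino", "Muffin", "Latte", "Brownie"],
    "Quantity": [1.0] * 8,
    "Category Type": ["Food", "Drink", "Food", "Drink",
                      "Drink", "Food", "Drink", "Food"],
})

# Backticks quote a column name that contains a space;
# the string literal needs its own quotes inside the expression.
food_ids = df.query('`Category Type` == "Food"')["Receipt ID"].unique().tolist()
print(food_ids)
```

Without the backticks, `query` cannot parse `Category Type` as one name, and an unquoted `Food` would be read as a column or variable name rather than a string.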

for all types:

df.groupby('Category Type').agg({'Receipt ID': 'unique'}).to_dict()
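For the final goal in the question (receipts that contain both "Food" and "Drink"), one possible approach is to group by receipt instead of by category and check each receipt's set of categories. This is only a sketch, not something taken from the answers here; the names `wanted`, `cats_per_receipt` and `both` are made up for illustration, and the DataFrame is rebuilt by hand from the sample data:

```python
import pandas as pd

# Sample data copied from the question (only the two columns needed here)
df = pd.DataFrame({
    "Receipt ID": [135135, 135135, 143143, 154134, 178781, 169071, 169071, 169071],
    "Category Type": ["Food", "Drink", "Food", "Drink",
                      "Drink", "Food", "Drink", "Food"],
})

wanted = {"Food", "Drink"}

# One set of category names per receipt
cats_per_receipt = df.groupby("Receipt ID")["Category Type"].apply(set)

# Keep the receipts whose categories include every wanted category
mask = cats_per_receipt.apply(wanted.issubset)
both = cats_per_receipt[mask].index.tolist()
print(both)
```

Because the result is built from the group keys, each Receipt ID appears once, so no extra `unique()` call is needed.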

edited Jul 28, 2022 at 23:32

answered Jul 28, 2022 at 23:27

MoRe


Xin · Accepted Answer · 2022-07-29 06:13:01Z

0


import pandas as pd
from io import StringIO

# Build the sample DataFrame; skip this step if you already have df.
# The columns are tab-separated, written as \t so the separators are explicit.
data_str = """
Receipt ID\tName\tQuantity\tCategory Type
135135\tCroissant\t1.0\tFood
135135\tCappucino\t1.0\tDrink
143143\tSalad\t1.0\tFood
154134\tAmericano\t1.0\tDrink
178781\tCappucino\t1.0\tDrink
169071\tMuffin\t1.0\tFood
169071\tLatte\t1.0\tDrink
169071\tBrownie\t1.0\tFood
"""
io_str = StringIO(data_str)
df = pd.read_csv(io_str, header=0, sep='\t')

# start here
# method 1 
filter_df = df[df['Category Type'] == 'Food']
unique_list = filter_df['Receipt ID'].unique().tolist()
print(unique_list)

# method 2 use loc function
unique_list=df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()
print(unique_list)
# Both methods print: [135135, 143143, 169071]

answered Jul 29, 2022 at 6:13

Xin


2 Comments

qyy Over a year ago

Thank you! I never used StringIO before and am looking it up now

Xin Over a year ago

You can ignore it and deal with your own concerns first

Sanju Halder · Accepted Answer · 2022-07-30 04:51:20Z

0

cond = df['Category Type'] == 'Food'
df[cond]['Receipt ID'].unique().tolist()

answered Jul 30, 2022 at 4:51

Sanju Halder
