I have a .csv file like this:

Receipt ID Name Quantity Category Type
135135 Croissant 1.0 Food
135135 Cappucino 1.0 Drink
143143 Salad 1.0 Food
154134 Americano 1.0 Drink
178781 Cappucino 1.0 Drink
169071 Muffin 1.0 Food
169071 Latte 1.0 Drink
169071 Brownie 1.0 Food

I want to get the Receipt IDs where the Category Type is "Food".

I've tried a few methods but none of them work. For example,

df1 = df.query('Category Type == Food')['Receipt ID'].unique()

I've also tried setting Category Type as index:

df1 = df.set_index('Category Type').eq('Food')
print (df1.index[df1['Receipt ID']].tolist())

which gave me an empty list.

Receipt IDs can repeat in the file, but I want the output to contain each ID only once. The final goal is to find the Receipt IDs that contain both "Food" and "Drink". Could anyone please give me some help? Thank you!

3 Answers

5
df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().values.tolist()

If you want unique values (cast back to int, because where() inserts NaN and turns the column into floats):

df.where(df['Category Type'] == 'Food')['Receipt ID'].dropna().unique().astype(int).tolist()

Or, more directly, with loc:

df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()

For all category types at once:

df.groupby('Category Type').agg({'Receipt ID': 'unique'}).to_dict()
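
For the final goal in the question (Receipt IDs that contain both "Food" and "Drink"), the same groupby idea can be turned around to group by receipt instead. This is only a minimal sketch: the DataFrame is rebuilt by hand from the sample in the question, and the names required, has_both and both_ids are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (rebuilt by hand, not read from the .csv)
df = pd.DataFrame({
    "Receipt ID": [135135, 135135, 143143, 154134, 178781, 169071, 169071, 169071],
    "Name": ["Croissant", "Cappucino", "Salad", "Americano",
             "Cappucino", "Muffin", "Latte", "Brownie"],
    "Quantity": [1.0] * 8,
    "Category Type": ["Food", "Drink", "Food", "Drink",
                      "Drink", "Food", "Drink", "Food"],
})

# Category types every matching receipt must contain (illustrative name)
required = {"Food", "Drink"}

# For each receipt, check whether its category types cover the required set
has_both = df.groupby("Receipt ID")["Category Type"].apply(
    lambda types: required.issubset(set(types))
)

# Keep only the receipts where the check is True
both_ids = has_both[has_both].index.tolist()
print(both_ids)  # [135135, 169071]
```

groupby sorts the keys by default, so the IDs come back in ascending order rather than in file order.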

0

import pandas as pd
from io import StringIO

data_str = """
Receipt ID  Name    Quantity    Category Type
135135  Croissant   1.0 Food
135135  Cappucino   1.0 Drink
143143  Salad   1.0 Food
154134  Americano   1.0 Drink
178781  Cappucino   1.0 Drink
169071  Muffin  1.0 Food
169071  Latte   1.0 Drink
169071  Brownie 1.0 Food
"""
# This part only builds the sample data; skip it if you already have df
io_str = StringIO(data_str)
df = pd.read_csv(io_str, header=0, sep='\t')

# The actual solution starts here
# Method 1: filter the rows with a boolean mask, then take the column
filter_df = df[df['Category Type'] == 'Food']
unique_list = filter_df['Receipt ID'].unique().tolist()
print(unique_list)

# Method 2: filter the rows and select the column in one step with loc
unique_list=df.loc[df['Category Type'] == 'Food', 'Receipt ID'].unique().tolist()
print(unique_list)
# Both methods print:
# [135135, 143143, 169071]


2 Comments

Thank you! I never used StringIO before and am looking it up now
You can ignore that part and focus on your own problem first.
0
cond = df['Category Type'] == 'Food'
df[cond]['Receipt ID'].unique().tolist()
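
The query attempt from the question can probably be made to work as well. The column name contains a space, so it has to be wrapped in backticks, and the value has to be quoted as a string. A minimal sketch, assuming pandas 0.25 or later (where backtick quoting in query is available) and a hand-built copy of the sample data:

```python
import pandas as pd

# Only the two columns needed here, copied from the question's sample
df = pd.DataFrame({
    "Receipt ID": [135135, 135135, 143143, 154134, 178781, 169071, 169071, 169071],
    "Category Type": ["Food", "Drink", "Food", "Drink",
                      "Drink", "Food", "Drink", "Food"],
})

# Backticks around the column name with a space, quotes around the string value
food_ids = df.query("`Category Type` == 'Food'")["Receipt ID"].unique().tolist()
print(food_ids)  # [135135, 143143, 169071]
```

unique() keeps the IDs in order of first appearance, so the result follows the row order of the file.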
