Looping over a string list in Python

Question

I have a list of different colours, all stored as strings.

Preferredcolours = ['red','yellow','green', 'blue']

I have a pandas DataFrame that contains information about cars. One of its columns, DfCar['colour'], holds the colour of each car. I want to create a new column in my DataFrame, named PreferredMathcing, which equals 1 if the value in the colour column matches one of the colours in the list, and 0 otherwise. How can I use a for loop to solve this?

Ideally, I want a result like this:

+=================+============================+
| DfCar['colour'] | DfCar['PreferredMathcing'] |
+=================+============================+
| white           |                          0 |
+-----------------+----------------------------+
| yellow          |                          1 |
+-----------------+----------------------------+
| black           |                          0 |
+-----------------+----------------------------+
| purple          |                          0 |
+-----------------+----------------------------+
| green           |                          1 |
+-----------------+----------------------------+

df['PreferredMatching'] = df.colour.isin(Preferredcolours).astype(int)
– DJK, Commented Jun 24, 2019 at 12:56
@Saif you got a lot of working solutions. If your data is big, I suggest you benchmark them and choose the one that performs best. In my experience, using apply(...) for simple operations can take 20 to 30 times longer than a dedicated function: half an hour instead of 1 minute, or a full day instead of 1 hour.
– Adam.Er8, Commented Jun 24, 2019 at 12:56

Adam.Er8 · Accepted Answer · 2019-06-24 13:08:46Z

You can use .isin(), which returns a Series of True/False values indicating whether each row's value is in a given list. Then use .astype(int) to get 1/0 instead.

Try this:

import pandas as pd
import numpy as np

df = pd.DataFrame.from_dict({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red','yellow','green', 'blue']

df["PreferredMathcing"] = df['colour'].isin(Preferredcolours).astype(int)

print(df)

output:

   colour  PreferredMathcing
0   white                  0
1  yellow                  1
2   black                  0
3  purple                  0
4   green                  1
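
Since the question asks specifically about a for loop, here is a minimal sketch of the explicit-loop equivalent, shown only for comparison with the .isin() version. It assumes the same sample data as above; the names flags and vectorised are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red', 'yellow', 'green', 'blue']

# Explicit loop: build the 1/0 flags one value at a time.
flags = []
for colour in df['colour']:
    flags.append(1 if colour in Preferredcolours else 0)
df['PreferredMathcing'] = flags

# The vectorised form produces the same column without a Python-level loop.
vectorised = df['colour'].isin(Preferredcolours).astype(int)

print(df['PreferredMathcing'].tolist())  # [0, 1, 0, 0, 1]
```

The loop is easier to follow step by step, but on large frames the vectorised call is usually the better choice.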

NOTE:

Choosing a solution built on a pure library function will likely outperform one that uses apply with custom Python logic.

Benchmarking the two against each other on my machine suggests .isin() is almost 8 times faster:

with '.isin()': 1.0591506958007812
with '.apply()': 8.234664678573608
ratio: 7.774780974248154
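
The benchmark script itself is not shown above, so the following is only a sketch of how such a comparison could be set up, not the code that produced those numbers. The data size (100,000 rows), the random seed, the repeat count, and the helper names with_isin and with_apply are all assumptions made for illustration; timings will differ from machine to machine.

```python
import timeit

import numpy as np
import pandas as pd

Preferredcolours = ['red', 'yellow', 'green', 'blue']
all_colours = ['white', 'yellow', 'black', 'purple', 'green']

# Hypothetical test data: 100,000 randomly chosen colours.
rng = np.random.default_rng(0)
df = pd.DataFrame({'colour': rng.choice(all_colours, size=100_000)})


def with_isin():
    # Vectorised membership test, then True/False -> 1/0.
    return df['colour'].isin(Preferredcolours).astype(int)


def with_apply():
    # Same logic, but evaluated one element at a time in Python.
    return df['colour'].apply(lambda x: 1 if x in Preferredcolours else 0)


# Both approaches must agree before their timings are worth comparing.
assert (with_isin() == with_apply()).all()

t_isin = timeit.timeit(with_isin, number=10)
t_apply = timeit.timeit(with_apply, number=10)
print("with '.isin()':", t_isin)
print("with '.apply()':", t_apply)
print("ratio:", t_apply / t_isin)
```

Checking that both functions return the same values first guards against timing two things that are not actually equivalent.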

Suresh Mali · Accepted Answer · 2019-06-24 12:48:31Z

The following will give you the desired output:

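# Return 1 if the row's colour is in the preferred list, otherwise 0.
def check_colour(x, Preferredcolours):
    return 1 if x['colour'] in Preferredcolours else 0

# axis=1 passes each row to check_colour; args supplies the extra argument.
DfCar['PreferredMathcing'] = DfCar.apply(check_colour, args=(Preferredcolours,), axis=1)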

answered Jun 24, 2019 at 12:48

Wytamma Wirth · Accepted Answer · 2019-06-24 12:52:27Z

You can use np.where like below:

import pandas as pd
import numpy as np

DfCar = pd.DataFrame.from_dict({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red','yellow','green', 'blue']

DfCar['PreferredMathcing'] = np.where(DfCar['colour'].isin(Preferredcolours), 1, 0)
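
One reason to reach for np.where rather than .astype(int) is that the two output values need not be 1 and 0. As a small sketch under that assumption, the example below writes text labels instead; the column name PreferredLabel and the labels 'preferred' and 'other' are hypothetical and not part of the question.

```python
import numpy as np
import pandas as pd

DfCar = pd.DataFrame({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red', 'yellow', 'green', 'blue']

# Boolean mask: True where the colour is in the preferred list.
is_preferred = DfCar['colour'].isin(Preferredcolours)

# np.where picks the second argument where the mask is True, else the third.
# 'PreferredLabel' is a hypothetical column name used only for illustration.
DfCar['PreferredLabel'] = np.where(is_preferred, 'preferred', 'other')

print(DfCar['PreferredLabel'].tolist())
# ['other', 'preferred', 'other', 'other', 'preferred']
```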

edited Jun 24, 2019 at 12:52

answered Jun 24, 2019 at 12:46

dustin-we · Accepted Answer · 2019-06-24 12:49:27Z

Assuming DfCar is your DataFrame:

Preferredcolours = ['red','yellow','green', 'blue']
DfCar['PreferredMatching'] = DfCar['colour'].apply(lambda x: x in Preferredcolours)

This applies the lambda function to every element of your "colour" column: it checks whether the value is in Preferredcolours and returns True or False.
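
The question asks for 1/0 rather than True/False, so a possible follow-up step is to convert the boolean result with .astype(int). A minimal sketch, assuming the sample colours from the question; the intermediate name matches is introduced here for illustration.

```python
import pandas as pd

DfCar = pd.DataFrame({'colour': ['white', 'yellow', 'black', 'purple', 'green']})
Preferredcolours = ['red', 'yellow', 'green', 'blue']

# apply() with the lambda yields a boolean Series...
matches = DfCar['colour'].apply(lambda x: x in Preferredcolours)

# ...and astype(int) maps True/False to 1/0.
DfCar['PreferredMatching'] = matches.astype(int)

print(DfCar['PreferredMatching'].tolist())  # [0, 1, 0, 0, 1]
```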

answered Jun 24, 2019 at 12:49
