Python PANDAS: Applying a function to a dataframe, with arguments defined within dataframe

Question

I have a dataframe with headers 'Category', 'Factor1', 'Factor2', 'Factor3', 'Factor4', 'UseFactorA', 'UseFactorB'.

The values of 'UseFactorA' and 'UseFactorB' are each one of the strings ['Factor1', 'Factor2', 'Factor3', 'Factor4'], keyed on the value in 'Category'.

I want to generate a column, 'Result', which equals dataframe[UseFactorA]/dataframe[UseFactorB]

Take the below dataframe as an example:

[Category] [Factor1] [Factor2] [Factor3] [Factor4] [UseFactorA] [UseFactorB]
     A         1        2         5           8     'Factor1'    'Factor3'
     B         2        7         4           2     'Factor3'    'Factor1'

The 'Result' series should be [0.2, 2] (1/5 for row A, 4/2 for row B).

However, I cannot figure out how to use the values of UseFactorA and UseFactorB as column lookups to make this happen. If the columns to use were fixed, I would just write

df['Result'] = df['Factor1']/df['Factor2']

But when I try

df['Result'] = df[df['UseFactorA']]/df[df['UseFactorB']]

I get the error

ValueError: Wrong number of items passed 3842, placement implies 1

Is there a method for doing what I am trying here?

it's-yer-boy-chet · Accepted Answer · 2019-04-11 00:01:24Z


Probably not the prettiest solution (because of the iterrows), but what comes to mind is to iterate through the sets of factors and set the 'Result' value at each index:

for i, factors in df[['UseFactorA', 'UseFactorB']].iterrows():
    # Read both operands from row i so that a single value is assigned to the cell
    df.loc[i, 'Result'] = df.loc[i, factors['UseFactorA']] / df.loc[i, factors['UseFactorB']]
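
If the row-by-row loop turns out to be slow on a large dataframe, one alternative (not part of the original answer) is to look up the values with NumPy integer indexing. This is only a sketch: the sample data comes from the question's example table, and the `factor_cols` list is an assumption about which columns can be named in 'UseFactorA' and 'UseFactorB'.

```python
import numpy as np
import pandas as pd

# Sample data taken from the question's example table.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

# Assumption: these are the only columns the UseFactor columns can name.
factor_cols = ["Factor1", "Factor2", "Factor3", "Factor4"]
values = df[factor_cols].to_numpy()

# Translate each stored column name into a column position.
# get_indexer returns -1 for a name that is not in factor_cols.
lookup = pd.Index(factor_cols)
col_a = lookup.get_indexer(df["UseFactorA"])
col_b = lookup.get_indexer(df["UseFactorB"])

# Pick one value per row from each position array, then divide.
rows = np.arange(len(df))
df["Result"] = values[rows, col_a] / values[rows, col_b]
```

Because unknown names map to -1 (which NumPy reads as "last column"), it is worth checking `(col_a >= 0).all()` and `(col_b >= 0).all()` before trusting the result on real data.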

Edit:

Another option:

def factor_calc_for_row(row):
    factorA = row['UseFactorA']
    factorB = row['UseFactorB']
    return row[factorA] / row[factorB]

df['Result'] = df.apply(factor_calc_for_row, axis=1)
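
For reference, here is a self-contained sketch of the apply approach run against the example table from the question. The dataframe construction is reconstructed from that table and is not part of the original answer.

```python
import pandas as pd

# Sample data taken from the question's example table.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

def factor_calc_for_row(row):
    # Each row stores the names of the two columns to use; look those
    # columns up within the same row and divide.
    factor_a = row["UseFactorA"]
    factor_b = row["UseFactorB"]
    return row[factor_a] / row[factor_b]

# axis=1 passes each row (not each column) to the function.
df["Result"] = df.apply(factor_calc_for_row, axis=1)
print(df["Result"].tolist())  # [0.2, 2.0]
```

Without axis=1, apply would pass whole columns to the function, and the lookups by 'UseFactorA' and 'UseFactorB' would fail.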

edited Apr 11, 2019 at 0:01

answered Apr 10, 2019 at 23:51


1 Comment

Michael Schweitzer Over a year ago

Function worked perfectly, and very simply! I tried something similar, but left out the axis=1 argument. Thank you very much.

Ben Pap · Accepted Answer · 2019-04-11 01:16:31Z


Here's a one-liner:

df['Result'] = [df[df['UseFactorA'][x]][x]/df[df['UseFactorB'][x]][x] for x in range(len(df))]

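How it works:

df[df['UseFactorA']]

returns a DataFrame (one column for each entry in 'UseFactorA'),

df[df['UseFactorA'][x]]

returns a Series (the single column named in row x), and

df[df['UseFactorA'][x]][x]

pulls the single value for row x from that Series. This assumes the dataframe has the default integer index (0, 1, 2, ...), since x is used as a row label.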
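
The three indexing steps can be checked on the question's example table. This is a small sketch: the dataframe is reconstructed from that table, and the names `frame`, `series` and `value` are only illustrative.

```python
import pandas as pd

# Sample data taken from the question's example table.
df = pd.DataFrame({
    "Category": ["A", "B"],
    "Factor1": [1, 2],
    "Factor2": [2, 7],
    "Factor3": [5, 4],
    "Factor4": [8, 2],
    "UseFactorA": ["Factor1", "Factor3"],
    "UseFactorB": ["Factor3", "Factor1"],
})

x = 0  # first row

# Indexing with the whole column of names selects several columns at once.
frame = df[df["UseFactorA"]]
print(type(frame).__name__, list(frame.columns))

# Indexing with one name (the one stored in row x) selects a single column.
series = df[df["UseFactorA"][x]]
print(type(series).__name__, series.name)

# Indexing that column with x picks the value for row x.
value = df[df["UseFactorA"][x]][x]
print(value)
```

The first step also shows why the attempt in the question fails: the division produces a DataFrame with as many columns as there are rows, and that cannot be assigned to a single 'Result' column.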

edited Apr 11, 2019 at 1:16

answered Apr 11, 2019 at 0:01

