How do I define a Python function that takes as input a DataFrame df and a string that is one of its column names, so that something like the following works?

def fun(x,col):
    return x.loc[x.col == 0]

I know this is somewhat redundant, but it is for teaching purposes. Is it possible to use a variable to refer to a DataFrame column? (Apparently not...)

The following code does not work

df = pandas.DataFrame({'Name': ...list of Irish people names..., 'Height': ...list of people's heights...})

x = 'Height'

I saw the solution to this question: Applying function to a DataFrame using its columns as parameters

which I liked a lot, being a fan of lambdas, but it does not seem to apply here (at least I cannot see how; please help me with this if possible).

Would it be something like:

lambda x,col: x.loc[x.col==0], DataFrame, x.col)?

Thank you in advance

  • I don't understand the problem. Many examples use df["Height"] instead of df.Height, so x[col] with x=df and col="Height" seems the obvious choice. Commented Feb 27 at 20:51

1 Answer

You can simply use return x.loc[x[col] == 0], which lets the column name be held in a string variable. Incidentally, bracket indexing with [] is generally preferred for clarity: it avoids potential conflicts with DataFrame method names and also works for column names containing spaces (for which the dot syntax cannot be used). For example:

import pandas as pd

df = pd.DataFrame({'name': ['irish1', 'irish2'],
                   'height': [1.8, 1.7]
                    })

def fun(x, col):
    # x[col] looks up the column by the string held in col
    return x.loc[x[col] == 1.7]

print(fun(df, 'height'))

gives

     name  height
1  irish2     1.7
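
To illustrate the two points above about bracket indexing, here is a minimal sketch. The column names "shoe size" and "count", the extra data row, and the helper names rows_equal and rows_equal_lambda are all made up for illustration and are not from the question. It also shows a lambda form, since the question asked about one.

```python
import pandas as pd

# Hypothetical data, chosen only to exercise the edge cases.
df = pd.DataFrame({
    "name": ["irish1", "irish2", "irish3"],
    "height": [1.8, 1.7, 0.0],
    "shoe size": [43, 0, 41],  # column name containing a space
    "count": [0, 2, 3],        # column name that collides with DataFrame.count
})


def rows_equal(frame, col, value=0):
    """Return the rows of frame where the column named by col equals value."""
    return frame.loc[frame[col] == value]


# The same logic written as a lambda, as asked about in the question.
rows_equal_lambda = lambda frame, col, value=0: frame.loc[frame[col] == value]

print(rows_equal(df, "height"))              # rows where height == 0
print(rows_equal(df, "shoe size"))           # works even though the name has a space
print(rows_equal(df, "count"))               # df.count would be the method, not the column
print(rows_equal_lambda(df, "height", 1.7))  # lambda form, comparing against 1.7
```

Here df.count refers to the DataFrame method rather than the column, and df.shoe size is not valid Python at all, whereas df["count"] and df["shoe size"] both select the column as intended.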