I'd like to apply a function with multiple returns to a pandas DataFrame and put the results in separate new columns in that DataFrame.

So given something like this:

import pandas as pd

df = pd.DataFrame(data = {'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
  return (a + b, a - b)

The goal is a single command that calls add_subtract on a and b to create two new columns in df: sum and difference.

I thought something like this might work:

(df['sum'], df['difference']) = df.apply(
    lambda row: add_subtract(row['a'], row['b']), axis=1)

But it yields this error:

----> 9 lambda row: add_subtract(row['a'], row['b']), axis=1)

ValueError: too many values to unpack (expected 2)

EDIT: In addition to the answers below, the question "pandas apply function that returns multiple values to rows in pandas dataframe" shows that the function can be modified to return a list or Series, i.e.:

def add_subtract_list(a, b):
  return [a + b, a - b]

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']), axis=1)

or

def add_subtract_series(a, b):
  return pd.Series((a + b, a - b))

df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_series(row['a'], row['b']), axis=1)

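both worked when this was written (the latter being equivalent to Wen's accepted answer). Note that since pandas 0.23, apply with axis=1 returns list results as a single Series of lists by default, so the list version needs result_type='expand' to produce separate columns; the Series version is unaffected.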
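For the list-returning variant on recent pandas, here is a minimal sketch that assumes pandas 0.23 or later, where DataFrame.apply accepts a result_type argument. Passing result_type='expand' asks apply to spread each list-like result across columns, so the two-column assignment has a DataFrame on its right-hand side.

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract_list(a, b):
    return [a + b, a - b]

# result_type='expand' turns each returned list into one row of a DataFrame,
# instead of leaving a single Series whose elements are lists.
df[['sum', 'difference']] = df.apply(
    lambda row: add_subtract_list(row['a'], row['b']),
    axis=1, result_type='expand')

print(df)
```

This keeps add_subtract_list unchanged; only the apply call differs from the version above.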


2 Answers


Wrap the function's return value in pd.Series:

df[['sum', 'difference']] = df.apply(
    lambda row: pd.Series(add_subtract(row['a'], row['b'])), axis=1)
df

yields

   a  b  sum  difference
0  1  4    5          -3
1  2  5    7          -3
2  3  6    9          -3
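To see why the pd.Series wrapper matters, here is a small sketch comparing the two return shapes. It reuses df and add_subtract from the question; the names as_tuples and as_frame are only illustrative. Without the wrapper, apply hands back one Series with a tuple per row, so unpacking it into two targets iterates over three rows and fails, which matches the error in the question.

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# Tuple result: a single Series holding one tuple per row (length 3 here).
as_tuples = df.apply(lambda row: add_subtract(row['a'], row['b']), axis=1)
print(type(as_tuples).__name__, len(as_tuples))

# Series result: apply expands each returned Series into a row of a DataFrame,
# giving one column per returned value.
as_frame = df.apply(
    lambda row: pd.Series(add_subtract(row['a'], row['b'])), axis=1)
print(type(as_frame).__name__, as_frame.shape)
```

A DataFrame with two columns is what the df[['sum', 'difference']] = ... assignment expects on its right-hand side.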

4 Comments

Thank you! Can you explain why pd.Series is needed here?
@MaxGhenis Your function returns a tuple, so we pass that tuple to pd.Series; apply then expands each returned Series into columns, turning what would be one column of tuples into a two-column DataFrame. More info: stackoverflow.com/questions/29550414/…
I wonder if row['a'] and row['b'] will actually work. Usually this kind of reference should not work inside of apply()
@FedericoDorato: You can use either row['a'] or even row.a inside apply when axis=1, because each row is passed to the function as a Series.

One way to do this would be to use pd.DataFrame.assign, which returns a new DataFrame rather than modifying df in place, as follows:

df.assign(**{k:v for k,v in zip(['sum', 'difference'], add_subtract(df.a, df.b))})

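Should yield the following (column order may vary by version: pandas 0.23+ on Python 3.6+ keeps the keyword order, so sum comes before difference, while older versions sorted the new columns alphabetically, as shown here):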

   a  b  difference  sum
0  1  4          -3    5
1  2  5          -3    7
2  3  6          -3    9

Clarifications:

zip is a builtin function that returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. For instance, list(zip(['sum', 'difference'], [df.a + df.b, df.a - df.b])) should return [('sum', df.a + df.b), ('difference', df.a - df.b)].

** in front of a dictionary object serves as an operator that unpacks the combination of key and value pairs. In essence, the unpacking could be represented as something like this: sum=df.a + df.b, difference=df.a - df.b.
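The two clarifications above can be sketched with plain Python values instead of DataFrame columns. The function describe and the numbers 10 and 4 are hypothetical and exist only to show the mechanics.

```python
names = ['sum', 'difference']
values = [10, 4]

# zip pairs the i-th name with the i-th value.
pairs = list(zip(names, values))
print(pairs)  # [('sum', 10), ('difference', 4)]

# A dict built from those pairs can be unpacked into keyword arguments with **.
kwargs = dict(pairs)

def describe(sum, difference):
    # Hypothetical function; its parameter names match the dict keys.
    return f"sum={sum}, difference={difference}"

# These two calls are equivalent.
unpacked = describe(**kwargs)
explicit = describe(sum=10, difference=4)
print(unpacked == explicit)  # True
```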

In sum, when combined, you get something like the following:

df.assign(sum=df.a + df.b, difference=df.a - df.b)
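As a quick check, the sketch below compares the zip-based call with the explicit one. It assumes Python 3.6+ and pandas 0.23+ so that both calls add the columns in the same order. Because add_subtract works on whole columns here, plain tuple unpacking is another option, shown at the end as an alternative to assign.

```python
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3], 'b': [4, 5, 6]})

def add_subtract(a, b):
    return (a + b, a - b)

# assign returns a new DataFrame; df itself is left unchanged by these calls.
via_zip = df.assign(**dict(zip(['sum', 'difference'], add_subtract(df.a, df.b))))
explicit = df.assign(sum=df.a + df.b, difference=df.a - df.b)
same = via_zip.equals(explicit)
print(same)

# Alternative: add_subtract returns a tuple of two Series when given columns,
# so each Series can be assigned to its own new column in place.
df['sum'], df['difference'] = add_subtract(df['a'], df['b'])
print(df)
```

The last form modifies df directly and avoids apply entirely, which only works because a + b and a - b are already column-wise operations.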

See the Python documentation on zip and on ** dictionary unpacking to get a better idea of how these useful tools work beyond this particular example.

3 Comments

This is intriguing: I'm relatively new to Python (mostly an R user), so could you explain what the ** and zip are doing here? Seems like a useful construct. I accepted Wen's answer as it differed least from my guess, but upvoted this and can change if this would be significantly better performance-wise.
@MaxGhenis You can think of zip in Python as roughly like a list of lists in R; in R we would need unlist. Here is an R example :-) (PS: I am a 50% R user too :-) ) stackoverflow.com/questions/4227223/r-list-to-data-frame
That statement is a bit lacking. Data structures in R do not translate that easily into Python data structures. The closest thing to zip I can think of in R is the transpose function from the purrr package, and even that doesn't work the same way in all cases.
