The following Python code works fine:

import pandas as pd

df = pd.DataFrame(data = {'a': [1, 2, 3], 'b': [4, 5, 6]})

def myfun(a, b):
  return [a + b, a - b]

df[['x', 'y']] = df.apply(
    lambda row: myfun(row.a, row.b), axis=1)

The resulting pandas dataframe looks like:

print(df)

   a  b  x  y
0  1  4  5 -3
1  2  5  7 -3
2  3  6  9 -3

However, if I try to add two more columns,

df[['xx','yy']] = df.apply(lambda row: myfun(row.a, row.b), axis=1)

I get the error message,

KeyError: "['xx' 'yy'] not in index"

How come? And what is the correct way to do this?

Many thanks!

//A

2 Answers

You need to return a Series from the function so that apply expands the result into separate columns:

def myfun(a, b):
  return pd.Series([a + b, a - b])

df[['x', 'y']] = df.apply(
    lambda row: myfun(row.a, row.b), axis=1)
print(df)

   a  b  x  y
0  1  4  5 -3
1  2  5  7 -3
2  3  6  9 -3
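
If you would rather keep myfun returning a plain list, one option is the result_type='expand' argument of DataFrame.apply, which asks pandas to spread list-like results across columns. The snippet below is a minimal sketch, assuming a pandas version that supports result_type (0.23 or later); the name expanded is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

def myfun(a, b):
    # Unchanged from the question: returns a plain list.
    return [a + b, a - b]

# result_type='expand' turns each returned list into one row of a DataFrame,
# whose columns are labelled 0 and 1 by default.
expanded = df.apply(lambda row: myfun(row.a, row.b), axis=1,
                    result_type='expand')

# Name the new columns explicitly, then attach them by index.
expanded.columns = ['xx', 'yy']
df = df.join(expanded)

print(df)
```

Renaming the columns and joining avoids relying on how a particular pandas version handles assignment to several not-yet-existing columns at once.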

With myfun returning a list, as in the question, you can unpack the result into two columns using zip:

df['xx'], df['yy'] = zip(*df.apply(lambda row: myfun(row.a, row.b), axis=1))

But this is inefficient compared with direct, vectorized assignment. Avoid pd.DataFrame.apply with axis=1 unless you absolutely must; it is essentially a row-by-row Python loop.

df['xx'] = df['a'] + df['b']
df['yy'] = df['a'] - df['b']
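
The same vectorized idea can also be written with DataFrame.assign, which returns a new DataFrame and leaves the original untouched. This is only a sketch for comparison; the names vectorized and rowwise are illustrative, and the row-wise version is included just to show that both approaches produce the same values.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Vectorized: each new column is computed from whole columns in one step.
vectorized = df.assign(xx=df['a'] + df['b'], yy=df['a'] - df['b'])

# Row-wise equivalent, shown only for comparison: the function is called
# once per row and each returned Series becomes one row of the result.
rowwise = df.apply(
    lambda row: pd.Series({'xx': row.a + row.b, 'yy': row.a - row.b}),
    axis=1)

print(vectorized)
print(rowwise)
```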
