Here's the dataframe:
Code:
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'baz', 'foo'],
                   'B': ['qux', 'quux', 'quuz', 'xyz']})
df
First, here's what works. When I write the conditions directly, col2 and col3 both get the correct boolean values: given the search conditions, only the first row is True and all the other rows are False.
Code:
df['col2'] = (df['A'] == 'foo') & (df['B'] == 'qux')
df = df.assign(col3=lambda x: (df['A'] == 'foo') & (df['B'] == 'qux'))
Now here's what doesn't work: the same check wrapped in a function.
Code:
def test(df):
    if ((df['A'] == 'foo') & (df['B'] == 'qux')).any():
        x = True
    else:
        x = False
    return x
Code:
df = df.assign(col4 = lambda x: test(df))
# df['col5'] = df.apply(test, axis=1)
df['col6'] = df.apply(lambda x: test(df), axis=1)
In the output, col4 and col6 come out True for every row, which is clearly wrong. I suspected that the any() call might be causing the problem, but if I remove it I get the "...truth value of a series is ambiguous..." error.
I would appreciate any suggestions!