Deleting rows based on multiple conditions in a pandas dataframe

Question

I want to delete rows when a few conditions are met:

An example dataframe is shown below:

        one       two     three      four
0 -0.225730 -1.376075  0.187749  0.763307
1  0.031392  0.752496 -1.504769 -1.247581
2 -0.442992 -0.323782 -0.710859 -0.502574
3 -0.948055 -0.224910 -1.337001  3.328741
4  1.879985 -0.968238  1.229118 -1.044477
5  0.440025 -0.809856 -0.336522  0.787792
6  1.499040  0.195022  0.387194  0.952725
7 -0.923592 -1.394025 -0.623201 -0.738013
8 -1.775043 -1.279997  0.194206 -1.176260
9 -0.602815  1.183396 -2.712422 -0.377118

I want to delete rows based on the conditions that:

Row with value of col 'one', 'two', or 'three' greater than 0; and value of col 'four' less than 0 should be deleted.

I tried to implement this as follows:

df = df[df.one > 0 or df.two > 0 or df.three > 0 and df.four < 1]

However, it results in an error message:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Could someone help me with how to delete rows based on multiple conditions?

Brionius · Accepted Answer · 2015-03-12 19:11:54Z

54

For reasons that aren't 100% clear to me, pandas plays nice with the bitwise logical operators | and &, but not the boolean ones or and and.

Try this instead:

df = df[((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 1)]
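
Python gives & higher precedence than |, so where the parentheses go changes which rows match. A minimal sketch with made-up values (the small frame and the names loose and grouped are illustrative, not from the question):

```python
import pandas as pd

# Tiny illustrative frame (values are made up, not the question's data).
df = pd.DataFrame({"one": [1, -1], "two": [-1, -1], "three": [-1, -1], "four": [5, 5]})

# Without grouping, `&` is evaluated before `|`, so this reads as:
# (one > 0) | (two > 0) | ((three > 0) & (four < 1))
loose = (df.one > 0) | (df.two > 0) | (df.three > 0) & (df.four < 1)

# With the OR terms grouped, the `four` test applies to all of them.
grouped = ((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 1)

print(loose.tolist())    # [True, False]
print(grouped.tolist())  # [False, False]
```

Row 0 matches the ungrouped expression because one > 0 alone is enough there; the grouped version also requires four < 1.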

edited Mar 12, 2015 at 19:11

answered Mar 12, 2015 at 18:37

4 Comments

EdChum Over a year ago

You want df = df[((df.one > 0) | (df.two > 0) | (df.three > 0)) & (df.four < 1)]. As to why: it's ambiguous to compare arrays, as there are potentially multiple matches. See this: stackoverflow.com/questions/10062954/…

Brionius Over a year ago

Oh, whoops, didn't see the and at the end. Edited.

DSM Over a year ago

@Brionius: it's basically because or and and can't have their behaviour customized by a class. They do what they do based on the result of bool(the_object), and that's it.

zelusp Over a year ago

To delete, say, any row with a string that contains 1 of 20 possible subkeys, look here
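
The point made in the comments about or/and relying on bool() can be sketched roughly like this. The Series s and the names raised and elementwise are illustrative only:

```python
import pandas as pd

s = pd.Series([1, -1, 2])

# `or` / `and` first call bool() on the left operand; a multi-element
# Series refuses to collapse to a single True/False and raises ValueError.
try:
    result = (s > 0) or (s < 0)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# `|` dispatches to Series.__or__, which pandas implements element-wise.
elementwise = (s > 0) | (s < 0)
print(elementwise.tolist())  # [True, True, True]
```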

cottontail · Accepted Answer · 2023-03-27 04:58:56Z

`drop` could be used to drop rows

The most obvious way is to construct a boolean mask from the condition, filter the index by it to get an array of indices to drop, and drop these indices using drop(). If the condition is:

Row with value of col 'one', 'two', or 'three' greater than 0; and value of col 'four' less than 0 should be deleted.

then the following works.

msk = (df['one'].gt(0) | df['two'].gt(0) | df['three'].gt(0)) & df['four'].lt(0)
idx_to_drop = df.index[msk]
df1 = df.drop(idx_to_drop)

The first part of the condition, i.e. col 'one', 'two', or 'three' greater than 0, can be written a little more concisely with .any(axis=1):

msk = df[['one', 'two', 'three']].gt(0).any(axis=1) & df['four'].lt(0)
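
As a quick sanity check, the chained | form and the .any(axis=1) form should give the same row-wise result. A small sketch with made-up values (the frame and the names chained and row_any are illustrative):

```python
import pandas as pd

# Illustrative data, not the question's dataframe.
df = pd.DataFrame({
    "one":   [1.0, -1.0, -1.0],
    "two":   [-1.0, -1.0, 2.0],
    "three": [-1.0, -1.0, -1.0],
    "four":  [-1.0, -1.0, 1.0],
})

# Element-wise OR across three separate comparisons ...
chained = df["one"].gt(0) | df["two"].gt(0) | df["three"].gt(0)
# ... versus one comparison on a sub-frame, reduced across columns.
row_any = df[["one", "two", "three"]].gt(0).any(axis=1)

print(chained.tolist())                      # [True, False, True]
print(chained.tolist() == row_any.tolist())  # True
```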

Keep the complement of the rows to drop

Deleting/removing/dropping rows is the inverse of keeping rows. So another way to do this task is to negate (~) the boolean mask for dropping rows and filter the dataframe by it.

msk = df[['one', 'two', 'three']].gt(0).any(axis=1) & df['four'].lt(0)
df1 = df[~msk]

`query()` the rows to keep

pd.DataFrame.query() is a pretty readable API for filtering rows to keep. It also "understands" and/or etc. So the following works.

# negate the condition to drop
df1 = df.query("not ((one > 0 or two > 0 or three > 0) and four < 0)")

# the same condition transformed using de Morgan's laws
df1 = df.query("one <= 0 and two <= 0 and three <= 0 or four >= 0")
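
To see that the two query strings select the same rows, here is a sketch on made-up data that includes a row of zeros as a boundary case (the frame and the names negated and rewritten are illustrative):

```python
import pandas as pd

# Illustrative data; the last row sits exactly on the 0 boundary.
df = pd.DataFrame({
    "one":   [1.0, -1.0, -1.0, 0.0],
    "two":   [-1.0, -1.0, 2.0, 0.0],
    "three": [-1.0, -1.0, -1.0, 0.0],
    "four":  [-1.0, -1.0, 1.0, 0.0],
})

# Negated drop condition.
negated = df.query("not ((one > 0 or two > 0 or three > 0) and four < 0)")
# Same condition after applying de Morgan's laws.
rewritten = df.query("one <= 0 and two <= 0 and three <= 0 or four >= 0")

print(negated.index.tolist())    # [1, 2, 3]
print(rewritten.index.tolist())  # [1, 2, 3]
```

Only row 0 is dropped: it has a positive value in 'one' and a negative value in 'four'.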

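All of the above perform the same transformation: for the example dataframe in the question, rows 1, 4, 8, and 9 are dropped and rows 0, 2, 3, 5, 6, and 7 are kept.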
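
As a check, the sketch below rebuilds the question's example dataframe and runs the three approaches side by side. The names dropped, kept, and queried are illustrative:

```python
import pandas as pd

# The example dataframe from the question.
df = pd.DataFrame(
    [
        [-0.225730, -1.376075,  0.187749,  0.763307],
        [ 0.031392,  0.752496, -1.504769, -1.247581],
        [-0.442992, -0.323782, -0.710859, -0.502574],
        [-0.948055, -0.224910, -1.337001,  3.328741],
        [ 1.879985, -0.968238,  1.229118, -1.044477],
        [ 0.440025, -0.809856, -0.336522,  0.787792],
        [ 1.499040,  0.195022,  0.387194,  0.952725],
        [-0.923592, -1.394025, -0.623201, -0.738013],
        [-1.775043, -1.279997,  0.194206, -1.176260],
        [-0.602815,  1.183396, -2.712422, -0.377118],
    ],
    columns=["one", "two", "three", "four"],
)

# True for rows that should be deleted.
msk = df[["one", "two", "three"]].gt(0).any(axis=1) & df["four"].lt(0)

dropped = df.drop(df.index[msk])   # drop() by index labels
kept = df[~msk]                    # keep the complement
queried = df.query("not ((one > 0 or two > 0 or three > 0) and four < 0)")

print(dropped.index.tolist())  # [0, 2, 3, 5, 6, 7]
print(kept.index.tolist())     # [0, 2, 3, 5, 6, 7]
print(queried.index.tolist())  # [0, 2, 3, 5, 6, 7]
```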
