
I'm trying to find a way to select rows in a pandas DataFrame whose values are in a given list. For example:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(6).reshape(3,2), columns=['A','B'])
   A  B
0  0  1
1  2  3
2  4  5

I know that I can select a particular row, e.g.

df[df.A==0]

selects the row with A=0. What I want is to select multiple rows whose values are in my list, e.g. A in [0,2]. I tried

df[df.A in [0,2]]
df[list(df.A)==[0,2]]

but neither works. In R I can use the %in% operator, and in plain Python we can write A in [0,2]. How can I select a subset of rows in pandas in this case? Thanks, Valentin.

2 Answers

Series.isin() will select rows matching any of several values:

>>> df[df.A.isin([0,2])]
   A  B
0  0  1
1  2  3
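
For completeness, here is a minimal self-contained sketch of the same approach, keeping the list in a variable. The names `wanted`, `mask` and `selected` are my own and not from the question. As far as I understand it, `df.A in [0,2]` fails because Python's `in` tries to reduce an element-wise comparison to a single True/False, which pandas refuses to do; `isin` returns a boolean Series with one entry per row instead.

```python
import numpy as np
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

# Hypothetical name for the list of values to keep.
wanted = [0, 2]

# isin() yields one boolean per row: True where A is in `wanted`.
mask = df["A"].isin(wanted)

# Indexing with the boolean Series keeps only the True rows.
selected = df[mask]
print(selected)
```

The intermediate `mask` is optional; it is split out here only to make the two steps visible.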

Comments

Brian, thanks, that works. How about the negation, i.e. not in?
You can use numpy's logical_not: df[np.logical_not(df.A.isin([0,2]))]
Great, that completely answers my question.
Is there a way to use pandas df.loc, iloc or ix to do this?
I don't think so, unless you are 'cheating' by knowing in advance which rows you are looking for. In this example, df.iloc[0:2] (the 1st and 2nd rows by position) and df.loc[0:1] (rows whose index label is in the range 0-1, the index being the unlabeled column on the left) both give you the equivalent output, but you had to know the rows in advance. If you want a different syntax, there is a df.query() method.
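
A short sketch covering both follow-up questions in these comments, under the assumption that the frame is the one from the question. The `~` operator inverts a boolean Series and should be equivalent to `np.logical_not` here. Also, `.loc` does accept a boolean mask, so it can be combined with `isin` without knowing the rows in advance. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

# Boolean mask: True where A is 0 or 2.
mask = df["A"].isin([0, 2])

# "not in": invert the mask with ~ (element-wise logical NOT).
not_in = df[~mask]

# .loc with a boolean mask selects the same rows as df[mask];
# a column list can be added as the second argument.
via_loc = df.loc[mask]
only_b = df.loc[mask, ["B"]]

print(not_in)
print(via_loc)
```

`.iloc` is positional, so it is not a natural fit for value-based selection like this.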

If you don't like that syntax, you can also use query (introduced in pandas 0.13, which is from 2014):

>>> df.query('A in [0,2]')
   A  B
0  0  1
1  2  3
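
If the list lives in a Python variable, a query string can refer to it with the `@` prefix, and `not in` gives the negated selection. This is a small sketch rather than a recommendation; `wanted`, `in_rows` and `out_rows` are names I made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["A", "B"])

# Hypothetical name for the values to match.
wanted = [0, 2]

# @wanted refers to the local Python variable inside the query string.
in_rows = df.query("A in @wanted")

# The negated form keeps rows whose A is not in the list.
out_rows = df.query("A not in @wanted")

print(in_rows)
print(out_rows)
```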
