I have an Excel file like the one below:

A   B   C   
        x   y
1   abc 3   5
2   abc 4   6
1   def 5   7
1   qrs 6   8
2   def 7   9

When I read this with pandas using the code df = pd.read_excel(r'C:\Users\mys2kor\Desktop\Mappe1.xlsx', engine='openpyxl', header=[0, 1])

I get a DataFrame like this:

A   B   C
Unnamed: 0_level_1  Unnamed: 1_level_1  x   y
1   abc     3   5
2   abc     4   6
1   def     5   7
1   qrs     6   8
2   def     7   9

I want to filter column A for 1 and column B for 'def' and get just one row as output.

When I evaluate df["A"].filter(like="Unnamed") == 1 and df["B"].filter(like="Unnamed") == 'def' separately, I get the expected boolean values:

Unnamed: 0_level_1
True
False
True
True
False

and

Unnamed: 1_level_1
False
False
True
False
True

But when I combine them as (df["A"].filter(like="Unnamed") == 1) & (df["B"].filter(like="Unnamed") == 'def'), every value comes back False:

Unnamed: 0_level_1  Unnamed: 1_level_1
False   False
False   False
False   False
False   False
False   False

How do I filter so that the output is the following?

A   B   C   
        x   y
1   def 5   7

1 Answer

It looks like you could use columns A and B as the index when reading the Excel file, then use loc to query the index values:

df = pd.read_excel('...', header=[0, 1], index_col=[0, 1])
df.loc[[(1, 'def')]]
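
To try the lookup without the Excel file, here is a minimal sketch that builds an equivalent frame in memory. The index level names "A" and "B" and the hand-built MultiIndex are assumptions for illustration; the exact labels that read_excel produces may differ.

```python
import pandas as pd

# Assumed stand-in for the frame read with header=[0, 1], index_col=[0, 1]:
# columns A and B become a two-level row index, C/x and C/y stay as data.
index = pd.MultiIndex.from_tuples(
    [(1, "abc"), (2, "abc"), (1, "def"), (1, "qrs"), (2, "def")],
    names=["A", "B"],  # illustrative names, not confirmed read_excel output
)
columns = pd.MultiIndex.from_tuples([("C", "x"), ("C", "y")])
df = pd.DataFrame(
    [[3, 5], [4, 6], [5, 7], [6, 8], [7, 9]],
    index=index,
    columns=columns,
)

# Passing a list of tuples keeps the result a DataFrame instead of a Series.
row = df.loc[[(1, "def")]]
print(row)
```

Wrapping the tuple in a list is what keeps the output as a one-row DataFrame; df.loc[(1, 'def')] alone would return a Series.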
2 Comments

Is there a way to do this by column names? I ask because sometimes we do not know the column index, only the column name.
@moys As far as I know, you cannot specify the names for the index while reading the Excel file, although this is possible while reading a CSV file.
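
One possible way to filter by column name while leaving A and B as ordinary columns is to pull each one out as a Series before combining the masks. The all-False result in the question is likely because & aligns the two one-column DataFrames on their column labels, which differ ("Unnamed: 0_level_1" vs "Unnamed: 1_level_1"). The sketch below rebuilds the question's frame in memory (an assumption, so it runs without the Excel file) and selects the single sub-column under each top-level name with iloc.

```python
import pandas as pd

# Assumed reconstruction of the frame produced by read_excel(header=[0, 1]).
columns = pd.MultiIndex.from_tuples(
    [
        ("A", "Unnamed: 0_level_1"),
        ("B", "Unnamed: 1_level_1"),
        ("C", "x"),
        ("C", "y"),
    ]
)
df = pd.DataFrame(
    [
        [1, "abc", 3, 5],
        [2, "abc", 4, 6],
        [1, "def", 5, 7],
        [1, "qrs", 6, 8],
        [2, "def", 7, 9],
    ],
    columns=columns,
)

# df["A"] is a one-column DataFrame; .iloc[:, 0] turns it into a Series,
# so both masks share the row index and no column alignment takes place.
a = df["A"].iloc[:, 0]
b = df["B"].iloc[:, 0]

result = df[(a == 1) & (b == "def")]
print(result)
```

Because both masks are Series on the same row index, & combines them element by element, and only the top-level names "A" and "B" need to be known.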
