Filter a column which has multilevel column header in Pandas

Question

I have an excel like below

A   B   C   
        x   y
1   abc 3   5
2   abc 4   6
1   def 5   7
1   qrs 6   8
2   def 7   9

When I read this with pandas using the code df= pd.read_excel(r'C:\Users\mys2kor\Desktop\Mappe1.xlsx',engine='openpyxl',header=[0,1])

I get the dataframe like so

A   B   C
Unnamed: 0_level_1  Unnamed: 1_level_1  x   y
1   abc     3   5
2   abc     4   6
1   def     5   7
1   qrs     6   8
2   def     7   9

I want to filter Column a for 1 and column B for 'def' & get just one row as output.

When I do df["A"].filter(like="Unnamed")==1 and df["B"].filter(like="Unnamed")=='def' separately, I get the proper boolean values

Unnamed: 0_level_1
True
False
True
True
False

and

Unnamed: 1_level_1
False
False
True
False
True

But when I put them together like so (df["A"].filter(like="Unnamed")==1) & (df["B"].filter(like="Unnamed")=='def'), I get like below!

Unnamed: 0_level_1  Unnamed: 1_level_1
False   False
False   False
False   False
False   False
False   False

How do I filter such that the output is

A   B   C   
        x   y
1   def 5   7

Shubham Sharma · Accepted Answer · 2022-06-17 06:43:01Z

1

It looks like you could use the columns A and B as index while reading the excel file, then use indexing with loc to query the index values:

df = pd.read_excel('...', header=[0, 1], index_col=[0, 1])
df.loc[[(1, 'def')]]

answered Jun 17, 2022 at 6:43

Shubham Sharma

71.8k6 gold badges26 silver badges58 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Is there a way to do this by columns names? I ask because sometimes, we do not know the column index, we know just the column name.

@moys Aa far as I know, you can not specify the names for index while reading the excel file although this is possible while reading the csv file.