How to drop rows with NaN in a column of a pandas DataFrame?

Question

I have a dataframe (denoted as 'df') where some values are missing in a column (denoted as 'col1').

I applied a set function to find unique values in the column:

print(set(df['col1']))

Output:
{0.0, 1.0, 2.0, 3.0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan}

I am trying to drop the rows that contain these NaN values from the dataframe. This is what I tried:

df['col1'] = df['col1'].dropna()

However, the rows in the column remain unchanged.

I suspect that the repeated 'nan' values in the set above may not be normal behaviour.

Any suggestions on how to remove these values?

Marko Knöbl · Accepted Answer · 2022-06-10 13:46:04Z

5

I think what you're doing is taking one column from the DataFrame, removing all the NaNs from it, and then assigning that column back to the same DataFrame. The assignment aligns on the index, so any index labels missing from the shortened column are filled with NaN again.
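To make the alignment visible, here is a minimal sketch. The data is made up for illustration; only the names df and col1 come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the asker's df
df = pd.DataFrame({"col1": [0.0, np.nan, 1.0, np.nan, 2.0]})

# dropna() on the column returns a shorter Series with index labels 0, 2, 4
cleaned = df["col1"].dropna()
print(len(cleaned))

# Assigning it back aligns on the index: labels 1 and 3 are absent
# from `cleaned`, so those rows are filled with NaN again
df["col1"] = cleaned
print(df["col1"].isna().sum())
print(len(df))
```

The DataFrame keeps all five rows, which is why nothing appears to change.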

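Do you want to remove those rows from the entire DataFrame? If so, try df.dropna(subset=["col1"]). Note that dropna returns a new DataFrame rather than modifying df in place, so assign the result back to df.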
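As a rough sketch of that suggestion (the col2 column and the sample values are assumptions added only so the effect on whole rows is visible):

```python
import numpy as np
import pandas as pd

# Hypothetical data; col2 is not in the question
df = pd.DataFrame({
    "col1": [0.0, np.nan, 1.0, np.nan],
    "col2": ["a", "b", "c", "d"],
})

# Drop every row whose col1 is NaN and keep the result
df = df.dropna(subset=["col1"])
print(df)
```

Only the rows with a value in col1 remain, and they keep their original index labels. Call reset_index(drop=True) afterwards if you want a fresh 0..n-1 index.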


Ynjxsjmh · Accepted Answer · 2022-06-10 14:14:38Z

4

Marko Knöbl explains it well: the problem is that you assign the Series returned by dropna() back to the column. You can also try:

df = df[df['col1'].notna()]
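A small sketch of how the mask works, again on made-up data (col2 is an assumption for illustration). For this example it gives the same result as dropna with subset:

```python
import numpy as np
import pandas as pd

# Hypothetical data; only df and col1 come from the question
df = pd.DataFrame({
    "col1": [0.0, np.nan, 1.0, np.nan],
    "col2": ["a", "b", "c", "d"],
})

# notna() gives a boolean Series: True where col1 has a value
mask = df["col1"].notna()
print(mask.tolist())

# Indexing with the mask keeps only the rows where it is True
filtered = df[mask]

# Compare with the dropna(subset=...) approach
same = filtered.equals(df.dropna(subset=["col1"]))
print(same)
```

The mask form is handy when you want to combine the NaN check with other conditions using & or |.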
