0

I have data that looks like this:

import datetime

data = [(datetime.datetime(2021, 2, 10, 7, 49, 7, 118658), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 49, 14, 312273), u'12.100.90.10', u'100.100.12.1', u'100.100.12.1', u'LT_UP'),
       (datetime.datetime(2021, 2, 10, 7, 49, 21, 535932), u'12.100.90.10', u'100.100.12.1', u'100.100.22.1', u'LT_UP'),
       (datetime.datetime(2021, 2, 10, 7, 50, 28, 264042), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'LT_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 50, 28, 725961), u'12.100.90.10', u'100.100.12.1', u'100.100.32.1', u'PL_DOWN'),
       (datetime.datetime(2021, 2, 10, 7, 50, 32, 450853), u'10.100.80.10', u'10.55.10.1', u'100.100.12.1', u'PL_LOW'),
       (datetime.datetime(2021, 2, 10, 7, 51, 32, 450853), u'10.10.80.10', u'10.55.10.1', u'100.100.12.1', u'MA_HIGH'),
       (datetime.datetime(2021, 2, 10, 7, 52, 34, 264042), u'10.10.80.10', u'10.55.10.1', u'10.55.10.1', u'PL_DOWN'),
]

This is how it looks when loaded into pandas:

import pandas as pd

df = pd.DataFrame(data)
df.columns = ["date", "start", "end", "end2", "type"]
# drop duplicate rows
df = df.drop_duplicates()

                        date         start           end          end2     type
0 2021-02-10 07:49:07.118658  12.100.90.10  100.100.12.1  100.100.12.1  LT_DOWN
1 2021-02-10 07:49:14.312273  12.100.90.10  100.100.12.1  100.100.12.1    LT_UP
2 2021-02-10 07:49:21.535932  12.100.90.10  100.100.12.1  100.100.22.1    LT_UP
3 2021-02-10 07:50:28.264042  12.100.90.10  100.100.12.1  100.100.32.1  LT_DOWN
4 2021-02-10 07:50:28.725961  12.100.90.10  100.100.12.1  100.100.32.1  PL_DOWN
5 2021-02-10 07:50:32.450853  10.100.80.10    10.55.10.1  100.100.12.1   PL_LOW
6 2021-02-10 07:51:32.450853   10.10.80.10    10.55.10.1  100.100.12.1  MA_HIGH
7 2021-02-10 07:52:34.264042   10.10.80.10    10.55.10.1    10.55.10.1  PL_DOWN

Now I want to select only the rows where the end and end2 columns contain the same value. So my output would be:

                        date         start           end          end2     type
0 2021-02-10 07:49:07.118658  12.100.90.10  100.100.12.1  100.100.12.1  LT_DOWN
1 2021-02-10 07:49:14.312273  12.100.90.10  100.100.12.1  100.100.12.1    LT_UP
2 2021-02-10 07:52:34.264042   10.10.80.10    10.55.10.1    10.55.10.1  PL_DOWN

According to the Stack Overflow question "Get rows that have the same value across its columns in pandas", I could do this to check for identical values across all columns:

df[df.apply(pd.Series.nunique, axis=1) == 1]

But in my case I want this check limited to certain columns only.

How do I do this?

2 Answers

2

Just use a boolean mask: compare the two columns and index the DataFrame with the result.

df[df.end == df.end2]
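
To make the mask explicit, here is a minimal sketch on a cut-down version of the question's data (three rows only, with the date column left out; the names `mask` and `matching` are just illustrative). The comparison produces a boolean Series, and indexing with it keeps the rows where it is True. The `reset_index(drop=True)` call is optional and is only there because the expected output in the question is renumbered from 0.

```python
import pandas as pd

# Reduced sample based on the question's data (illustrative subset).
df = pd.DataFrame(
    {
        "start": ["12.100.90.10", "12.100.90.10", "10.10.80.10"],
        "end": ["100.100.12.1", "100.100.12.1", "10.55.10.1"],
        "end2": ["100.100.12.1", "100.100.22.1", "10.55.10.1"],
        "type": ["LT_DOWN", "LT_UP", "PL_DOWN"],
    }
)

# Element-wise comparison gives one True/False per row.
mask = df["end"] == df["end2"]

# Keep only the rows where the mask is True, then renumber the index.
matching = df[mask].reset_index(drop=True)
print(matching)
```

Bracket access (`df["end"]`) is used here instead of attribute access (`df.end`); both work for these column names, but bracket access also works when a column name clashes with a DataFrame attribute.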

1
df = df.loc[(df['end'] == df['end2'])]
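
If the check ever needs to cover more than two columns, the `nunique` idea from the question can be restricted to a column subset. This is only a sketch: the helper name `rows_with_equal_values` and the extra column `end3` are hypothetical and not part of the original data.

```python
import pandas as pd


def rows_with_equal_values(frame, columns):
    """Return rows of `frame` whose values in `columns` are all identical.

    Hypothetical helper for illustration; it counts distinct values per row
    across the chosen columns only.
    """
    # nunique(axis=1) counts distinct values in each row of the subset.
    same = frame[columns].nunique(axis=1) == 1
    return frame.loc[same]


# Small made-up frame; "end3" is an assumed extra column.
df = pd.DataFrame(
    {
        "end": ["100.100.12.1", "100.100.12.1", "10.55.10.1"],
        "end2": ["100.100.12.1", "100.100.22.1", "10.55.10.1"],
        "end3": ["100.100.12.1", "100.100.12.1", "10.55.10.9"],
        "type": ["LT_DOWN", "LT_UP", "PL_DOWN"],
    }
)

two_cols = rows_with_equal_values(df, ["end", "end2"])
three_cols = rows_with_equal_values(df, ["end", "end2", "end3"])
print(two_cols)
print(three_cols)
```

Note that `nunique` ignores missing values by default, so a row holding one value and a NaN would count as matching; for just two columns, the direct `==` comparison above is simpler.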
