So I have the following pandas dataframe:

[image: input dataframe]

What I would like to do is create a new column that contains a list of the unique dest_hostname values for each combination of the user and user_agent columns.

I also want another column that has the total count of events for each combination of the user and user_agent columns.

So the final dataset should look like:

[image: desired output dataframe]

I've been doing the following, but I can't figure out a way to get both results into one dataframe:

browsers.groupby(['user','user_agent'])['dest_hostname'].apply(list).reset_index(name='browser_hosts')

browsers.value_counts(["user", "user_agent"])

1 Answer

IIUC, use agg with both 'unique' and 'count' on the grouped dest_hostname column:

df.groupby(['user', 'user_agent'])['dest_hostname'].agg(['unique', 'count'])
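
To get the column names from the question in a single dataframe, the same agg call can be written with named aggregation. This is only a sketch: the sample rows and the event_count column name are made up for illustration, and only user, user_agent, dest_hostname and browser_hosts come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
browsers = pd.DataFrame({
    "user": ["alice", "alice", "alice", "bob"],
    "user_agent": ["Firefox", "Firefox", "Firefox", "Chrome"],
    "dest_hostname": ["a.example", "b.example", "a.example", "c.example"],
})

result = (
    browsers.groupby(["user", "user_agent"])["dest_hostname"]
    .agg(
        browser_hosts="unique",  # array of distinct hostnames per group
        event_count="count",     # non-null dest_hostname rows per group
    )
    .reset_index()
)

print(result)
```

Note that 'count' skips rows where dest_hostname is missing, so it can differ from value_counts on the two key columns if the data contains NaN values.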

1 Comment

@HenryEcker, thanks, I missed the "contains a unique list" part of the question.
