5

I'm trying to run value_counts on a specific column in my DataFrame.

For example:

    <Fruit>
0   'apple'
1   'apple, orange'
2   'orange'

How do I count each value even when a cell holds several comma-separated values? The example above should give me:

'apple'   2
'orange'  2

I tried turning each string into a list, but I'm not sure how to run value_counts over fields that hold a list of values.

2 Answers

6

This is a pandonic way:

In [8]: s
Out[8]: 
0            apple
1    apple, orange
2           orange
dtype: object

Split the strings by their separators, turn them into Series and count them.

In [9]: s.str.split(r',\s+').apply(lambda x: pd.Series(x).value_counts()).sum()
Out[9]: 
apple     2
orange    2
dtype: float64
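The result above comes back as float64 because rows that lack a given fruit contribute NaN before the sum. On newer pandas versions a shorter route may be possible. The following is only a sketch, assuming pandas 0.25 or later (where Series.explode is available) and the same sample data as above. The names s, tokens and counts are just for illustration.

```python
import pandas as pd

# Same sample data as in the question.
s = pd.Series(['apple', 'apple, orange', 'orange'])

# Split each string on a comma followed by optional whitespace,
# which gives one list of fruit names per row.
tokens = s.str.split(r',\s*')

# explode() puts every list element on its own row, so value_counts()
# can count the individual fruits directly (integer counts, no NaN).
counts = tokens.explode().value_counts()

print(counts)
```

Since no intermediate DataFrame with missing cells is built, the counts stay integers.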

1 Comment

great use of pandonic
0

This is your dataframe:

import pandas as pd
df = pd.DataFrame(['apple', 'apple, orange', 'orange'], columns=['fruit'])

Then just join all your entries in the fruit column with a comma, eliminate extra spaces, and split again to have a list with all your fruits. Finally count them:

>>> from collections import Counter
>>> Counter(','.join(df['fruit']).replace(' ', '').split(','))

Counter({'orange': 2, 'apple': 2})
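One caveat: replace(' ', '') also removes spaces inside a name, so a multi-word entry would be altered. A possible variant, offered only as a sketch, strips whitespace around each item instead. The 'passion fruit' entry below is hypothetical sample data, not from the question.

```python
from collections import Counter

import pandas as pd

# Hypothetical data with a multi-word fruit name (an assumption for illustration).
df = pd.DataFrame(['apple', 'passion fruit, orange', 'orange'], columns=['fruit'])

# Split every cell on commas and strip surrounding whitespace from each item,
# leaving spaces inside a name untouched.
counts = Counter(
    item.strip()
    for entry in df['fruit']
    for item in entry.split(',')
)

print(counts)
```

Here 'passion fruit' is counted as one item rather than being collapsed to 'passionfruit'.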
