I have a DataFrame and I want to get both the group names and the corresponding group counts as a list or NumPy array. However, when I convert the output to a matrix I only get the group counts, not the names, as in the example below:

  import pandas as pd
  df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})
  b = df['a'].value_counts()
  print(b)

output:

[0.4    2
0.5    2
0.6    1
5.0    1
Name: a, dtype: int64]

What I tried is print([b.as_matrix()]). Output:

[array([2, 2, 1, 1])]

In this case I lose the corresponding group names, which I also need. Thank you.

3 Answers

11

Convert it to a dict:

bd = dict(b)
print(bd)
# {0.40000000000000002: 2, 0.5: 2, 0.59999999999999998: 1, 5.0: 1}

Don't worry about the long decimals. They're just a result of floating point representation; you still get what you expect from the dict.

bd[0.4]
# 2
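As a minimal, self-contained sketch of the same idea: Series.to_dict() builds the same name-to-count mapping as dict(b), and the pairs can then be looked up or iterated. The DataFrame is the one from the question; the variable name total is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})
b = df['a'].value_counts()

# Map each group name (the Series index) to its count.
bd = b.to_dict()

# Look up a single group by name.
print(bd[0.4])

# Iterate over (name, count) pairs; the counts add up to the number of rows.
total = 0
for name, count in bd.items():
    total += count
```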

4

The simplest way to get the counts as a list (note that this returns only the counts, not the group names):

list(df['a'].value_counts())
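The call above returns only the counts. A sketch of how the group names could be recovered alongside them, assuming a pandas version that provides to_numpy() (0.24 or later): the names live in the index of the value_counts() result, in the same order as the counts. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})
b = df['a'].value_counts()

# As NumPy arrays: group names come from the index, counts from the values.
names = b.index.to_numpy()
counts = b.to_numpy()

# As plain Python lists, in the same order as the arrays above.
names_list = b.index.tolist()
counts_list = b.tolist()
```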

2

One approach with np.unique -

np.c_[np.unique(df.a, return_counts=1)]

Sample run -

In [270]: df
Out[270]: 
     a
0  0.5
1  0.4
2  5.0
3  0.4
4  0.5
5  0.6

In [271]: np.c_[np.unique(df.a, return_counts=1)]
Out[271]: 
array([[ 0.4,  2. ],
       [ 0.5,  2. ],
       [ 0.6,  1. ],
       [ 5. ,  1. ]])

We can zip the outputs from np.unique for list output (in Python 3, zip returns an iterator, so wrap the call in list(...)) -

In [283]: zip(*np.unique(df.a, return_counts=1))
Out[283]: [(0.40000000000000002, 2), (0.5, 2), (0.59999999999999998, 1), (5.0, 1)]

Or use zip directly on the value_counts() output (again wrapped in list(...) under Python 3) -

In [338]: b = df['a'].value_counts()

In [339]: zip(b.index, b.values)
Out[339]: [(0.40000000000000002, 2), (0.5, 2), (0.59999999999999998, 1), (5.0, 1)]
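The transcripts above appear to come from Python 2, where zip returns a list. A sketch of the np.unique variant for Python 3, using the DataFrame from the question; tolist() is used here so the pairs contain plain Python numbers rather than NumPy scalars.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})

# np.unique returns the sorted unique values and, with return_counts=True, their counts.
values, counts = np.unique(df['a'].to_numpy(), return_counts=True)

# zip is lazy in Python 3, so materialize it with list().
pairs = list(zip(values.tolist(), counts.tolist()))
print(pairs)  # [(0.4, 2), (0.5, 2), (0.6, 1), (5.0, 1)]
```

Unlike value_counts(), which orders by count, np.unique orders the pairs by value.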
