How to assign a value_counts output to a dataframe

Question

I am trying to assign the output of value_counts() to a new DataFrame. My code follows.

import pandas as pd
import glob


df = pd.concat((pd.read_csv(f, names=['date','bill_id','sponsor_id']) for f in glob.glob('/home/jayaramdas/anaconda3/df/s11?_s_b')))


column_list = ['date', 'bill_id']

df = df.set_index(column_list, drop = True)
df = df['sponsor_id'].value_counts()

df.columns=['sponsor', 'num_bills']
print (df)

The value counts are not being assigned the column headers I specified, 'sponsor' and 'num_bills'. I'm getting the following output from print(df.head()):

1036    426
791     408
1332    401
1828    388
136     335
Name: sponsor_id, dtype: int64

df = df['sponsor_id'].value_counts() didn't you drop sponsor_id?
– Deusdeorum, Commented Mar 9, 2016 at 13:40
value_counts produces a Series so there is only a single column, you need to reset_index and then overwrite the columns, see my answer
– EdChum, Commented Mar 9, 2016 at 13:43

EdChum · Accepted Answer · 2016-03-09 13:40:46Z

12

Your column lengths don't match. You read three columns from the CSV and then set two of them as the index. Calling value_counts on the remaining column produces a Series with the distinct column values as its index and their counts as its values. A Series has no columns to rename, so you need to call reset_index to turn it into a two-column DataFrame and then overwrite the column names:

df = df.reset_index()
df.columns=['sponsor', 'num_bills']
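Applied to the data from the question, the whole sequence might look like the sketch below. The sample rows are invented to stand in for the CSV files, which are not available here, and the variable names counts and result are only illustrative.

```python
import pandas as pd

# Invented sample rows standing in for the CSV files in the question.
df = pd.DataFrame({
    'date': ['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04'],
    'bill_id': ['s1', 's2', 's3', 's4'],
    'sponsor_id': [1036, 1036, 791, 1036],
})

# value_counts() returns a Series: sponsor ids in the index, counts as values.
counts = df.set_index(['date', 'bill_id'])['sponsor_id'].value_counts()

# reset_index() moves the sponsor ids into a regular column, giving a
# two-column DataFrame whose column names can then be overwritten.
result = counts.reset_index()
result.columns = ['sponsor', 'num_bills']
print(result)
```

Keeping the Series in its own variable rather than reassigning df leaves the original DataFrame available for later steps.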

Example:

In [276]:
df = pd.DataFrame({'col_name':['a','a','a','b','b']})
df

Out[276]:
  col_name
0        a
1        a
2        a
3        b
4        b

In [277]:
df['col_name'].value_counts()

Out[277]:
a    3
b    2
Name: col_name, dtype: int64

In [278]:    
type(df['col_name'].value_counts())

Out[278]:
pandas.core.series.Series

In [279]:
df = df['col_name'].value_counts().reset_index()
df.columns = ['col_name', 'count']
df

Out[279]:
  col_name  count
0        a      3
1        b      2
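As an alternative to overwriting the columns afterwards, the names can be set in the same chain. This is only a sketch of one possible variation, using Series.rename_axis to name the index and the name argument of Series.reset_index to name the counts column; the variable counts_df is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col_name': ['a', 'a', 'a', 'b', 'b']})

counts_df = (
    df['col_name']
    .value_counts()
    .rename_axis('col_name')    # name the index so it becomes the 'col_name' column
    .reset_index(name='count')  # name the column that holds the counts
)
print(counts_df)
```

Naming both parts explicitly avoids relying on whatever default column labels reset_index would otherwise produce.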


BSalita · Answer · 2021-05-10 09:46:18Z

0

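Getting value_counts() for a multi-column dataframe as a new dataframe:

import pandas as pd

df = pd.DataFrame({'C1':['A','B','A'],'C2':['A','B','A']})
# DataFrame.value_counts() counts unique rows (requires pandas 1.1 or later)
vc_df = df.value_counts().to_frame('Count').reset_index()
display(df, vc_df)  # display() is available in Jupyter/IPython; use print() elsewhere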

  C1 C2
0  A  A
1  B  B
2  A  A

  C1 C2  Count
0  A  A      2
1  B  B      1
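For pandas versions without DataFrame.value_counts(), a similar table can be built with groupby. This is a different technique from the one in the answer, shown only as a sketch: the rows come out ordered by the group keys rather than by count, although for this small example the result happens to match.

```python
import pandas as pd

df = pd.DataFrame({'C1': ['A', 'B', 'A'], 'C2': ['A', 'B', 'A']})

# size() counts the rows in each (C1, C2) group; reset_index(name=...) turns
# the resulting Series into a DataFrame and names the counts column.
vc_df = df.groupby(['C1', 'C2']).size().reset_index(name='Count')
print(vc_df)
```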
