I have a DataFrame with a column of lists:
import pandas as pd
data_dict = {"Trace": [["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
                       ["B&M", "B&Q", "BLOG", "BYPAS"],
                       ["BLOG", "BYPAS", "CIM"],
                       ["A-M", "B&M", "B&Q", "BLOG"],
                       ["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
                       ["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
                       ["BLOG", "BYPAS", "CIM"],
                       ["BLOG", "BYPAS", "CIM"],
                       ["BLOG", "BYPAS", "CIM"]]}
data = pd.DataFrame(data_dict)
Trace
0 [A-M, B&M, B&Q, BLOG, BYPAS, CIM]
1 [B&M, B&Q, BLOG, BYPAS]
2 [BLOG, BYPAS, CIM]
3 [A-M, B&M, B&Q, BLOG]
4 [A-M, B&M, B&Q, BLOG, BYPAS, CIM]
5 [A-M, B&M, B&Q, BLOG, BYPAS, CIM]
6 [BLOG, BYPAS, CIM]
7 [BLOG, BYPAS, CIM]
8 [BLOG, BYPAS, CIM]
Is there a way to count how often each unique list occurs in the column, like value_counts(normalize=True) does for hashable values in pandas? The desired output looks like this:
Trace Count Percentage
0 [A-M, B&M, B&Q, BLOG, BYPAS, CIM]
1 [B&M, B&Q, BLOG, BYPAS]
2 [BLOG, BYPAS, CIM]
3 [A-M, B&M, B&Q, BLOG]
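df['Trace'].apply(tuple).value_counts() should do it (df here is your DataFrame, called data in the question). You have to convert each list into a tuple, which is immutable and hashable; lists are mutable and unhashable, so value_counts() cannot count them directly.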
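
To get both columns asked for in the question, something along these lines might work. This is a minimal sketch, not the only way to do it: the names counts and summary are made up for illustration, and the percentage is computed from the counts by hand rather than through normalize=True.

```python
import pandas as pd

data = pd.DataFrame({"Trace": [
    ["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
    ["B&M", "B&Q", "BLOG", "BYPAS"],
    ["BLOG", "BYPAS", "CIM"],
    ["A-M", "B&M", "B&Q", "BLOG"],
    ["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
    ["A-M", "B&M", "B&Q", "BLOG", "BYPAS", "CIM"],
    ["BLOG", "BYPAS", "CIM"],
    ["BLOG", "BYPAS", "CIM"],
    ["BLOG", "BYPAS", "CIM"],
]})

# Lists are unhashable, so convert each one to a tuple before counting.
# value_counts() sorts by frequency, most common first.
counts = data["Trace"].apply(tuple).value_counts()

# Build the summary table by hand (hypothetical layout matching the question):
# turn the tuple keys back into lists and derive the percentage from the counts.
summary = pd.DataFrame({
    "Trace": [list(t) for t in counts.index],
    "Count": counts.to_numpy(),
})
summary["Percentage"] = summary["Count"] / summary["Count"].sum() * 100

print(summary)
```

With the sample data, the first row is [BLOG, BYPAS, CIM] with a count of 4 out of 9 rows. If you only need fractions, data["Trace"].apply(tuple).value_counts(normalize=True) gives them directly.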