I'm subclassing pandas DataFrame in a project of mine. Most pandas operations preserve the subclass type, but df.groupby().agg() does not. Is this a bug? Is there a known workaround?
import pandas as pd


class MySeries(pd.Series):
    pass


class MyDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return MyDataFrame

    _constructor_sliced = MySeries


MySeries._constructor_expanddim = MyDataFrame

df = MyDataFrame({"a": reversed(range(10)), "b": list('aaaabbbccc')})

print(type(df.groupby("b").sum()))
# <class '__main__.MyDataFrame'>

print(type(df.groupby("b").agg({"a": "sum"})))
# <class 'pandas.core.frame.DataFrame'>
It looks like an earlier issue (described here) led to a fix that preserves the subclass for df.groupby, but as far as I can tell the df.groupby().agg() path was missed. I'm using pandas 2.0.3.
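For reference, one possible stopgap is to re-wrap the aggregated result in the subclass by hand. The sketch below is only that, not a confirmed fix: `agg_preserving_type` is a hypothetical helper name (it is not part of pandas), and it assumes that passing the plain DataFrame back through the subclass constructor is acceptable. Any custom attributes stored on the original frame would not be carried over this way.

```python
import pandas as pd


class MySeries(pd.Series):
    pass


class MyDataFrame(pd.DataFrame):
    @property
    def _constructor(self):
        return MyDataFrame

    _constructor_sliced = MySeries


MySeries._constructor_expanddim = MyDataFrame


def agg_preserving_type(frame, by, spec):
    """Hypothetical helper: run groupby().agg(), then re-wrap the result.

    The result is passed through the constructor of the caller's own class,
    so a MyDataFrame input yields a MyDataFrame output.
    """
    result = frame.groupby(by).agg(spec)
    return type(frame)(result)


df = MyDataFrame({"a": list(reversed(range(10))), "b": list("aaaabbbccc")})
out = agg_preserving_type(df, "b", {"a": "sum"})
print(type(out))
```

This only papers over the symptom at the call site; it does not answer whether the behaviour of agg() itself is intended.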