Chaining groupby and apply in pandas

Mamene

I'm looking for a way to chain groupby and apply, like this (see the code below for a concrete example):

df.groupby("a").apply(func_1).groupby("b").apply(func_2)

I guess it doesn't work because groupby needs a DataFrame as input, which is not always the case for the second groupby above (it may receive a Series instead, cf. the example). One solution could be to have the first apply return the result of func_1 together with the original DataFrame, but I haven't found how to do this.

I'm looking for a general workaround, not just a workaround for this specific example.

Example: Let's say that I want to compute the area under the curve of a for each group in b, and then compute the sum of these areas for each group in c.

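import numpy as np
import pandas as pd
from numpy import trapz  # assumption: the original does not show this import (named trapezoid in NumPy 2.0+)

df = pd.DataFrame({"a": np.arange(8),
                   "b": np.repeat(np.arange(4), 2),
                   "c": np.repeat(np.arange(2), 4)})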

df
   a  b  c
0  0  0  0
1  1  0  0
2  2  1  0
3  3  1  0
4  4  2  1
5  5  2  1
6  6  3  1
7  7  3  1


df.groupby("b").apply(lambda x: trapz(x["a"])).groupby("c").apply(sum)   
Traceback (most recent call last):
[...]
KeyError: 'c'
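
The KeyError can be traced to the shape of the intermediate result. The following is a minimal sketch, assuming a hand-written trapezoid helper (`trapezoid_area`, a hypothetical stand-in for the `trapz` used above): the first apply returns a Series indexed by b, so nothing named c is left for the second groupby to use.

```python
import numpy as np
import pandas as pd


def trapezoid_area(y):
    """Trapezoid rule with unit spacing; hypothetical stand-in for trapz."""
    y = np.asarray(y, dtype=float)
    return float(((y[1:] + y[:-1]) / 2).sum())


df = pd.DataFrame({"a": np.arange(8),
                   "b": np.repeat(np.arange(4), 2),
                   "c": np.repeat(np.arange(2), 4)})

# One scalar per group of b: the result is a Series, not a DataFrame.
per_b = df.groupby("b")["a"].apply(trapezoid_area)

print(type(per_b).__name__)  # Series
print(per_b.index.name)      # b

# The Series carries no "c" label, so grouping on it fails.
try:
    per_b.groupby("c").sum()
    raised = False
except KeyError:
    raised = True
print(raised)
```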


#Expected output
c
0     3.0
1    11.0


# I know that this code works, but I would like to avoid modifying
# my dataframe:

df["result"]=list(df
    .groupby("b").apply(lambda x: trapz(x["a"]))
    .repeat(df.groupby("b").size()))
df.groupby("b").first().groupby("c").result.sum()
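
One way to avoid writing a temporary column is to group on both keys, so that c survives in the index of the intermediate result and can be used for the second aggregation. This is only a sketch, and it assumes every group of b falls inside exactly one group of c, as in this example; `trapezoid_area` is a hypothetical stand-in for `trapz`.

```python
import numpy as np
import pandas as pd


def trapezoid_area(y):
    """Trapezoid rule with unit spacing; hypothetical stand-in for trapz."""
    y = np.asarray(y, dtype=float)
    return float(((y[1:] + y[:-1]) / 2).sum())


df = pd.DataFrame({"a": np.arange(8),
                   "b": np.repeat(np.arange(4), 2),
                   "c": np.repeat(np.arange(2), 4)})

# Grouping on (c, b) keeps c as an index level of the intermediate Series.
areas = df.groupby(["c", "b"])["a"].apply(trapezoid_area)

# Second step: aggregate the per-b areas over the c index level.
result = areas.groupby(level="c").sum()
print(result)
```

The original df is left untouched; only the intermediate Series carries both keys.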

Any help greatly appreciated!

YOLO

I think I would do something like:

# your_fun is the function you want to apply
df.groupby('c').apply(lambda f: sum(f.groupby('b')['a'].apply(your_fun)))
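
A runnable sketch of this nested approach on the question's data, assuming `your_fun` is a trapezoid-rule helper (a hypothetical stand-in for `trapz`). Selecting the columns explicitly before the outer apply is an added precaution, since some pandas versions warn when apply operates on the grouping column.

```python
import numpy as np
import pandas as pd


def your_fun(y):
    """Trapezoid rule with unit spacing; hypothetical stand-in for trapz."""
    y = np.asarray(y, dtype=float)
    return float(((y[1:] + y[:-1]) / 2).sum())


df = pd.DataFrame({"a": np.arange(8),
                   "b": np.repeat(np.arange(4), 2),
                   "c": np.repeat(np.arange(2), 4)})

# Outer groups on c; inside each one, compute an area per group of b and sum them.
result = df.groupby("c")[["a", "b"]].apply(
    lambda f: f.groupby("b")["a"].apply(your_fun).sum()
)
print(result)
```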

Edited 2021-06-3
