I have a Pandas DataFrame that looks like this:
| PLAYER | DATE | SCORE | GAME |
|---------|------------|-------|------|
| Albert | 2020-08-12 | 10 | X |
| Barney | 2020-08-12 | 100 | X |
| Charlie | 2020-08-12 | 1000 | X |
| Albert | 2020-08-13 | 20 | X |
| Barney | 2020-08-13 | 200 | X |
| Charlie | 2020-08-13 | 2000 | X |
| Albert | 2020-08-14 | 30 | Y |
| Barney | 2020-08-14 | 300 | Y |
| Charlie | 2020-08-14 | 3000 | Y |
| Albert | 2020-08-15 | 40 | Y |
| Barney | 2020-08-15 | 400 | Y |
| Charlie | 2020-08-15 | 4000 | Y |
| Albert | 2020-08-16 | 50 | Z |
| Barney | 2020-08-16 | 500 | Z |
| Charlie | 2020-08-16 | 5000 | Z |
| Albert | 2020-08-17 | 60 | Z |
| Barney | 2020-08-17 | 600 | Z |
| Charlie | 2020-08-17 | 6000 | Z |
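I am trying to create a new column containing the 2-day average score, computed separately for each player, so that I get the following result: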
| PLAYER | DATE | SCORE | GAME | 2-DAY AVG |
|---------|------------|-------|------|-----------|
| Albert | 2020-08-12 | 10 | X | NaN |
| Barney | 2020-08-12 | 100 | X | NaN |
| Charlie | 2020-08-12 | 1000 | X | NaN |
| Albert | 2020-08-13 | 20 | X | 15 |
| Barney | 2020-08-13 | 200 | X | 150 |
| Charlie | 2020-08-13 | 2000 | X | 1500 |
| Albert | 2020-08-14 | 30 | Y | 25 |
| Barney | 2020-08-14 | 300 | Y | 250 |
| Charlie | 2020-08-14 | 3000 | Y | 2500 |
| Albert | 2020-08-15 | 40 | Y | 35 |
| Barney | 2020-08-15 | 400 | Y | 350 |
| Charlie | 2020-08-15 | 4000 | Y | 3500 |
| Albert | 2020-08-16 | 50 | Z | 45 |
| Barney | 2020-08-16 | 500 | Z | 450 |
| Charlie | 2020-08-16 | 5000 | Z | 4500 |
| Albert | 2020-08-17 | 60 | Z | 55 |
| Barney | 2020-08-17 | 600 | Z | 550 |
| Charlie | 2020-08-17 | 6000 | Z | 5500 |
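I have searched Stack Overflow and tried several combinations of groupby() with rolling(2).mean(), as well as Python conditional statements, but without success.
Is there a clever way to do this in Pandas?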
This should do what you want:
df['2-DAY AVG'] = df.groupby('PLAYER')['SCORE'].transform(lambda x: x.rolling(2).mean())
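As a minimal, self-contained sketch of the per-player rolling mean, the following rebuilds a reduced copy of the sample data (three days instead of six, which is an assumption made only to keep the example short) and uses transform so that the result stays aligned with the original row index. It assumes the rows are already sorted by DATE within each player and that there is exactly one row per player per day.

```python
import pandas as pd

# Reduced copy of the question's sample data (first three days only).
df = pd.DataFrame({
    "PLAYER": ["Albert", "Barney", "Charlie"] * 3,
    "DATE": pd.to_datetime(
        ["2020-08-12"] * 3 + ["2020-08-13"] * 3 + ["2020-08-14"] * 3
    ),
    "SCORE": [10, 100, 1000, 20, 200, 2000, 30, 300, 3000],
    "GAME": ["X"] * 6 + ["Y"] * 3,
})

# For each player, average the current row and the previous row.
# transform returns a Series indexed like df, so it can be assigned directly.
df["2-DAY AVG"] = (
    df.groupby("PLAYER")["SCORE"]
      .transform(lambda s: s.rolling(2).mean())
)

print(df)
```

Note that rolling(2) counts rows, not calendar days. If the data is not sorted, or a player can have missing days, sorting by DATE first (or using a time-based window) would probably be needed.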