I have a DataFrame:
import pandas as pd
data = pd.DataFrame({'start':['2020-08-01','2020-08-02','2020-08-03','2020-08-04','2020-08-05','2020-08-06','2020-08-07','2020-08-08'],
'end':['2020-08-03','2020-08-03','2020-08-06','2020-08-06','2020-08-06','2020-08-08','2020-08-08','2020-08-08'],
'score':[74, 81, 38, 49, 79, 17, 53, 69]})
I need to calculate the difference in score between each start date and its corresponding end date:
start end score result
0 2020-08-01 2020-08-03 74 36 # 74-38 as score on 08/03 is 38
1 2020-08-02 2020-08-03 81 43 # 81-38
2 2020-08-03 2020-08-06 38 21 # 38-17 as score on 08/06 is 17
3 2020-08-04 2020-08-06 49 32 # 49-17
4 2020-08-05 2020-08-06 79 62 # 79-17
5 2020-08-06 2020-08-08 17 -52 # 17-69 as score on 08/08 is 69
6 2020-08-07 2020-08-08 53 -16 # 53-69
7 2020-08-08 2020-08-08 69 0 # 69-69
Is there a good pandas way to do this? Thanks a lot!
If all start values are unique, map each end date to the score recorded on that date and subtract it:
data['result'] = data['score'].sub(data['end'].map(data.set_index('start')['score']))
print (data)
start end score result
0 2020-08-01 2020-08-03 74 36
1 2020-08-02 2020-08-03 81 43
2 2020-08-03 2020-08-06 38 21
3 2020-08-04 2020-08-06 49 32
4 2020-08-05 2020-08-06 79 62
5 2020-08-06 2020-08-08 17 -52
6 2020-08-07 2020-08-08 53 -16
7 2020-08-08 2020-08-08 69 0
Details (the mapped score for each end date):
print (data['end'].map(data.set_index('start')['score']))
0 38
1 38
2 17
3 17
4 17
5 69
6 69
7 69
Name: end, dtype: int64
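The answer above relies on the start values being unique. The following is a minimal sketch of one way to cope when they are not, assuming that the last score recorded for a repeated start date is the one to use (that rule is an assumption, not something the question states). The sample data here is hypothetical and includes an end date that never appears in start, to show that such rows come out as NaN.

```python
import pandas as pd

# Hypothetical data: the start date 2020-08-03 appears twice,
# and the end date 2020-08-09 has no matching start date.
data = pd.DataFrame({
    'start': ['2020-08-01', '2020-08-03', '2020-08-03', '2020-08-06'],
    'end':   ['2020-08-03', '2020-08-06', '2020-08-06', '2020-08-09'],
    'score': [74, 38, 40, 17],
})

# Build a lookup with one score per start date.
# Assumption: for a repeated start date, the last occurrence wins.
lookup = data.drop_duplicates('start', keep='last').set_index('start')['score']

# Map each end date to its score and subtract it from the row's own score.
# End dates missing from the lookup map to NaN, so result is a float column.
data['result'] = data['score'].sub(data['end'].map(lookup))
print(data)
```

Deduplicating first matters because mapping through a Series whose index contains duplicates is ambiguous. If a different rule suits the data better (first occurrence, mean per date), only the line that builds lookup needs to change.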