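I have per-second traffic data showing the number of cars going in and out. I want to aggregate it by In / Out into 2-minute bins and show the totals. For example, with this data: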
import pandas as pd
data = {'time': ["13:34:16","13:34:19","13:34:52","13:34:55","13:34:58","13:35:01","13:35:04","13:35:37","13:35:40","13:35:43","13:36:37","13:36:39","13:36:43","13:36:46","13:36:49","13:36:52","13:36:58","13:37:04","13:37:07","13:37:13","13:37:46","13:37:49","13:37:58",],
'cars' : [15,22,12,1,331,32,14,5,51,13,3,22,5,2,4,1,3,5,89,105,1,63,1,],
'flow': ["In","Out","In","Unknown","Out","In","Out","Unknown","Out","Out","In","In","Unknown","In","In","Out","In","In","In","In","In","In","In",]}
I tried:
df = pd.DataFrame(data)
df.time = '2020-01-23 ' + df.time # data date
df.time = pd.to_datetime(df.time, unit='s')
print (df.groupby('flow').resample('2T')['cars'].sum())
But it raises this error:
ValueError: non convertible value 2020-01-23 13:34:16 with the unit 's'
Any help with the right approach? Thanks.
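I believe you should resample on the index. The error itself comes from unit='s', which is meant for numeric epoch values rather than date strings, so drop that argument. Could you try this: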
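df.time = pd.to_datetime(df.time)  # parse the date strings directly, without unit='s'
# Resampling needs a DatetimeIndex; '2T' means 2 minutes (newer pandas prefers '2min')
df_new = df.set_index("time").groupby('flow').resample('2T')['cars'].sum()
print(df_new)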
flow     time
In       2020-01-23 13:34:00     59
         2020-01-23 13:36:00    298
Out      2020-01-23 13:34:00    431
         2020-01-23 13:36:00      1
Unknown  2020-01-23 13:34:00      6
         2020-01-23 13:36:00      5
Name: cars, dtype: int64
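To illustrate why the question's to_datetime call failed, here is a minimal sketch contrasting the two kinds of input. The sample values and the variable names stamps and epoch are made up for illustration. The unit argument applies to numbers (seconds since the epoch here), while date strings are parsed on their own, optionally with an explicit format.

```python
import pandas as pd

# Date strings: parse directly; the explicit format is optional but unambiguous.
stamps = pd.to_datetime(
    pd.Series(["2020-01-23 13:34:16", "2020-01-23 13:34:19"]),
    format="%Y-%m-%d %H:%M:%S",
)

# Numbers: unit="s" reads them as seconds since 1970-01-01.
epoch = pd.to_datetime(pd.Series([0, 60]), unit="s")

# Flooring to 2 minutes shows which bin each timestamp would fall into.
print(stamps.dt.floor("2min").tolist())
print(epoch.tolist())
```

Both sample strings land in the 13:34:00 bin, which matches the first row of each flow in the output above.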
Also, if you want to reproduce your Excel layout:
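# Move the time level into columns, then transpose so each row is a 2-minute bin
df_new = df_new.unstack().T
# Row-wise sum across In / Out / Unknown
df_new["Total"] = df_new.sum(axis=1)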
print(df_new)
flow                  In  Out  Unknown  Total
time
2020-01-23 13:34:00   59  431        6    496
2020-01-23 13:36:00  298    1        5    304
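As an alternative sketch (not the approach used above), the same wide table can probably be built in one pass by grouping on a pd.Grouper for the time column together with flow, without setting the index first. The variable names times, cars, flow, and table are chosen here for illustration, and the '2min' alias is assumed in place of '2T'.

```python
import pandas as pd

times = ["13:34:16", "13:34:19", "13:34:52", "13:34:55", "13:34:58", "13:35:01",
         "13:35:04", "13:35:37", "13:35:40", "13:35:43", "13:36:37", "13:36:39",
         "13:36:43", "13:36:46", "13:36:49", "13:36:52", "13:36:58", "13:37:04",
         "13:37:07", "13:37:13", "13:37:46", "13:37:49", "13:37:58"]
cars = [15, 22, 12, 1, 331, 32, 14, 5, 51, 13, 3, 22, 5, 2, 4, 1, 3, 5, 89, 105, 1, 63, 1]
flow = ["In", "Out", "In", "Unknown", "Out", "In", "Out", "Unknown", "Out", "Out",
        "In", "In", "Unknown", "In", "In", "Out", "In", "In", "In", "In", "In", "In", "In"]

df = pd.DataFrame({
    # Prefix the data date, then parse the strings (no unit argument needed).
    "time": pd.to_datetime("2020-01-23 " + pd.Series(times)),
    "cars": cars,
    "flow": flow,
})

# Group by 2-minute bins of the time column and by flow, then sum the cars.
table = (
    df.groupby([pd.Grouper(key="time", freq="2min"), "flow"])["cars"]
      .sum()
      .unstack("flow", fill_value=0)  # one column per flow value
)
table["Total"] = table.sum(axis=1)  # row-wise total per bin
print(table)
```

The fill_value=0 argument only matters when some flow value is missing from a bin; with this data every combination is present.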