You can use:
import pandas as pd

df = pd.DataFrame({
    'Event': list('abc'),
    'StartTime': ['24-12-19 1:14', '22-12-19 0:32', '23-12-19 6:00'],
    'EndTime': ['24-12-19 6:00', '24-12-19 4:32', '24-12-19 16:00']
})
# parse both columns as day-first datetimes
df[['StartTime', 'EndTime']] = df[['StartTime', 'EndTime']].apply(pd.to_datetime, dayfirst=True)

df1 = (df.melt('Event')
         .set_index('value')
         .groupby('Event')['Event']
         .resample('H')  # hourly bins; recent pandas versions spell this alias 'h'
         .count()
         .reset_index(name='val')
         .assign(val=1,
                 date=lambda x: x['value'].dt.date,
                 hour=lambda x: x['value'].dt.hour)
         .set_index(['Event', 'date', 'hour'])['val']
         .unstack(fill_value=0)
         .reset_index()
         .rename_axis(None, axis=1)
      )
print(df1)
  Event        date  0  1  2  3  4  5  6  7  ...  14  15  16  17  18  19  20  \
0     a  2019-12-24  0  1  1  1  1  1  1  0  ...   0   0   0   0   0   0   0
1     b  2019-12-22  1  1  1  1  1  1  1  1  ...   1   1   1   1   1   1   1
2     b  2019-12-23  1  1  1  1  1  1  1  1  ...   1   1   1   1   1   1   1
3     b  2019-12-24  1  1  1  1  1  0  0  0  ...   0   0   0   0   0   0   0
4     c  2019-12-23  0  0  0  0  0  0  1  1  ...   1   1   1   1   1   1   1
5     c  2019-12-24  1  1  1  1  1  1  1  1  ...   1   1   1   0   0   0   0

   21  22  23
0   0   0   0
1   1   1   1
2   1   1   1
3   0   0   0
4   1   1   1
5   0   0   0

[6 rows x 26 columns]
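Explanation:
1. Convert both columns to datetimes with DataFrame.apply and to_datetime.
2. Unpivot with DataFrame.melt.
3. Add every hour between the first and last timestamp of each group with DataFrameGroupBy.resample.
4. In DataFrame.assign, set all values of val to 1 and derive the date with Series.dt.date and the hour with Series.dt.hour.
5. Reshape with DataFrame.set_index and Series.unstack.
6. Finally, clean up with DataFrame.reset_index and DataFrame.rename_axis.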
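To make the melt and resample steps more concrete, here is a minimal sketch that runs them on a single event from the sample data and prints the intermediate frames. It resamples with `pd.Timedelta(hours=1)` instead of the `'H'` alias, which should give the same hourly bins without depending on how the alias is spelled in a given pandas version. The variable names `long`, `one_hour` and `hourly`, and the choice to count the `variable` column, are illustrative assumptions rather than part of the original answer.

```python
import pandas as pd

# Single event taken from the sample data above.
df = pd.DataFrame({
    'Event': ['a'],
    'StartTime': pd.to_datetime(['2019-12-24 01:14']),
    'EndTime': pd.to_datetime(['2019-12-24 06:00']),
})

# melt stacks StartTime and EndTime into one 'value' column and keeps
# the original column name in 'variable'.
long = df.melt('Event')
print(long)

# Resampling per event creates one row for every hourly bin between the
# earliest and the latest timestamp of that event.
one_hour = pd.Timedelta(hours=1)  # illustrative stand-in for the 'H' alias
hourly = (long.set_index('value')
              .groupby('Event')['variable']
              .resample(one_hour)
              .count()
              .reset_index(name='val'))
print(hourly)
```

The counts in `val` only mark which bins actually contain a timestamp, which is why the answer overwrites `val` with 1 afterwards: every bin between start and end should count as occupied.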
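Edit:
To handle partial hours at the start and end, use a similar solution. Get the elapsed fraction of the hour by subtracting the hour floor (Series.dt.floor) from each timestamp; for start times, subtract that fraction from 1. It is then possible to use first with resample: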
# changed times
df = pd.DataFrame({
    'Event': list('abc'),
    'StartTime': ['24-12-19 1:20', '22-12-19 0:30', '23-12-19 6:00'],
    'EndTime': ['24-12-19 6:20', '24-12-19 4:40', '24-12-19 16:00']
})
df[['StartTime', 'EndTime']] = df[['StartTime', 'EndTime']].apply(pd.to_datetime, dayfirst=True)

# fraction of the hour elapsed at each timestamp
f = lambda x: x['value'].sub(x['value'].dt.floor('H')).dt.total_seconds().div(3600)

df1 = (df.melt('Event')
         .assign(h=f)
         # a start time covers the remainder of its hour
         .assign(h=lambda x: x.h.mask(x.variable == 'StartTime', 1 - x.h))
         .set_index('value')
         .groupby('Event')['h']
         .resample('H')
         .first()
         .fillna(1)  # hours strictly between start and end are fully covered
         .reset_index(name='h')
         .assign(date=lambda x: x['value'].dt.date,
                 hour=lambda x: x['value'].dt.hour)
         .set_index(['Event', 'date', 'hour'])['h']
         .unstack(fill_value=0)
         .reset_index()
         .rename_axis(None, axis=1)
      )
print(df1)
  Event        date    0         1    2    3         4    5         6    7  \
0     a  2019-12-24  0.0  0.666667  1.0  1.0  1.000000  1.0  0.333333  0.0
1     b  2019-12-22  0.5  1.000000  1.0  1.0  1.000000  1.0  1.000000  1.0
2     b  2019-12-23  1.0  1.000000  1.0  1.0  1.000000  1.0  1.000000  1.0
3     b  2019-12-24  1.0  1.000000  1.0  1.0  0.666667  0.0  0.000000  0.0
4     c  2019-12-23  0.0  0.000000  0.0  0.0  0.000000  0.0  1.000000  1.0
5     c  2019-12-24  1.0  1.000000  1.0  1.0  1.000000  1.0  1.000000  1.0

   ...   14   15   16   17   18   19   20   21   22   23
0  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
1  ...  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
2  ...  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
3  ...  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
4  ...  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0
5  ...  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0

[6 rows x 26 columns]
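One case the hourly first approach does not seem to cover is an event that starts and ends within the same hour: both timestamps fall into one bin, and first keeps only the value derived from the start time. The sketch below reproduces this for a single event running from 18:06 to 18:07. It computes the elapsed fraction from the minute and second components rather than with Series.dt.floor, and resamples with `pd.Timedelta(hours=1)`; both are simplifications chosen for this illustration, and the variable names are assumptions.

```python
import pandas as pd

df = pd.DataFrame({
    'Event': ['a'],
    'StartTime': pd.to_datetime(['2019-12-20 18:06']),
    'EndTime': pd.to_datetime(['2019-12-20 18:07']),
})
one_hour = pd.Timedelta(hours=1)

long = df.melt('Event')
# Fraction of the hour elapsed at each timestamp (minutes and seconds only).
elapsed = (long['value'].dt.minute * 60 + long['value'].dt.second) / 3600
# Start times count the remainder of their hour, as in the hourly solution.
long['h'] = elapsed.mask(long['variable'] == 'StartTime', 1 - elapsed)

# Both timestamps fall into the 18:00 bin, so first() keeps the start value.
hourly_first = (long.set_index('value')
                    .groupby('Event')['h']
                    .resample(one_hour)
                    .first())
# Actual share of the hour covered by the event.
true_fraction = (df['EndTime'] - df['StartTime']).iloc[0] / one_hour

print(hourly_first.iloc[0])
print(true_fraction)
```

Here first keeps 0.9 for the 18:00 bin, while the event really covers one minute, that is 1/60 of the hour.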
Edit 1: The idea is to resample by minutes and then aggregate per hour:
df = pd.DataFrame({
    'Event': list('abc'),
    'StartTime': ['20-12-19 18:06', '22-12-19 0:32', '23-12-19 6:00'],
    'EndTime': ['20-12-19 18:07', '24-12-19 4:32', '24-12-19 16:00']
})
df[['StartTime', 'EndTime']] = df[['StartTime', 'EndTime']].apply(pd.to_datetime, dayfirst=True)

# fraction of the minute elapsed at each timestamp
f = lambda x: x['value'].sub(x['value'].dt.floor('Min')).dt.total_seconds().div(60)

df1 = (df.melt('Event')
         .assign(h=f)
         # a start time covers the remainder of its minute
         .assign(h=lambda x: x.h.mask(x.variable == 'StartTime', 1 - x.h))
         .set_index('value')
         .groupby('Event')['h']
         .resample('Min')
         .first()
         .fillna(1)  # minutes strictly between start and end are fully covered
         .reset_index(name='h')
         .assign(date=lambda x: x['value'].dt.date,
                 hour=lambda x: x['value'].dt.hour)
         .groupby(['Event', 'date', 'hour'])['h']
         .sum()  # covered minutes per hour
         .unstack(fill_value=0)
         .div(60)  # minutes -> fraction of the hour
         .reset_index()
         .rename_axis(None, axis=1)
      )
print(df1)
  Event        date         0    1    2    3         4    5    6    7    8  \
0     a  2019-12-20  0.000000  0.0  0.0  0.0  0.000000  0.0  0.0  0.0  0.0
1     b  2019-12-22  0.466667  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0
2     b  2019-12-23  1.000000  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0
3     b  2019-12-24  1.000000  1.0  1.0  1.0  0.533333  0.0  0.0  0.0  0.0
4     c  2019-12-23  0.000000  0.0  0.0  0.0  0.000000  0.0  1.0  1.0  1.0
5     c  2019-12-24  1.000000  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0

     9   10   11   12   13   14   15   16   17        18   19   20   21   22  \
0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.016667  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0
2  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0
3  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.000000  0.0  0.0  0.0  0.0
4  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.000000  1.0  1.0  1.0  1.0
5  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0  0.0  0.000000  0.0  0.0  0.0  0.0

    23
0  0.0
1  1.0
2  1.0
3  0.0
4  1.0
5  0.0
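As an independent cross-check of the fractions printed above, the per-hour coverage can also be computed directly as the overlap between the event interval and each hour bucket. This is a plain-Python sketch using only the standard library, not the pandas approach from the answer; `hour_fractions` is a hypothetical helper name introduced here.

```python
from datetime import datetime, timedelta

def hour_fractions(start, end):
    """Map each hour bucket touched by [start, end) to the covered fraction."""
    one_hour = timedelta(hours=1)
    bucket = start.replace(minute=0, second=0, microsecond=0)
    out = {}
    while bucket < end:
        # Overlap between the event and the bucket [bucket, bucket + 1h).
        overlap = min(end, bucket + one_hour) - max(start, bucket)
        out[bucket] = overlap / one_hour
        bucket += one_hour
    return out

# Event 'a' from the first edit: 01:20 to 06:20 on 2019-12-24.
long_event = hour_fractions(datetime(2019, 12, 24, 1, 20),
                            datetime(2019, 12, 24, 6, 20))
# Event 'a' from Edit 1: 18:06 to 18:07 on 2019-12-20.
short_event = hour_fractions(datetime(2019, 12, 20, 18, 6),
                             datetime(2019, 12, 20, 18, 7))

for bucket, frac in list(long_event.items()) + list(short_event.items()):
    print(bucket, round(frac, 6))
```

For these two events the rounded fractions agree with the corresponding rows of the pandas output. Hours the event does not touch are simply absent from the dictionary rather than filled with 0.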