I'm trying to visualize my data, but when I plot the points the lines seem to go all over the place.
Here is a snippet of the data:
Date time_began time_end activecalls date_start date_end
7/3/2020 14:08:47 14:09:30 2 7/3/2020 14:08 7/3/2020 14:09
7/3/2020 14:06:05 14:06:48 4 7/3/2020 14:06 7/3/2020 14:06
7/3/2020 15:11:36 15:12:19 6 7/3/2020 15:11 7/3/2020 15:12
7/3/2020 13:37:52 13:38:35 1 7/3/2020 13:37 7/3/2020 13:38
7/3/2020 14:19:31 14:20:14 3 7/3/2020 14:19 7/3/2020 14:20
7/3/2020 13:58:01 13:58:44 1 7/3/2020 13:58 7/3/2020 13:58
7/3/2020 16:56:32 16:57:15 3 7/3/2020 16:56 7/3/2020 16:57
7/3/2020 16:15:26 16:16:09 6 7/3/2020 16:15 7/3/2020 16:16
7/3/2020 14:35:16 14:35:59 3 7/3/2020 14:35 7/3/2020 14:35
7/3/2020 15:54:48 15:55:31 9 7/3/2020 15:54 7/3/2020 15:55
7/3/2020 16:01:39 16:02:22 3 7/3/2020 16:01 7/3/2020 16:02
7/3/2020 15:52:51 15:53:34 4 7/3/2020 15:52 7/3/2020 15:53
When I run it, the chart looks like this:
This is what I want it to look like:
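There is nothing wrong with the code you use to plot the data; the data itself just doesn't match what you expect. I'm making some assumptions here, but based on previous work I think you need to do two things to fix this.
You have overwritten your dataframe, restricting your data to only the rows between '7/1/2020 16:08' and '7/4/2020 15:10', here: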
mask = (df['date_start'] > day1) & (df['date_end'] <= day2)
df = df.loc[mask]
I'm not sure whether this is intentional, perhaps just to check the first few days, but your expected chart goes up to 2009, so I'd recommend removing these lines.
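One way to check whether the mask is what trims the chart is to count how many rows it discards before assigning the result back to df. The sketch below is only illustrative: the bounds are the two timestamps quoted above, the month-first date format is an assumption, and the last row of the sample frame is a made-up row added to show a value falling outside the window.

```python
import pandas as pd

FMT = "%m/%d/%Y %H:%M"  # assumption: the strings are month/day/year

# Three rows from the question's snippet plus one HYPOTHETICAL row (the last)
# that falls outside the window, purely for illustration.
df = pd.DataFrame({
    "date_start": ["7/3/2020 14:08", "7/3/2020 14:06", "7/3/2020 15:11", "7/5/2020 09:00"],
    "date_end": ["7/3/2020 14:09", "7/3/2020 14:06", "7/3/2020 15:12", "7/5/2020 09:01"],
    "activecalls": [2, 4, 6, 5],
})
for col in ("date_start", "date_end"):
    df[col] = pd.to_datetime(df[col], format=FMT)

day1 = pd.to_datetime("7/1/2020 16:08", format=FMT)
day2 = pd.to_datetime("7/4/2020 15:10", format=FMT)

# Same condition as in the original code.
mask = (df["date_start"] > day1) & (df["date_end"] <= day2)
rows_dropped = int((~mask).sum())
print(f"mask keeps {int(mask.sum())} rows, drops {rows_dropped}")

# Keeping the result under a new name leaves the full df available for plotting.
filtered = df.loc[mask]
```

If rows_dropped is large on the real data, the mask is the likely reason the chart stops early.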
Looking at the figures in your data snippet and comparing them to your expected output, the data is quite granular from 2002 - 2009. If you want to aggregate the sum of active calls by day, include a groupby() with a pd.Grouper() that sets the frequency to daily:
df.groupby(pd.Grouper(key='date_start', freq='D'))['activecalls'].sum()
From here you can plot the data by adding .plot(), which by default draws a line chart, since the index is now your date field (aggregated by day):
df.groupby(pd.Grouper(key='date_start', freq='D'))['activecalls'].sum().plot()
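Put together, the aggregation step might look like the sketch below. It rebuilds the date_start and activecalls columns of the snippet by hand; the helper name daily_active_calls and the month-first date format are assumptions made for illustration. Note that pd.Grouper with a freq only works if date_start has already been converted to a datetime dtype.

```python
import pandas as pd

# date_start and activecalls copied from the snippet in the question.
starts = [
    "7/3/2020 14:08", "7/3/2020 14:06", "7/3/2020 15:11", "7/3/2020 13:37",
    "7/3/2020 14:19", "7/3/2020 13:58", "7/3/2020 16:56", "7/3/2020 16:15",
    "7/3/2020 14:35", "7/3/2020 15:54", "7/3/2020 16:01", "7/3/2020 15:52",
]
calls = [2, 4, 6, 1, 3, 1, 3, 6, 3, 9, 3, 4]
df = pd.DataFrame({"date_start": starts, "activecalls": calls})

# Assumption: the strings are month/day/year.
df["date_start"] = pd.to_datetime(df["date_start"], format="%m/%d/%Y %H:%M")


def daily_active_calls(frame):
    """Sum activecalls per calendar day of date_start (hypothetical helper)."""
    grouper = pd.Grouper(key="date_start", freq="D")
    return frame.groupby(grouper)["activecalls"].sum()


daily = daily_active_calls(df)
print(daily)
# daily.plot()  # requires matplotlib; draws one point per day
```

Because every row in the snippet falls on the same day, this returns a single daily total; on the full dataset it would return one value per day, in date order.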
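Finally, there are a few inconsistencies in your code that would be worth going through and cleaning up:
You convert date_start and date_end to pd datetime twice, so the second instance of this can be removed.
You overwrite the activecalls column with another method of creating it. Decide which one is correct and remove the other.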
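As a rough sketch of the first cleanup, both date columns can be converted once, in a single place, so there is no second conversion to keep in sync. The two sample rows come from the snippet in the question, and the month-first date format is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "date_start": ["7/3/2020 14:08", "7/3/2020 14:06"],
    "date_end": ["7/3/2020 14:09", "7/3/2020 14:06"],
})

date_cols = ["date_start", "date_end"]
# Convert both columns in one step; assumption: month/day/year strings.
df[date_cols] = df[date_cols].apply(pd.to_datetime, format="%m/%d/%Y %H:%M")
print(df.dtypes)
```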