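As the result of a simulation, I parsed the output with Pandas using groupby(). I am having some trouble plotting the data the way I want. Here is the Pandas output I am trying to plot (abridged for simplicity):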
                 Avg-del   Min-del    Max-del   Avg-retx  Min-retx    Max-retx
Prob Producers
0.3  1          8.060291  0.587227  26.709371  42.931779  5.130041  136.216642
     5          8.330889  0.371387  54.468836  43.166326  3.340193  275.932170
     10         1.012147  0.161975   4.320447   6.336965  2.026241   19.177802
0.5  1          8.039639  0.776463  26.053635  43.160880  5.798276  133.090358
     5          4.729875  0.289472  26.717824  25.732373  2.909811  135.289244
     10         1.043738  0.160671   4.353993   6.461914  2.015735   19.595393
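My y-axis is the delay and my x-axis is the number of producers. I want error bars for each probability: one set of error bars for p=0.3 and another for p=0.5. My Python script is as follows: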
import sys
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
pd.set_option('display.expand_frame_repr', False)
outputFile = 'averages.txt'
f_out = open(outputFile, 'w')
data = pd.read_csv(sys.argv[1], delimiter=",")
result = data.groupby(["Prob", "Producers"]).mean()
print("Writing to output file: " + outputFile)
result_s = str(result)
f_out.write(result_s)
f_out.close()
*** Update from James ***
# iterate over the first level of the dataframe index (one pass per probability)
for prob_index in result.index.levels[0]:
    r = result.loc[prob_index]
    labels = list(r.columns)
    lines = plt.plot(r)
    # label each line with its probability and column name
    for col, line in zip(labels, lines):
        line.set_label(str(prob_index) + " " + col)
ax = plt.gca()
ax.legend()
ax.set_xticks(r.index)
ax.set_ylabel('Latency (s)')
ax.set_xlabel('Number of producer nodes')
plt.show()
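Now I have 4 sliced arrays, one for each probability. How can I slice them again by delay (del) and retx, and plot error bars based on the avg, min and max values?
OK, there is a lot going on here. First, it is plotting 6 lines. When your code calls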
plt.plot(np.transpose(np.array(result)[0:3, 0:3]), label = 'p=0.3')
plt.plot(np.transpose(np.array(result)[3:6, 0:3]), label = 'p=0.5')
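it is calling plt.plot on a 3x3 array of data. plt.plot does not interpret this input as x and y; it treats it as 3 separate series of y-values, each with 3 points. Because no x values are given, it fills in 0, 1, 2. In other words, for the first plot call it is plotting the data:
x = [0,1,2]; y = [8.060291, 8.330889, 1.012147]
x = [0,1,2]; y = [0.587227, 0.371387, 0.161975]
x = [0,1,2]; y = [26.709371, 54.468836, 4.320447]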
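The implicit x positions can be checked directly. The following is a minimal sketch, not part of the original answer, that uses only the Avg-del values for p=0.3 from the table in the question. The variable names (avg_delay, producers, implicit_line, explicit_line) are made up for illustration, and the Agg backend is an assumption so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt

# Avg-del values for p=0.3, copied from the table in the question.
avg_delay = [8.060291, 8.330889, 1.012147]
producers = [1, 5, 10]

fig, ax = plt.subplots()

# No x values passed: matplotlib places the points at the positions 0, 1, 2.
(implicit_line,) = ax.plot(avg_delay)

# x values passed explicitly: the points sit at the producer counts.
(explicit_line,) = ax.plot(producers, avg_delay)

print(list(implicit_line.get_xdata()))
print(list(explicit_line.get_xdata()))
```

Passing the x values explicitly, or plotting a DataFrame whose index holds them, is what moves the points from positions 0, 1, 2 to the actual producer counts.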
Based on your x labels, I think you want these values to be x = [1,5,10]. Try this and see whether it gives you the plot you want.
# iterate over the first dataframe index
for prob_index in result.index.levels[0]:
    r = result.loc[prob_index]
    labels = list(r.columns)
    lines = plt.plot(r)
    # label each line with its probability and column name
    for col, line in zip(labels, lines):
        line.set_label(str(prob_index) + " " + col)
ax = plt.gca()
ax.legend()
ax.set_xticks(r.index)
ax.set_ylabel('Latency (s)')
ax.set_xlabel('Number of producer nodes')
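For the follow-up question about error bars, one possible approach is matplotlib's errorbar with asymmetric errors: the average as the point, the distance down to the minimum as the lower error, and the distance up to the maximum as the upper error. The sketch below is not from the original answer. It rebuilds the grouped frame by hand from the table in the question, and the helper name plot_with_errorbars and its metric argument ('del' or 'retx') are hypothetical names chosen for illustration. The Agg backend is assumed so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line for on-screen plots
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Rebuild the grouped frame by hand, with values copied from the table in the question.
index = pd.MultiIndex.from_product([[0.3, 0.5], [1, 5, 10]], names=["Prob", "Producers"])
result = pd.DataFrame(
    {
        "Avg-del": [8.060291, 8.330889, 1.012147, 8.039639, 4.729875, 1.043738],
        "Min-del": [0.587227, 0.371387, 0.161975, 0.776463, 0.289472, 0.160671],
        "Max-del": [26.709371, 54.468836, 4.320447, 26.053635, 26.717824, 4.353993],
        "Avg-retx": [42.931779, 43.166326, 6.336965, 43.160880, 25.732373, 6.461914],
        "Min-retx": [5.130041, 3.340193, 2.026241, 5.798276, 2.909811, 2.015735],
        "Max-retx": [136.216642, 275.932170, 19.177802, 133.090358, 135.289244, 19.595393],
    },
    index=index,
)


def plot_with_errorbars(frame, metric, ax):
    """Hypothetical helper: one error-bar series per probability for 'del' or 'retx'."""
    containers = {}
    for prob in frame.index.levels[0]:
        r = frame.loc[prob]  # rows for this probability, indexed by Producers
        avg = r["Avg-" + metric].to_numpy()
        low = r["Min-" + metric].to_numpy()
        high = r["Max-" + metric].to_numpy()
        # Asymmetric errors, shape (2, N): distance down to min, distance up to max.
        yerr = np.array([avg - low, high - avg])
        containers[float(prob)] = ax.errorbar(
            r.index.to_numpy(), avg, yerr=yerr,
            marker="o", capsize=4, label=f"p={float(prob)}",
        )
    ax.set_xticks(list(r.index))
    ax.set_xlabel("Number of producer nodes")
    ax.legend()
    return containers


fig, ax = plt.subplots()
containers = plot_with_errorbars(result, "del", ax)
ax.set_ylabel("Latency (s)")
```

Calling plot_with_errorbars(result, "retx", ax) on a second axes would draw the retransmission columns the same way. With an interactive backend, plt.show() displays the figure. Because the two probabilities share the same x positions, their error bars overlap; a small horizontal offset per series is one way to separate them.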