While training a neural network in Python with TensorFlow 2.0, I've noticed that the training accuracy and loss change dramatically between epochs. I know that the printed metrics are averaged over the entire epoch, but the accuracy seems to drop significantly after every epoch, even though the average always increases within an epoch.
The loss shows the mirrored behavior, decreasing significantly within each epoch but jumping back up at each epoch transition. Here is a picture of what I mean (from TensorBoard):
I've noticed this behavior in every model I've implemented myself, so it could be a bug, but I'd like a second opinion on whether this is normal behavior, and if so, why it happens.
Also, I'm using a fairly large dataset (roughly 3 million examples). The batch size is 32, and each dot in the accuracy/loss graphs represents 50 batches (so 2k on the graph = 100k batches). The learning-rate graph is 1:1 with batches.
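For reference, my logging setup looks roughly like this; the model and dataset below are simplified placeholders rather than the actual code:

```python
import tensorflow as tf

# Placeholder model; the real architecture doesn't matter for the question.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Stand-in for the real ~3M-example pipeline, batched at 32.
train_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([1024, 20]),
     tf.cast(tf.random.uniform([1024, 1]) > 0.5, tf.float32)),
).batch(32)

# update_freq=50 makes the TensorBoard callback write a point every 50
# batches, but the values it writes are still Keras's running averages,
# accumulated since the start of the current epoch.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs",
                                                update_freq=50)

model.fit(train_dataset, epochs=10, callbacks=[tensorboard_cb])
```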
It seems this phenomenon comes from the model having high batch-to-batch variance in accuracy and loss. This becomes clear if I plot the actual metrics per step, rather than the running average over the epoch:
Here you can see how widely the model varies from batch to batch. (This graph covers just one epoch, but the point stands.)
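For completeness, one way to log true per-batch values (instead of the running averages Keras reports) is a custom training loop that writes raw metrics with `tf.summary`. This is just a sketch under the same assumptions as above (binary classification, toy data standing in for the real pipeline):

```python
import tensorflow as tf

# Toy stand-ins so the sketch runs; swap in the real model and pipeline.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
train_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([1024, 20]),
     tf.cast(tf.random.uniform([1024, 1]) > 0.5, tf.float32)),
).batch(32)

loss_fn = tf.keras.losses.BinaryCrossentropy()
optimizer = tf.keras.optimizers.Adam()
writer = tf.summary.create_file_writer("logs/per_batch")

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        preds = model(x, training=True)
        loss = loss_fn(y, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    # Accuracy of this batch alone; nothing is accumulated across batches.
    acc = tf.reduce_mean(tf.cast(tf.equal(y, tf.round(preds)), tf.float32))
    return loss, acc

step = 0
for epoch in range(3):
    for x, y in train_dataset:
        loss, acc = train_step(x, y)
        with writer.as_default():
            # Raw per-batch curves show the variance directly and have no
            # reset artifact at epoch boundaries.
            tf.summary.scalar("batch_loss", loss, step=step)
            tf.summary.scalar("batch_accuracy", acc, step=step)
        step += 1
```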
Since the reported metrics are running averages that reset at each epoch, the average at the beginning of the next epoch is computed from only a handful of highly variable batches, so it is very likely to be lower than the previous epoch's final average. This produces a dramatic drop in the running-average value, illustrated in red below:
If you imagine the discontinuities in the red graph as epoch transitions, you can see why the phenomenon you describe in the question appears.
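The reset-and-recompute effect is easy to reproduce with a toy simulation; here is a rough sketch with synthetic per-batch accuracies (an upward trend plus large noise, mimicking the variance above):

```python
import numpy as np

rng = np.random.default_rng(0)
batches_per_epoch, num_epochs = 1000, 5

# Simulated per-batch accuracy: slow upward trend + large batch-to-batch noise.
trend = np.linspace(0.5, 0.9, batches_per_epoch * num_epochs)
per_batch = np.clip(trend + rng.normal(0, 0.15, trend.shape), 0, 1)

# Running mean that resets at the start of every epoch (what Keras reports).
running = []
for e in range(num_epochs):
    chunk = per_batch[e * batches_per_epoch:(e + 1) * batches_per_epoch]
    running.append(np.cumsum(chunk) / np.arange(1, batches_per_epoch + 1))
running = np.concatenate(running)

# At each boundary the running mean resets, so the first few batches of the
# new epoch dominate it; their noise can easily produce a visible drop.
for e in range(1, num_epochs):
    i = e * batches_per_epoch
    print(f"epoch boundary {e}: running avg "
          f"{running[i - 1]:.3f} -> {running[i]:.3f}")
```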
TL;DR: the model's output varies widely from batch to batch.