How to accumulate gradients in TensorFlow 2.0?

Posted by debugcn in Dev

Nagabhushan S N

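I am training a model with TensorFlow 2.0. The images in my training set have different resolutions. The model I have built can handle variable resolution (convolutional layers followed by global averaging). My training set is very small, and I want to use the full training set in a single batch.

Since my images have different resolutions, I can't use model.fit(). So I plan to pass each sample through the network individually, accumulate the errors/gradients, and then apply one optimizer step. I can compute the loss values, but I don't know how to accumulate the losses/gradients. How can I accumulate the losses/gradients and then apply a single optimizer step?

Code: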

for i in range(num_epochs):
    print(f'Epoch: {i + 1}')
    total_loss = 0
    for j in tqdm(range(num_samples)):
        sample = samples[j]
        with tf.GradientTape() as tape:
            prediction = self.model(sample)
            loss_value = self.loss_function(y_true=labels[j], y_pred=prediction)
        gradients = tape.gradient(loss_value, self.model.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.model.trainable_variables))
        total_loss += loss_value

    epoch_loss = total_loss / num_samples
    print(f'Epoch loss: {epoch_loss}')

Ramiro RC

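If I understand correctly from this statement:

How can I accumulate the losses/gradients and then apply a single optimizer step?

@Nagabhushan is trying to accumulate gradients and then apply the optimization to the (averaged) accumulated gradient. The answer provided by @TensorflowSupport does not address this. To perform the optimization only once while accumulating the gradients from several tapes, you can do the following: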

for i in range(num_epochs):
    print(f'Epoch: {i + 1}')
    total_loss = 0

    # get trainable variables
    train_vars = self.model.trainable_variables
    # Create empty gradient list (not a tf.Variable list)
    accum_gradient = [tf.zeros_like(this_var) for this_var in train_vars]

    for j in tqdm(range(num_samples)):
        sample = samples[j]
        with tf.GradientTape() as tape:
            prediction = self.model(sample)
            loss_value = self.loss_function(y_true=labels[j], y_pred=prediction)
        total_loss += loss_value

        # get gradients of this tape
        gradients = tape.gradient(loss_value, train_vars)
        # Accumulate the gradients
        accum_gradient = [(acum_grad+grad) for acum_grad, grad in zip(accum_gradient, gradients)]


    # Now, after executing all the tapes you needed, we apply the optimization step
    # (but first we take the average of the gradients)
    accum_gradient = [this_grad/num_samples for this_grad in accum_gradient]
    # apply optimization step
    self.optimizer.apply_gradients(zip(accum_gradient, train_vars))

    epoch_loss = total_loss / num_samples
    print(f'Epoch loss: {epoch_loss}')
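
To see why the accumulated gradients are averaged before the single optimizer step, here is a small worked example in plain Python. It is only a sketch: the 1-D model y = w * x, the data, and the function names are made up for illustration and do not appear in the code above. It checks that the mean of the per-sample gradients matches the gradient of the mean loss over the whole set, which is what one full-batch step would use.

```python
# Hypothetical 1-D model y = w * x with a squared-error loss per sample.
def sample_gradient(w, x, y):
    """Analytic d/dw of (w * x - y) ** 2 for one sample."""
    return 2.0 * (w * x - y) * x


def accumulated_mean_gradient(w, xs, ys):
    """Accumulate per-sample gradients, then average them."""
    accum = 0.0
    for x, y in zip(xs, ys):
        accum += sample_gradient(w, x, y)
    return accum / len(xs)


def full_batch_gradient(w, xs, ys, eps=1e-6):
    """Central finite difference of the mean loss over the whole set."""
    def mean_loss(w_):
        return sum((w_ * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return (mean_loss(w + eps) - mean_loss(w - eps)) / (2 * eps)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]
w = 0.5

g_accum = accumulated_mean_gradient(w, xs, ys)  # (-3 - 12 - 33) / 3 = -16
g_batch = full_batch_gradient(w, xs, ys)        # numerically close to -16
w_new = w - 0.1 * g_accum                       # one step with learning rate 0.1
```

Without the division by the number of samples, the step would be scaled by the set size, which acts like a larger learning rate.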

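You should avoid using tf.Variable() inside the training loop, because it produces errors when you try to execute the code as a graph. If you use tf.Variable() inside your training function and then decorate it with "@tf.function" or apply "tf.function(my_train_fcn)" to obtain a graph function (i.e. for improved performance), the execution will raise an error. This happens because tracing the tf.Variable call behaves differently from eager execution: tracing re-uses the variable, whereas eager execution creates a new one on every call. You can find more information in the TensorFlow help pages.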
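
One possible way to combine this advice with gradient accumulation is sketched below. This is an illustration under stated assumptions, not the original poster's code: the small model, the random data, and the names sample_loss_and_grads and train_one_epoch are all hypothetical. Only the per-sample forward/backward pass is wrapped in tf.function, and it returns plain tensors, so no tf.Variable is created while tracing. The input_signature with None for height and width is an assumption that keeps images of different resolutions from each triggering a new trace.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in model: convolution + global average pooling,
# so any height and width is accepted.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, None, 3)),
    tf.keras.layers.Conv2D(4, 3, padding="same", activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1),
])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)


@tf.function(input_signature=[
    tf.TensorSpec(shape=[1, None, None, 3], dtype=tf.float32),
    tf.TensorSpec(shape=[1, 1], dtype=tf.float32),
])
def sample_loss_and_grads(image, label):
    # Graph-compiled step: produces tensors only, creates no tf.Variable.
    with tf.GradientTape() as tape:
        prediction = model(image, training=True)
        loss = loss_fn(label, prediction)
    return loss, tape.gradient(loss, model.trainable_variables)


def train_one_epoch(samples, labels):
    train_vars = model.trainable_variables
    # Accumulators are ordinary tensors, not tf.Variable objects.
    accum = [tf.zeros_like(v) for v in train_vars]
    total_loss = 0.0
    for image, label in zip(samples, labels):
        loss, grads = sample_loss_and_grads(image, label)
        accum = [a + g for a, g in zip(accum, grads)]
        total_loss += float(loss)
    # Average the accumulated gradients, then take a single optimizer step.
    mean_grads = [a / len(samples) for a in accum]
    optimizer.apply_gradients(zip(mean_grads, train_vars))
    return total_loss / len(samples)


# Two made-up samples with different resolutions.
samples = [tf.random.uniform([1, 8, 8, 3]), tf.random.uniform([1, 12, 10, 3])]
labels = [tf.constant([[1.0]]), tf.constant([[0.0]])]
first_loss = train_one_epoch(samples, labels)
```

The accumulation loop itself stays in eager Python here, which keeps the sketch simple; whether compiling the whole epoch would be faster depends on the model and the data.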


Edited on 2021-04-2
