Training a model in TensorFlow using queues

Posted by Thibaut Loiseleur in Dev

Thibaut Loiseleur

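By following and adapting the TensorFlow tutorials, I designed a neural network in TensorFlow for a regression problem. However, because of the structure of my problem (~300,000 data points and the use of the costly FTRLOptimizer), it took too long to run, even on my 32-CPU machine (I have no GPU).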

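According to this comment, and a quick check with htop, some of my operations appear to be single-threaded, and the likely culprit is feed_dict.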

So, as advised here, I tried to use queues to multi-thread my program.
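For readers who have not used this pattern, the general idea can be sketched with the Python standard library alone. This is only an analogy, not TensorFlow code: `buffer`, `producer`, `NUM_PRODUCERS`, and `ITEMS_PER_PRODUCER` are hypothetical names chosen for illustration, and plain Python threads do not by themselves provide CPU parallelism. Several threads push (x, y) pairs into a shared queue while the main loop pulls fixed-size batches from it.

```python
import queue
import threading

BATCH_SIZE = 5
NUM_PRODUCERS = 4        # hypothetical value, chosen to keep the sketch small
ITEMS_PER_PRODUCER = 10  # hypothetical value

# Bounded buffer shared by the producer threads and the consumer loop.
buffer = queue.Queue(maxsize=64)

def producer(worker_id):
    """Push (x, y) pairs with y = x + 1 into the shared buffer."""
    for i in range(ITEMS_PER_PRODUCER):
        x = float(worker_id * ITEMS_PER_PRODUCER + i)
        buffer.put((x, x + 1.0))  # blocks while the buffer is full

threads = [threading.Thread(target=producer, args=(w,))
           for w in range(NUM_PRODUCERS)]
for t in threads:
    t.start()

# Consumer: pull fixed-size batches, as a training loop would.
total = NUM_PRODUCERS * ITEMS_PER_PRODUCER
batches = []
for _ in range(total // BATCH_SIZE):
    batch = [buffer.get() for _ in range(BATCH_SIZE)]  # blocks until data arrives
    batches.append(batch)

for t in threads:
    t.join()
```

Each element placed in the queue is a complete (x, y) pair, so a batch taken from it always contains matching inputs and targets, whatever order the producer threads ran in.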

I wrote a simple script that trains a model using a queue, as follows:

import numpy as np
import tensorflow as tf
import threading

#Function for enqueueing in parallel my data
def enqueue_thread():
    sess.run(enqueue_op, feed_dict={x_batch_enqueue: x, y_batch_enqueue: y})

#Set the number of couples (x, y) I use for "training" my model
BATCH_SIZE = 5

#Generate my data where y=x+1+little_noise
x = np.random.randn(10, 1).astype('float32')
y = x+1+np.random.randn(10, 1)/100

#Create the variables for my model y = x*W+b, then W and b should both converge to 1.
W = tf.get_variable('W', shape=[1, 1], dtype='float32')
b = tf.get_variable('b', shape=[1, 1], dtype='float32')

#Prepare the placeholders for enqueueing
x_batch_enqueue = tf.placeholder(tf.float32, shape=[None, 1])
y_batch_enqueue = tf.placeholder(tf.float32, shape=[None, 1])

#Create the queue
q = tf.RandomShuffleQueue(capacity=2**20, min_after_dequeue=BATCH_SIZE, dtypes=[tf.float32, tf.float32], seed=12, shapes=[[1], [1]])

#Enqueue operation
enqueue_op = q.enqueue_many([x_batch_enqueue, y_batch_enqueue])

#Dequeue operation
x_batch, y_batch = q.dequeue_many(BATCH_SIZE)

#Prediction with linear model + bias
y_pred = tf.add(tf.multiply(x_batch, W), b)

#MAE cost function
cost = tf.reduce_mean(tf.abs(y_batch-y_pred))

learning_rate = 1e-3
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
available_threads = 1024

#Feed the queue
for i in range(available_threads):
    threading.Thread(target=enqueue_thread).start()

#Train the model
for step in range(1000):
    _, cost_step = sess.run([train_op, cost])
    print(cost_step)
Wf=sess.run(W)
bf=sess.run(b)

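This code does not work, because every time I call x_batch, a y_batch is dequeued as well, and vice versa. As a result, I am not comparing the features with their corresponding "results".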

Is there a simple way to avoid this problem?

Thibaut Loiseleur

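My mistake, everything works fine. I was misled because I was estimating my performance on a different batch at every step of the algorithm, and also because my model was too complicated for a dummy model (I should have used something like y = W * x or y = x + b). Then, when I tried to print values in the console, I ran sess.run several times on different variables, and the results were obviously inconsistent.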
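The source of the confusion can be illustrated with a small standard-library sketch. It is an analogy under stated assumptions, not the TensorFlow implementation: `make_pair_queue` and `dequeue_many` are hypothetical helpers, and a FIFO queue stands in for the shuffling queue used in the question. Fetching x and y in one call keeps them paired, while two separate calls each consume their own batch.

```python
import queue

def make_pair_queue(n):
    """Hypothetical stand-in for the queue in the question: holds (x, y) pairs."""
    q = queue.Queue()
    for i in range(n):
        x = float(i)
        q.put((x, x + 1.0))  # y = x + 1, noise omitted so pairing is easy to check
    return q

def dequeue_many(q, batch_size):
    """Remove batch_size pairs and return them as two aligned lists."""
    pairs = [q.get() for _ in range(batch_size)]
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    return xs, ys

def is_paired(xs, ys):
    return all(y == x + 1.0 for x, y in zip(xs, ys))

q = make_pair_queue(15)

# One call: x and y come from the same dequeue, so they stay paired.
xs, ys = dequeue_many(q, 5)
paired_in_one_call = is_paired(xs, ys)

# Two separate calls: each one triggers its own dequeue, so the x values
# come from one batch and the y values from the next.
xs_a, _ = dequeue_many(q, 5)
_, ys_b = dequeue_many(q, 5)
paired_across_calls = is_paired(xs_a, ys_b)

print(paired_in_one_call, paired_across_calls)
```

This mirrors the behaviour described in the answer: values fetched together in a single sess.run call are consistent with each other, while values inspected through several separate sess.run calls come from different batches.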


Edited on 2021-03-3
