The accuracy reported for my TensorFlow network does not seem to reflect its actual ability to predict data

纳德尼

I have a dataset stored in a variable named data, in this form:

data = [
    {'index': 123,
     'balance': [],
     'probaility': 0.89,
     'failed': True,
     'rank': 'A'},
    {'index': 50234,
     'balance': [],
     'probaility': 0.45,
     'failed': False,
     'rank': 'B'}]

where data[i]['balance'] is a list of 44 integers and data has 50000 elements.

I want my network to predict 'rank' using 'balance' as input. This is the code I use to train and test the network:

import tensorflow as tf
import numpy as np
import multiprocessing as multip

# this labels data so that a firm in class A has label [1, 0, 0, 0, 0, 0, 0], a firm in
# class B [0, 1, 0, 0, 0, 0, 0] and so on
def calc_label(data):
    label = [0, 0, 0, 0, 0, 0, 0]
    if data['rank'] == 'A':
        label[0] = 1
    elif data['rank'] == 'B':
        label[1] = 1
    elif data['rank'] == 'C':
        label[2] = 1
    elif data['rank'] == 'D':
        label[3] = 1
    elif data['rank'] == 'E':
        label[4] = 1
    elif data['rank'] == 'F':
        label[5] = 1
    elif data['rank'] == 'Def':
        label[6] = 1
    return label


data = [
    {'index': 123,
     'balance': [],
     'probaility': 0.89,
     'failed': True,
     'rank': 'A'},
    {'index': 50234,
     'balance': [],
     'probaility': 0.45,
     'failed': False,
     'rank': 'B'}]

features = [x['balance'] for x in data]
labels = [calc_label(x) for x in data]

train_size = int(len(labels) * 0.9)
train_y = labels[:train_size]
test_y = labels[train_size:]
train_x = features[:train_size]
test_x = features[train_size:]

classes_n = len(labels[0])
nodes_per_layer = [100, 100]
hidden_layers_n = len(nodes_per_layer)
batch_size = 50000
epochs = 500
print_step = 50
saving_step = 100

x = tf.placeholder('float', [None, len(features[0])])
y = tf.placeholder('float', [None, classes_n])

current_epoch = tf.Variable(1)

layers = [{'weights': tf.Variable(tf.random_normal([len(features[0]), nodes_per_layer[0]])),
           'biases': tf.Variable(tf.random_normal([nodes_per_layer[0]]))}]

for i in range(1, hidden_layers_n):
    layers.append({'weights': tf.Variable(tf.random_normal([nodes_per_layer[i - 1], nodes_per_layer[i]])),
                   'biases': tf.Variable(tf.random_normal([nodes_per_layer[i]]))})

output_layer = {'weights': tf.Variable(tf.random_normal([nodes_per_layer[-1], classes_n])),
                'biases': tf.Variable(tf.random_normal([classes_n]))}


def neural_network_model(data):
    l = []

    l.append(tf.add(tf.matmul(x, layers[0]['weights']), layers[0]['biases']))
    l[0] = tf.nn.relu(l[0])

    for i in range(1, hidden_layers_n):
        l.append(tf.add(tf.matmul(l[i - 1], layers[i]['weights']), layers[i]['biases']))
        l[i] = tf.nn.relu(l[i])

    output = tf.add(tf.matmul(l[hidden_layers_n - 1], output_layer['weights']), output_layer['biases'])

    return output


def train_neural_network(x):
    prediction = neural_network_model(x)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))

    optimizer = tf.train.AdamOptimizer().minimize(cost)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        epoch = 1

        print('Starting training...')
        while epoch <= epochs:
            epoch_loss = 1
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])
                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
                epoch_loss += c
                i += batch_size

            if (epoch + 1) % print_step == 0:
                print('Epoch', epoch + 1, 'out of',
                      '{} completed,'.format(epochs), 'loss:', epoch_loss)
                correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
                accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
                accuracy_number = accuracy.eval({x: test_x, y: test_y})
                accuracy_number_training_set = accuracy.eval({x: train_x, y: train_y})
                print('Train accuracy:', accuracy_number_training_set)
                print('Test accuracy:', accuracy_number)
            epoch += 1

train_neural_network(x)


# this function converts predictions expressed as numbers into the letters of the corresponding ranking
# classes, for example 0 -> A, 1 -> B, 2 -> C and so on.
def convert_prediction(value):
    predict = ''
    if value == 6:
        predict = 'Def'
    elif value == 5:
        predict = 'F'
    elif value == 4:
        predict = 'E'
    elif value == 3:
        predict = 'D'
    elif value == 2:
        predict = 'C'
    elif value == 1:
        predict = 'B'
    elif value == 0:
        predict = 'A'
    return predict


def use_neural_network(input_data):
    prediction = neural_network_model(x)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        feed_list = [(k['index'], k['balance']) for k in input_data]
        indexes = [k[0] for k in feed_list]
        predictions = sess.run(tf.argmax(prediction.eval(feed_dict={x: [k[1] for k in feed_list]}), 1))
        predictions = np.array([convert_prediction(value) for value in predictions])
        result = list(zip(indexes, predictions))
        return result

if __name__ == '__main__':

    prediction = use_neural_network(data)

    print('\nCalculating errors...')

    predictions_dict = {'A': [],
                        'B': [],
                        'C': [],
                        'D': [],
                        'E': [],
                        'F': [],
                        'Def': []}

    def create_predictions_dict(index, rank):
            for j in data:
                if j['index'] == index:
                    return index, j['rank'], rank

    np = multip.cpu_count()
    p = multip.Pool(processes=np)
    predictions_list = p.starmap(create_predictions_dict, prediction[:5000])
    p.close()
    p.join()

    for elem in predictions_list:
        predictions_dict[elem[1]].append(elem)

    def is_correct(x):
        if x[1] == x[2]:
            return 1
        else:
            return 0
    correct_guesses = sum(is_correct(x) for x in predictions_list)
    correct_ratio = correct_guesses / len(data)

    print('correct:', correct_ratio)

After 5000 epochs, this is the result I get:

Epoch 5000 out of 5000 completed, loss: 9.91669559479
Train accuracy: 0.992933
Test accuracy: 0.9686
Calculating errors...
correct: 0.02336

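What I really don't understand is how the accuracy computed by TensorFlow's built-in methods can be so high while the accuracy I compute by hand is so low. In general, when I extract data from the predictions, it seems that the higher the accuracy computed by TF, the more incorrect predictions I find.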

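This makes me think that perhaps the network is being trained to make its guesses as wrong as possible rather than as correct as possible. However, I can't see where the problem would be. Maybe in the cost function?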

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))

--- Edit ---

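As suggested in the answer, I have corrected how the variables are restored in the test case, but my accuracy is still low (about 0.1). Here is the updated code: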

import tensorflow as tf
import numpy as np
import multiprocessing as multip

# this labels data so that a firm in class A has label [1, 0, 0, 0, 0, 0, 0], a firm in
# class B [0, 1, 0, 0, 0, 0, 0] and so on
def calc_label(data):
    label = [0, 0, 0, 0, 0, 0, 0]
    if data['rank'] == 'A':
        label[0] = 1
    elif data['rank'] == 'B':
        label[1] = 1
    elif data['rank'] == 'C':
        label[2] = 1
    elif data['rank'] == 'D':
        label[3] = 1
    elif data['rank'] == 'E':
        label[4] = 1
    elif data['rank'] == 'F':
        label[5] = 1
    elif data['rank'] == 'Def':
        label[6] = 1
    return label


data = [
    {'index': 123,
     'balance': [],
     'probaility': 0.89,
     'failed': True,
     'rank': 'A'},
    {'index': 50234,
     'balance': [],
     'probaility': 0.45,
     'failed': False,
     'rank': 'B'}]


features_and_labels = [[x['balance'], calc_label(x)] for x in data]
features = [x[0] for x in features_and_labels]
labels = [x[1] for x in features_and_labels]

train_size = int(len(labels) * 0.9)
train_y = labels[:train_size]
test_y = labels[train_size:]
train_x = features[:train_size]
test_x = features[train_size:]

classes_n = len(labels[0])
nodes_per_layer = [100, 100]
hidden_layers_n = len(nodes_per_layer)
batch_size = 50000
epochs = 1000
print_step = 50
saving_step = 100

x = tf.placeholder('float', [None, len(features[0])])
y = tf.placeholder('float', [None, classes_n])

current_epoch = tf.Variable(1)

layers = [{'weights': tf.Variable(tf.random_normal([len(features[0]), nodes_per_layer[0]])),
           'biases': tf.Variable(tf.random_normal([nodes_per_layer[0]]))}]

for i in range(1, hidden_layers_n):
    layers.append({'weights': tf.Variable(tf.random_normal([nodes_per_layer[i - 1], nodes_per_layer[i]])),
                   'biases': tf.Variable(tf.random_normal([nodes_per_layer[i]]))})

output_layer = {'weights': tf.Variable(tf.random_normal([nodes_per_layer[-1], classes_n])),
                'biases': tf.Variable(tf.random_normal([classes_n]))}


def neural_network_model(data):
    l = []

    l.append(tf.add(tf.matmul(x, layers[0]['weights']), layers[0]['biases']))
    l[0] = tf.nn.relu(l[0])

    for i in range(1, hidden_layers_n):
        l.append(tf.add(tf.matmul(l[i - 1], layers[i]['weights']), layers[i]['biases']))
        l[i] = tf.nn.relu(l[i])

    output = tf.add(tf.matmul(l[hidden_layers_n - 1], output_layer['weights']), output_layer['biases'])

    return output


saver = tf.train.Saver()
tf_log = 'tf.log'


def train_neural_network(x):
    prediction = neural_network_model(x)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))

    optimizer = tf.train.AdamOptimizer().minimize(cost)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        try:
            epoch = int(open(tf_log, 'r').read().split('\n')[-2]) + 1
            print('Starting epoch:', epoch)
        except:
            epoch = 1

        if epoch != 1:
            saver.restore(sess, "model.ckpt")

        print('Starting training...')
        while epoch <= epochs:
            epoch_loss = 1
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])
                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
                epoch_loss += c
                i += batch_size

            if (epoch + 1) % print_step == 0:
                print('Epoch', epoch + 1, 'out of',
                      '{} completed,'.format(epochs), 'loss:', epoch_loss)
                correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
                accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
                accuracy_number = accuracy.eval({x: test_x, y: test_y})
                accuracy_number_training_set = accuracy.eval({x: train_x, y: train_y})
                print('Train accuracy:', accuracy_number_training_set)
                print('Test accuracy:', accuracy_number)

            if epoch == 1:
                saver.save(sess, "model.ckpt")
            if (epoch + 1) % saving_step == 0:
                saver.save(sess, "model.ckpt")
                # print('Epoch', epoch, 'completed out of', epochs, 'loss:', epoch_loss)
                with open(tf_log, 'a') as f:
                    f.write(str(epoch) + '\n')
            epoch += 1

train_neural_network(x)

# this function converts predictions expressed as numbers into the letters of the corresponding ranking
# classes, for example 0 -> A, 1 -> B, 2 -> C and so on.
def convert_prediction(value):
    predict = ''
    if value == 6:
        predict = 'Def'
    elif value == 5:
        predict = 'F'
    elif value == 4:
        predict = 'E'
    elif value == 3:
        predict = 'D'
    elif value == 2:
        predict = 'C'
    elif value == 1:
        predict = 'B'
    elif value == 0:
        predict = 'A'
    return predict


def use_neural_network(input_data):
    prediction = neural_network_model(x)

    with tf.Session() as sess:
        for word in ['weights', 'biases']:
            output_layer[word].initializer.run()
            for variable in layers:
                variable[word].initializer.run()
        saver.restore(sess, "model.ckpt")
        feed_list = [(k['index'], k['balance']) for k in input_data]
        indexes = [k[0] for k in feed_list]
        predictions = sess.run(tf.argmax(prediction.eval(feed_dict={x: [k[1] for k in feed_list]}), 1))
        predictions = np.array([convert_prediction(value) for value in predictions])
        result = list(zip(indexes, predictions))
        return result

if __name__ == '__main__':

    prediction = use_neural_network(data)

    print('\nCalculating errors...')

    predictions_dict = {'A': [],
                        'B': [],
                        'C': [],
                        'D': [],
                        'E': [],
                        'F': [],
                        'Def': []}

    def create_predictions_dict(index, rank):
            for j in data:
                # checks which predictions are made to which firms and adds them to predictions_dict
                if j['index'] == index:
                    return index, j['rank'], rank

    np = multip.cpu_count()
    p = multip.Pool(processes=np)
    predictions_list = p.starmap(create_predictions_dict, prediction[:5000])
    p.close()
    p.join()

    for elem in predictions_list:
        predictions_dict[elem[1]].append(elem)

    def is_correct(x):
        if x[1] == x[2]:
            return 1
        else:
            return 0
    correct_guesses = sum(is_correct(x) for x in predictions_list)
    correct_ratio = correct_guesses / len(data)

    print('correct:', correct_ratio)
菲洛

In your code:

def use_neural_network(input_data):
    prediction = neural_network_model(x)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer()) #<<<<<<<<<<<<<<<<<<

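tf.global_variables_initializer initializes all the variables in the network, which means it wipes out any training that has already been done. What you want to do instead is save the network's weights to a checkpoint at the end of training, then load the learned weights back into the network's variables by calling restore() on a tf.train.Saver().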
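The save-then-restore pattern described above can be sketched roughly as follows. This is a minimal illustration, not the question's network: the single variable w, the tf.assign call standing in for training, and the temporary checkpoint path are all hypothetical, and it assumes a TensorFlow build that provides the tensorflow.compat.v1 API.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or TF 2.x with the v1 compatibility API

tf.disable_eager_execution()

# Hypothetical toy model: one weight vector standing in for the question's layers.
w = tf.Variable(tf.random_normal([3]), name="w")
# Stand-in for a training step: it simply sets w to a known value.
train_step = tf.assign(w, tf.ones([3]))
# The Saver is created after the variables it should track exist.
saver = tf.train.Saver()

# Hypothetical checkpoint location in a temporary directory.
ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# Training session: initialize once, "train", then save the learned values.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_step)
    trained_w = sess.run(w)
    saver.save(sess, ckpt_path)

# Prediction session: restore the checkpoint instead of re-initializing.
with tf.Session() as sess:
    saver.restore(sess, ckpt_path)
    restored_w = sess.run(w)

# For contrast: running the initializer again discards the trained values.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    reinit_w = sess.run(w)

print("trained: ", trained_w)
print("restored:", restored_w)
print("re-init: ", reinit_w)
```

The restored values match the trained ones, while the re-initialized session starts again from random weights, which is the situation the original use_neural_network ends up in.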

Note that the TensorFlow website has an in-depth tutorial on how to save and restore network weights.
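One more thing that may be worth checking in the manual calculation: the question's code evaluates only prediction[:5000] but divides correct_guesses by len(data). If data really holds all 50000 records, that alone would shrink the reported ratio by a factor of ten. A small sketch of a check that divides by the number of evaluated pairs instead is shown below; the helper name manual_accuracy and the sample tuples are hypothetical, and the tuples mimic the (index, true rank, predicted rank) entries of predictions_list.

```python
def manual_accuracy(pairs):
    """Return the fraction of (index, true_rank, predicted_rank) tuples that match.

    The denominator is the number of pairs actually evaluated,
    not the size of the full dataset.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(1 for _, true_rank, predicted_rank in pairs
                  if true_rank == predicted_rank)
    return correct / len(pairs)


# Hypothetical sample in the same shape as predictions_list.
sample = [(123, 'A', 'A'), (50234, 'B', 'C'), (7, 'C', 'C'), (8, 'Def', 'A')]
sample_accuracy = manual_accuracy(sample)
print('correct:', sample_accuracy)  # 2 of 4 match -> 0.5
```

With this denominator, the hand-computed figure becomes directly comparable to the accuracy TensorFlow prints for the same set of examples.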
