PyTorch ValueError: Expected target size (2, 13), got torch.Size([2]) when calling CrossEntropyLoss


Christian Doucette

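I am trying to train a PyTorch LSTM network, but I get ValueError: Expected target size (2, 13), got torch.Size([2]) when I try to compute CrossEntropyLoss. I think I need to change the shape somewhere, but I can't work out where.

Here is my network definition: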

import torch
import torch.nn as nn
from torch.utils.data import DataLoader

class LSTM(nn.Module):

    def __init__(self, vocab_size, embedding_dim, hidden_dim, n_layers, drop_prob=0.2):
        super(LSTM, self).__init__()

        # network size parameters
        self.n_layers = n_layers
        self.hidden_dim = hidden_dim
        self.vocab_size = vocab_size
        self.embedding_dim = embedding_dim


        # the layers of the network
        self.embedding = nn.Embedding(self.vocab_size, self.embedding_dim)
        self.lstm = nn.LSTM(self.embedding_dim, self.hidden_dim, self.n_layers, dropout=drop_prob, batch_first=True)
        self.dropout = nn.Dropout(drop_prob)
        self.fc = nn.Linear(self.hidden_dim, self.vocab_size)



    def forward(self, input, hidden):
        # Perform a forward pass of the model on some input and hidden state.
        batch_size = input.size(0)
        print(f'batch_size: {batch_size}')

        print(f'Input shape: {input.shape}')

        # pass through embeddings layer
        embeddings_out = self.embedding(input)
        print(f'Shape after Embedding: {embeddings_out.shape}')


        # pass through LSTM layers
        lstm_out, hidden = self.lstm(embeddings_out, hidden)
        print(f'Shape after LSTM: {lstm_out.shape}')


        # pass through dropout layer
        dropout_out = self.dropout(lstm_out)
        print(f'Shape after Dropout: {dropout_out.shape}')


        # pass through fully connected layer
        fc_out = self.fc(dropout_out)
        print(f'Shape after FC: {fc_out.shape}')

        # return output and hidden state
        return fc_out, hidden


    def init_hidden(self, batch_size):
        # Initializes hidden state.
        # Create two new tensors with sizes n_layers x batch_size x hidden_dim,
        # initialized to zero, for hidden state and cell state of LSTM


        hidden = (torch.zeros(self.n_layers, batch_size, self.hidden_dim), torch.zeros(self.n_layers, batch_size, self.hidden_dim))
        return hidden

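I've added print statements showing the shape at each point in the network. My data is in a TensorDataset called training_dataset with two attributes, features and labels. The features have shape torch.Size([97, 3]) and the labels have shape torch.Size([97]).

This is the code for training the network: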
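For anyone who wants to reproduce this, a stand-in dataset with the same shapes can be sketched as follows. The random values and the seed are assumptions for illustration, not the real data; only the shapes and the vocabulary size of 13 come from the post.

```python
import torch
from torch.utils.data import TensorDataset

torch.manual_seed(0)  # arbitrary seed, only for repeatability

vocab_size = 13  # same value as the size parameter in the training code

# Random token ids standing in for the real data (assumption):
# 97 sequences of length 3, and one class label per sequence.
features = torch.randint(0, vocab_size, (97, 3))
labels = torch.randint(0, vocab_size, (97,))

training_dataset = TensorDataset(features, labels)

print(features.shape)         # torch.Size([97, 3])
print(labels.shape)           # torch.Size([97])
print(len(training_dataset))  # 97
```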

# Size parameters
vocab_size = 13
embedding_dim = 256
hidden_dim = 256       
n_layers = 2     

# Training parameters
epochs = 3
learning_rate = 0.001
clip = 1
batch_size = 2


training_loader = DataLoader(training_dataset, batch_size=batch_size, drop_last=True, shuffle=True)

net = LSTM(vocab_size, embedding_dim, hidden_dim, n_layers)
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
loss_func = torch.nn.CrossEntropyLoss()

net.train()
for e in range(epochs):
    print(f'Epoch {e}')
    print(batch_size)
    hidden = net.init_hidden(batch_size)

    # loops through each batch
    for features, labels in training_loader:

        # detaches the hidden state from the previous batch's graph
        hidden = tuple([each.detach() for each in hidden])
        net.zero_grad()

        # computes gradient of loss from backprop
        output, hidden = net(features, hidden)
        loss = loss_func(output, labels)
        loss.backward()

        # using clipping to avoid exploding gradient
        nn.utils.clip_grad_norm_(net.parameters(), clip)
        optimizer.step()

When I try to run the training, I get the following error:

Traceback (most recent call last):
  File "train.py", line 75, in <module>
    loss = loss_func(output, labels)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 947, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/usr/local/lib/python3.8/site-packages/torch/nn/functional.py", line 2422, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/usr/local/lib/python3.8/site-packages/torch/nn/functional.py", line 2227, in nll_loss
    raise ValueError('Expected target size {}, got {}'.format(
ValueError: Expected target size (2, 13), got torch.Size([2])

Here is also the output of the print statements:

batch_size: 2
Input shape: torch.Size([2, 3])
Shape after Embedding: torch.Size([2, 3, 256])
Shape after LSTM: torch.Size([2, 3, 256])
Shape after Dropout: torch.Size([2, 3, 256])
Shape after FC: torch.Size([2, 3, 13])

There is some kind of shape error, but I can't figure out where. Any help would be appreciated. If it's relevant, I'm using Python 3.8.5 and PyTorch 1.6.0.

Christian Doucette

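For anyone who runs into this problem later: I asked the same question on the PyTorch forums and got a great answer from ptrblock.

The problem is that my LSTM layer (with batch_first=True) returns an output for every element of the input sequence, so after the fully connected layer the output has size (batch_size, sequence_size, vocab_size). However, I only want the output for the last element of the input sequence, which has size (batch_size, vocab_size).

So, in my forward function, instead of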

# pass through LSTM layers
lstm_out, hidden = self.lstm(embeddings_out, hidden)

it should be

# pass through LSTM layers
lstm_out, hidden = self.lstm(embeddings_out, hidden)

# slice lstm_out to just get output of last element of the input sequence
lstm_out = lstm_out[:, -1]

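This resolved the shape problem. The error message is a bit misleading, since it says the target has the wrong shape, when really it is the output that has the wrong shape.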
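To see where the (2, 13) in the message comes from: CrossEntropyLoss treats dimension 1 of its input as the class dimension, so an output of shape (2, 3, 13) is read as a batch of 2 with 3 classes and 13 extra positions, and the loss then expects a target of shape (2, 13). A minimal sketch with random tensors standing in for the network output (an assumption, not the real model) reproduces the mismatch and shows that keeping only the last time step resolves it. Newer PyTorch releases may raise RuntimeError rather than ValueError here, so the sketch catches both.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

batch_size, seq_len, vocab_size = 2, 3, 13
loss_func = nn.CrossEntropyLoss()

# Stand-ins for the FC output and the labels (random values, assumption).
fc_out = torch.randn(batch_size, seq_len, vocab_size)  # (2, 3, 13)
labels = torch.randint(0, vocab_size, (batch_size,))   # (2,)

# Dimension 1 is taken as the class dimension, so the shapes do not line up.
try:
    loss_func(fc_out, labels)
    shape_error = None
except (ValueError, RuntimeError) as err:
    shape_error = err
print(f'Error with per-step output: {shape_error}')

# Keeping only the last time step gives (batch_size, vocab_size).
last_step = fc_out[:, -1]
loss = loss_func(last_step, labels)
print(last_step.shape)  # torch.Size([2, 13])
print(loss.dim())       # 0, i.e. a scalar loss
```

The sketch slices after the fully connected layer only to keep it short. Slicing lstm_out before the dropout and FC layers, as in the fix above, yields the same (batch_size, vocab_size) output shape.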


Edited on 2021-04-05
