Implementing backpropagation in a convolutional layer using NumPy

Shachaf Zohar

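I'm having trouble implementing backpropagation for Conv2D using NumPy. The input has shape [channels, height, width]. The filters have shape [n_filters, channels, height, width]. This is what I do in the forward pass: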

ch, h, w = x.shape
Hout = (h - self.filters.shape[-2]) // self.stride + 1
Wout = (w - self.filters.shape[-1]) // self.stride + 1

a = np.lib.stride_tricks.as_strided(x, (Hout, Wout, ch, self.filters.shape[2], self.filters.shape[3]),
                                    (x.strides[1] * self.stride, x.strides[2] * self.stride) + (
                                    x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, self.filters)
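
As a sanity check on the strided view above, here is a minimal sketch that wraps the same forward pass in a standalone function and compares it against plain loops. The function names `conv_forward` and `conv_forward_naive` are hypothetical, the layer's `self.filters` and `self.stride` are passed in as ordinary arguments, and the bias is left out, as in the snippet above.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def conv_forward(x, filters, stride=1):
    """Strided-view forward pass (hypothetical standalone version, no bias)."""
    ch, h, w = x.shape
    n_filt, _, fh, fw = filters.shape
    out_h = (h - fh) // stride + 1
    out_w = (w - fw) // stride + 1
    s0, s1, s2 = x.strides
    # windows[i, j, c, k, l] = x[c, i * stride + k, j * stride + l]
    windows = as_strided(x,
                         shape=(out_h, out_w, ch, fh, fw),
                         strides=(s1 * stride, s2 * stride, s0, s1, s2),
                         writeable=False)
    # Contract every window with every filter over (c, k, l).
    return np.einsum('ijckl,ackl->aij', windows, filters)


def conv_forward_naive(x, filters, stride=1):
    """Reference implementation with explicit loops."""
    ch, h, w = x.shape
    n_filt, _, fh, fw = filters.shape
    out_h = (h - fh) // stride + 1
    out_w = (w - fw) // stride + 1
    out = np.zeros((n_filt, out_h, out_w))
    for a in range(n_filt):
        for i in range(out_h):
            for j in range(out_w):
                ys, xs = i * stride, j * stride
                out[a, i, j] = np.sum(x[:, ys:ys + fh, xs:xs + fw] * filters[a])
    return out


rng = np.random.default_rng(0)
x_demo = rng.standard_normal((3, 8, 7))
f_demo = rng.standard_normal((4, 3, 3, 2))
print(np.allclose(conv_forward(x_demo, f_demo, 2),
                  conv_forward_naive(x_demo, f_demo, 2)))
```

Passing `writeable=False` keeps the overlapping view read-only, which guards against accidental writes through aliased memory.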

I tried this to compute dF, but it doesn't work:

F = np.lib.stride_tricks.as_strided(x, (n_filt, size_filt, size_filt, dim_filt, size_filt, size_filt),
                                    (x.strides[0], x.strides[1] * self.stride, x.strides[2] * self.stride) + (
                                    x.strides[0], x.strides[1], x.strides[2]))
F = np.einsum('aijckl,anm->acij', F, dA_prev)

This works fine, but it is very slow:

dA = np.zeros(shape=x.shape)  # shape: [input channels, input height, input width]
dF = np.zeros(shape=self.filters.shape)  # shape: [n_filters, channels, height, width]
dB = np.zeros(shape=self.bias.shape)  # shape: [n_filters, 1]
n_filt = self.filters.shape[0]
size_filt = self.filters.shape[2]
size_img = x.shape[1]  # assumes a square input and square filters
for filt in range(n_filt):
    y_filt = y_out = 0
    while y_filt + size_filt <= size_img:
        x_filt = x_out = 0
        while x_filt + size_filt <= size_img:
            # Filter gradient: the input patch weighted by the upstream gradient.
            dF[filt] += dA_prev[filt, y_out, x_out] * x[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt]

            # Input gradient: the filter weighted by the upstream gradient.
            dA[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt] += (
                    dA_prev[filt, y_out, x_out] * self.filters[filt])

            x_filt += self.stride
            x_out += 1

        y_filt += self.stride
        y_out += 1  # advance to the next output row
    # Bias gradient for this filter: the sum of its upstream gradient map.
    dB[filt] += np.sum(dA_prev[filt])
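
In index notation, the loops above are meant to accumulate the following sums. This is a sketch under a few naming assumptions that are not in the original code: $s$ stands for `self.stride`, $W$ for `self.filters`, $G$ for `dA_prev`, $a$ indexes filters, $c$ input channels, $(i, j)$ output positions, $(k, l)$ positions inside a filter, and $(p, q)$ positions in the input.

```latex
\begin{aligned}
dF_{a,c,k,l} &= \sum_{i,j} G_{a,i,j}\, x_{c,\; i s + k,\; j s + l} \\
dA_{c,p,q}   &= \sum_{a} \sum_{\substack{i,j,k,l \\ i s + k = p,\; j s + l = q}} G_{a,i,j}\, W_{a,c,k,l} \\
dB_{a}       &= \sum_{i,j} G_{a,i,j}
\end{aligned}
```

The first line is a strided correlation of the input with the upstream gradient. For $s = 1$ the second line becomes a full correlation of the zero-padded upstream gradient with the spatially flipped filters.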

How can I compute dF and dA efficiently?

Shachaf Zohar

I managed to find a solution. The tensordot used to compute dA takes too long, but at least it works.

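import numpy as np

as_strided = np.lib.stride_tricks.as_strided

ch_img, h, w = x.shape
n_filt, ch_filt, h_filt, w_filt = self.filters.shape
_, dA_h, dA_w = dA_prev.shape

# F[c, k, l, i, j] = x[c, i * stride + k, j * stride + l]
# The stride multiplies the output-position axes (i, j), not the filter axes (k, l).
F = as_strided(x,
               shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
               strides=(x.strides[0], x.strides[1], x.strides[2],
                        x.strides[1] * self.stride,
                        x.strides[2] * self.stride)
               )
# Contract over the output positions (i, j): result is [ch, h_filt, w_filt, n_filt].
F = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)])
dF = F.transpose((3, 0, 1, 2))

# The dA computation below is only valid for self.stride == 1; for larger
# strides dA_prev would first have to be dilated with zeros.
pad_h = dA_h - 1
pad_w = dA_w - 1
pad_filt = np.pad(self.filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)), 'constant')
# sub_windows[a, y, x, i, j, c] = pad_filt[a, c, y + i, x + j]
sub_windows = as_strided(pad_filt,
                         shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
                         strides=(pad_filt.strides[0], pad_filt.strides[2] * self.stride,
                                  pad_filt.strides[3] * self.stride, pad_filt.strides[2],
                                  pad_filt.strides[3], pad_filt.strides[1])
                         )

# Contract over (filter, i, j) against the spatially flipped upstream gradient.
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1], axes=[(0, 3, 4), (0, 1, 2)])
dA = dA.transpose((2, 0, 1))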
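
To check a vectorized backward pass against the slow loop version, a small comparison on random data is usually enough. The sketch below is not the code above. It is an alternative that also handles strides larger than 1 by dilating the upstream gradient with zeros, and it swaps `tensordot` for `einsum`. The names `conv_backward` and `conv_backward_naive` are hypothetical, and `filters` and `stride` are plain arguments rather than layer attributes.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def conv_backward(x, filters, dA_prev, stride=1):
    """Vectorized dF and dA for a 'valid' convolution layer (illustrative sketch)."""
    ch, h, w = x.shape
    n_filt, _, fh, fw = filters.shape
    _, out_h, out_w = dA_prev.shape
    s0, s1, s2 = x.strides

    # windows[c, k, l, i, j] = x[c, i * stride + k, j * stride + l]
    windows = as_strided(x, shape=(ch, fh, fw, out_h, out_w),
                         strides=(s0, s1, s2, s1 * stride, s2 * stride),
                         writeable=False)
    dF = np.einsum('cklij,aij->ackl', windows, dA_prev)

    # Dilate dA_prev (stride - 1 zeros between entries), then pad by the
    # filter size minus one so stride-1 windows give a full correlation.
    dil = np.zeros((n_filt, (out_h - 1) * stride + 1, (out_w - 1) * stride + 1))
    dil[:, ::stride, ::stride] = dA_prev
    dil = np.pad(dil, ((0, 0), (fh - 1, fh - 1), (fw - 1, fw - 1)))
    d0, d1, d2 = dil.strides
    full_h = dil.shape[1] - fh + 1  # input rows actually covered by a window
    full_w = dil.shape[2] - fw + 1  # input columns actually covered by a window
    # grad_win[a, p, q, k, l] = dil[a, p + k, q + l]
    grad_win = as_strided(dil, shape=(n_filt, full_h, full_w, fh, fw),
                          strides=(d0, d1, d2, d1, d2), writeable=False)
    dA = np.zeros(x.shape)  # rows/columns never covered keep a zero gradient
    dA[:, :full_h, :full_w] = np.einsum('apqkl,ackl->cpq', grad_win,
                                        filters[:, :, ::-1, ::-1])
    return dF, dA


def conv_backward_naive(x, filters, dA_prev, stride=1):
    """Loop reference: same accumulation as the slow version in the question."""
    n_filt, _, fh, fw = filters.shape
    _, out_h, out_w = dA_prev.shape
    dF = np.zeros(filters.shape)
    dA = np.zeros(x.shape)
    for a in range(n_filt):
        for i in range(out_h):
            for j in range(out_w):
                ys, xs = i * stride, j * stride
                dF[a] += dA_prev[a, i, j] * x[:, ys:ys + fh, xs:xs + fw]
                dA[:, ys:ys + fh, xs:xs + fw] += dA_prev[a, i, j] * filters[a]
    return dF, dA


rng = np.random.default_rng(0)
x_demo = rng.standard_normal((3, 8, 8))
f_demo = rng.standard_normal((4, 3, 3, 2))
stride_demo = 2
out_shape = (4, (8 - 3) // stride_demo + 1, (8 - 2) // stride_demo + 1)
g_demo = rng.standard_normal(out_shape)  # stands in for dA_prev

dF_fast, dA_fast = conv_backward(x_demo, f_demo, g_demo, stride_demo)
dF_ref, dA_ref = conv_backward_naive(x_demo, f_demo, g_demo, stride_demo)
print(np.allclose(dF_fast, dF_ref), np.allclose(dA_fast, dA_ref))
```

The bias gradient is not included; per filter it is just the sum of the upstream gradient map, for example `dA_prev.sum(axis=(1, 2))`.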
