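I am having trouble implementing Conv2D backpropagation with NumPy. The input has shape [channels, height, width]. The filters have shape [n_filters, channels, height, width]. This is what I do in the forward pass: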
ch, h, w = x.shape
Hout = (h - self.filters.shape[-2]) // self.stride + 1
Wout = (w - self.filters.shape[-1]) // self.stride + 1
a = np.lib.stride_tricks.as_strided(
    x,
    (Hout, Wout, ch, self.filters.shape[2], self.filters.shape[3]),
    (x.strides[1] * self.stride, x.strides[2] * self.stride)
    + (x.strides[0], x.strides[1], x.strides[2]))
out = np.einsum('ijckl,ackl->aij', a, self.filters)
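For comparison, here is a minimal standalone sketch of the same forward pass built on `np.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) instead of hand-written strides. The function name `conv_forward` and its arguments are hypothetical and only serve the illustration; it assumes no padding and no bias, as in the snippet above.

```python
import numpy as np

def conv_forward(x, filters, stride=1):
    """Valid cross-correlation of x [ch, h, w] with filters [n_filt, ch, kh, kw]."""
    kh, kw = filters.shape[-2:]
    # All stride-1 windows: shape [ch, h - kh + 1, w - kw + 1, kh, kw]
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Keep only the windows that start on a multiple of the stride
    windows = windows[:, ::stride, ::stride]
    # Contract channels and kernel positions: result is [n_filt, Hout, Wout]
    return np.einsum('cijkl,ackl->aij', windows, filters)

x = np.arange(2 * 5 * 5, dtype=float).reshape(2, 5, 5)
filters = np.ones((3, 2, 3, 3))
print(conv_forward(x, filters, stride=2).shape)  # (3, 2, 2)
```

`sliding_window_view` returns a read-only view and checks the window shape against the array, so it is harder to read out of bounds than with `as_strided`.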
I tried the following to compute dF, but it does not work:
F = np.lib.stride_tricks.as_strided(x, (n_filt, size_filt, size_filt, dim_filt, size_filt, size_filt),
(x.strides[0], x.strides[1] * self.stride, x.strides[2] * self.stride) + (
x.strides[0], x.strides[1], x.strides[2]))
F = np.einsum('aijckl,anm->acij', F, dA_prev)
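One likely reason the attempt above fails is that the einsum sums the window axes and the `dA_prev` axes independently, so output positions are never paired with the input patches that produced them. A possible alternative, sketched below, reuses the window view from the forward pass and contracts over the output positions `i, j`. The helper name `filter_grad` and its parameters are hypothetical, and `dA_prev` is assumed to have shape [n_filters, Hout, Wout].

```python
import numpy as np

def filter_grad(x, dA_prev, kh, kw, stride=1):
    """Gradient w.r.t. the filters; returns shape [n_filt, ch, kh, kw]."""
    ch, h, w = x.shape
    n_filt, Hout, Wout = dA_prev.shape
    s0, s1, s2 = x.strides
    # Same view as in the forward pass:
    # windows[i, j, c, k, l] = x[c, i * stride + k, j * stride + l]
    windows = np.lib.stride_tricks.as_strided(
        x,
        shape=(Hout, Wout, ch, kh, kw),
        strides=(s1 * stride, s2 * stride, s0, s1, s2),
        writeable=False)
    # dF[a, c, k, l] = sum over i, j of dA_prev[a, i, j] * windows[i, j, c, k, l]
    return np.einsum('ijckl,aij->ackl', windows, dA_prev)
```

Since the view is identical to the one built in the forward pass, it could also be cached there and reused here rather than rebuilt.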
The following loop-based version works, but it is very slow:
dA = np.zeros(shape=x.shape)             # shape: [input channels, input height, input width]
dF = np.zeros(shape=self.filters.shape)  # shape: [n_filters, channels, height, width]
dB = np.zeros(shape=self.bias.shape)     # shape: [n_filters, 1]
n_filt = self.filters.shape[0]
size_filt = self.filters.shape[2]        # square filters assumed
size_img = x.shape[1]                    # assumption: square input, so one size covers both axes
for filt in range(n_filt):
    y_filt = y_out = 0
    while y_filt + size_filt <= size_img:
        x_filt = x_out = 0
        while x_filt + size_filt <= size_img:
            # Accumulate the filter gradient from the input patch under the window
            dF[filt] += dA_prev[filt, y_out, x_out] * x[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt]
            # Scatter the output gradient back onto the same input patch
            dA[:, y_filt:y_filt + size_filt, x_filt:x_filt + size_filt] += (
                dA_prev[filt, y_out, x_out] * self.filters[filt])
            x_filt += self.stride
            x_out += 1
        y_filt += self.stride
        y_out += 1  # advance to the next output row
    dB[filt] += np.sum(dA_prev[filt])  # per-filter bias gradient
How can I compute dF and dA efficiently?
I managed to find a solution. The tensordot used to compute dA takes too much time, but at least it works.
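as_strided = np.lib.stride_tricks.as_strided
# Sizes inferred from the arrays (assumed; the snippet does not define these names)
ch_img, h, w = x.shape
n_filt, ch_filt, h_filt, w_filt = self.filters.shape
_, dA_h, dA_w = dA_prev.shape

# dF: F[c, k, l, i, j] = x[c, i * stride + k, j * stride + l]
F = as_strided(x,
               shape=(ch_img, h_filt, w_filt, dA_h, dA_w),
               strides=(x.strides[0], x.strides[1], x.strides[2],
                        x.strides[1] * self.stride,
                        x.strides[2] * self.stride)
               )
F = np.tensordot(F, dA_prev, axes=[(-2, -1), (1, 2)])  # -> [ch_img, h_filt, w_filt, n_filt]
dF = F.transpose((3, 0, 1, 2))                         # -> [n_filt, ch_img, h_filt, w_filt]

# dA: slide the flipped output gradient over the zero-padded filters.
# Note: this padded-filter construction is only valid for stride == 1.
pad_h = dA_h - 1
pad_w = dA_w - 1
pad_filt = np.pad(self.filters, ((0, 0), (0, 0), (pad_h, pad_h), (pad_w, pad_w)), 'constant')
sub_windows = as_strided(pad_filt,
                         shape=(n_filt, h, w, dA_h, dA_w, ch_filt),
                         strides=(pad_filt.strides[0], pad_filt.strides[2] * self.stride,
                                  pad_filt.strides[3] * self.stride, pad_filt.strides[2],
                                  pad_filt.strides[3], pad_filt.strides[1])
                         )
dA = np.tensordot(sub_windows, dA_prev[:, ::-1, ::-1], axes=[(0, 3, 4), (0, 1, 2)])  # -> [h, w, ch_filt]
dA = dA.transpose((2, 0, 1))                                                         # -> [ch_filt, h, w]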
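If the tensordot for dA is the bottleneck, or if the stride is larger than 1, one alternative is to loop only over the kernel offsets and let einsum handle everything else. This is a sketch under assumptions, not a tuned implementation: the function name `input_and_bias_grad` is hypothetical, `dA_prev` is assumed to have shape [n_filters, Hout, Wout], and the bias is assumed to be one value per filter with shape [n_filters, 1].

```python
import numpy as np

def input_and_bias_grad(x_shape, filters, dA_prev, stride=1):
    """Return (dA, dB) for a valid, unpadded convolution."""
    n_filt, ch, kh, kw = filters.shape
    _, Hout, Wout = dA_prev.shape
    dA = np.zeros(x_shape, dtype=np.result_type(filters, dA_prev))
    for k in range(kh):
        for l in range(kw):
            # Contribution of kernel offset (k, l) to every input channel:
            # contrib[c, i, j] = sum over a of dA_prev[a, i, j] * filters[a, c, k, l]
            contrib = np.einsum('aij,ac->cij', dA_prev, filters[:, :, k, l])
            # Output position (i, j) touches input pixel (i * stride + k, j * stride + l)
            dA[:, k:k + stride * Hout:stride, l:l + stride * Wout:stride] += contrib
    # Bias gradient: sum each filter's output gradient over all positions
    dB = dA_prev.sum(axis=(1, 2)).reshape(n_filt, 1)
    return dA, dB
```

The Python-level loop runs only `kh * kw` times (9 iterations for a 3x3 kernel), and no padded copy of the filters is needed.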