The short answer is no. The longer answer is that there is not a separate kernel per layer; a single kernel handles all of the input and output channels at once.
The code below shows, step by step, how each convolution can be computed by hand. From it we can see that, at a high level, the computation works like this:
a single matrix multiplication with the kernel matrix processes all of the colour channels at once. If you look at the kernel matrix, the values used to produce the first filter sit in its first column and the values used to produce the second filter sit in its second column. So yes, the values are different and are not reused, but they are not stored or applied separately.
import tensorflow as tf
import numpy as np
# Define a 3x3 kernel that after convolution will create an image with 2 filters (channels)
conv_layer = tf.keras.layers.Conv2D(filters=2, kernel_size=3)
# Let's create a random input image
starting_image = np.array( np.random.rand(1,4,4,3), dtype=np.float32)
# and process it
result = conv_layer(starting_image)
weight, bias = conv_layer.get_weights()
print('size of weight', weight.shape)
print('size of bias', bias.shape)
size of weight (3, 3, 3, 2)
size of bias (2,)
# The output of the convolution of the 4x4x3 image input
# is a 2x2x2 output (because we don't have padding)
result.numpy()
array([[[[-0.34940776, -0.6426925 ],
         [-0.81834394, -0.16166998]],

        [[-0.37515935, -0.28143463],
         [-0.60084903, -0.5310158 ]]]], dtype=float32)
# Now let's see how we can recreate this using the weights
# The way convolution is done is to extract a patch
# the size of the kernel (3x3 in this case)
# We will use the first patch, the first three rows and columns and all the colors
patch = starting_image[0,:3,:3,:]
print('patch.shape' , patch.shape)
# Then we flatten the patch
flat_patch = np.reshape( patch, [1,-1] )
print('New shape is', flat_patch.shape)
patch.shape (3, 3, 3)
New shape is (1, 27)
# next we take the weight and reshape it to be [-1,filters]
flat_weight = np.reshape( weight, [-1,2] )
print('flat_weight shape is ',flat_weight.shape)
flat_weight shape is (27, 2)
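This makes concrete the claim above that the values producing the first filter live in the first column of the kernel matrix and the values for the second filter in the second column. One way to confirm it is a small optional check comparing each column of flat_weight with the corresponding slice of weight:
# Optional check: column 0 of flat_weight is the flattened kernel of filter 0,
# and column 1 is the flattened kernel of filter 1
print(np.allclose(flat_weight[:, 0], weight[..., 0].reshape(-1)))  # True
print(np.allclose(flat_weight[:, 1], weight[..., 1].reshape(-1)))  # True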
# we have the patch of shape [1,27] and the weight of [27,2]
# doing a matrix multiplication of the two shapes [1,27]*[27,2] gives a shape of [1,2]
# which is the output we want, 2 filter outputs for this patch
output_for_patch = np.matmul(flat_patch,flat_weight)
# but we haven't added the bias yet, so let's do that
output_for_patch = output_for_patch + bias
# Finally, we can see that our manual calculation matches
# what Conv2D does exactly for the first patch
output_for_patch
array([[-0.34940773, -0.64269245]], dtype=float32)
If we compare this with the full convolution above, we can see that it is exactly the value for the first patch:
array([[[[-0.34940776, -0.6426925 ],
         [-0.81834394, -0.16166998]],

        [[-0.37515935, -0.28143463],
         [-0.60084903, -0.5310158 ]]]], dtype=float32)
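Rather than comparing the numbers by eye, a quick np.allclose check (allowing for floating-point rounding) can confirm that the manual result matches the top-left position of the Conv2D output:
# Optional check: our manual patch output equals the Conv2D output at position (0, 0)
print(np.allclose(result.numpy()[0, 0, 0, :], output_for_patch[0], atol=1e-6))  # True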
We would repeat this process for every patch. If we wanted to optimise this code further, instead of passing one [1, 27] image patch at a time we could pass a whole batch of patches, [batch_number, 27], at once, and the kernel would return [batch_number, filter_size] in a single matrix multiplication, as sketched below.
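Here is one way that batched version might look, assuming we simply loop over the four valid patch positions of our 4x4 input to build the patch matrix; names such as all_patches and manual_result are illustrative and not from the original code:
# Gather every 3x3 patch, flatten each one to a row of 27 values,
# and process them all with a single matrix multiplication
patches = []
for row in range(2):        # a 4x4 input and a 3x3 kernel give 2x2 output positions
    for col in range(2):
        patches.append(starting_image[0, row:row+3, col:col+3, :].reshape(-1))
all_patches = np.stack(patches)                            # shape [4, 27]
all_outputs = np.matmul(all_patches, flat_weight) + bias   # shape [4, 2]
# Reshape back to the spatial layout and compare with the Conv2D result
manual_result = all_outputs.reshape(1, 2, 2, 2)
print(np.allclose(manual_result, result.numpy(), atol=1e-6))  # True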