How does a TensorFlow image op (like nn.conv2d) expect image channels to be represented?
I'm trying to understand why my learning rate is so poor and I'm guessing it's because my input is malformed.
conv2d accepts all the forms you mentioned here. It doesn't care what the input range is, as long as the values fit within the range of the data type. From a neural network training perspective, though, it's very important that the inputs are scaled properly. This applies not only to the input image: we want the inputs to every layer to be well scaled. That's why techniques like batch normalization are present in almost all recent networks: they improve training by enabling better flow of gradients through the network. So scaling the images to the [-1, +1] range (or to zero mean and unit variance) is important.
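As a rough illustration of the two scaling options mentioned above, here is a minimal NumPy sketch. It assumes the images arrive as a uint8 batch with pixel values in [0, 255] and shape (batch, height, width, channels); the function names `scale_to_unit_range` and `standardize` are made up for this example and are not part of TensorFlow.

```python
import numpy as np


def scale_to_unit_range(images_uint8):
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    images = images_uint8.astype(np.float32)
    return images / 127.5 - 1.0


def standardize(images_uint8, eps=1e-7):
    """Give each image in the batch zero mean and unit variance."""
    images = images_uint8.astype(np.float32)
    # Reduce over every axis except the batch axis (assumed to be axis 0).
    axes = tuple(range(1, images.ndim))
    mean = images.mean(axis=axes, keepdims=True)
    std = images.std(axis=axes, keepdims=True)
    # Guard against division by zero for constant images.
    return (images - mean) / np.maximum(std, eps)


# Hypothetical input: a random batch of four 32x32 RGB images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

scaled = scale_to_unit_range(batch)   # values in [-1, 1]
normalized = standardize(batch)       # per-image zero mean, unit variance
```

Either result can then be fed to the network as a float32 tensor. Which of the two works better depends on the dataset, so it is worth trying both.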