I have a Keras model (TensorFlow backend) defined as follows:
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

num_classes = 4  # matches the dense_2 output shape in model.summary()
INPUT_SHAPE = [4740, 3540, 1]  # height, width, channels

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=INPUT_SHAPE))
# Five conv + max-pool stages; each 4x4 pooling shrinks height and width by 4.
model.add(Conv2D(2, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(4, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(8, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(16, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Conv2D(32, (4, 4), strides=(1, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(4, 4)))
model.add(Dropout(0.25))
# Classifier head
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
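The model has only 37,506 trainable parameters. Yet with any batch size greater than 1 it somehow exhausts the 12 GB of VRAM on a K80 during model.fit(). Why does this model need so much memory, and how do I calculate the memory requirement correctly? The function from "How to determine needed memory of Keras model?" gives me 2.15 GB per batch element, so I should be able to use a batch of at least 5.
Edit: output of model.summary():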
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 4738, 3538, 32)    320
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 4735, 3535, 2)     1026
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1183, 883, 2)      0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 1180, 880, 4)      132
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 295, 220, 4)       0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 292, 217, 8)       520
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 73, 54, 8)         0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 70, 51, 16)        2064
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 17, 12, 16)        0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 9, 32)         8224
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 3, 2, 32)          0
_________________________________________________________________
dropout_1 (Dropout)          (None, 3, 2, 32)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 192)               0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               24704
_________________________________________________________________
dropout_2 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_2 (Dense)              (None, 4)                 516
=================================================================
Total params: 37,506
Trainable params: 37,506
Non-trainable params: 0
_________________________________________________________________
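The parameter count is small, but memory use here is dominated by activations, not weights. The output of the first layer has shape B x 4738 x 3538 x 32 (B is the batch size). That is 536,417,408 values per sample, or roughly 2 GB x B with Keras's default float32 (about 1 GB x B only in half precision). Training also has to keep these activations for backpropagation and store gradients with respect to them, and the remaining layers add their own activations on top. Increasing the stride of the first layer should help, since it shrinks that first output.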
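As a rough check on this kind of estimate, the per-sample activation sizes can be computed directly from the output shapes in model.summary(). The sketch below is only an approximation under stated assumptions: it assumes float32 values (4 bytes each), counts forward activations only (no gradients, optimizer state, or cuDNN workspace), and the helper names `activation_bytes` and `conv_output_size` are made up for this example and are not part of Keras.

```python
# Per-sample output shapes (height, width, channels), copied from model.summary().
LAYER_OUTPUTS = [
    ("conv2d_1", (4738, 3538, 32)),
    ("conv2d_2", (4735, 3535, 2)),
    ("max_pooling2d_1", (1183, 883, 2)),
    ("conv2d_3", (1180, 880, 4)),
    ("max_pooling2d_2", (295, 220, 4)),
    ("conv2d_4", (292, 217, 8)),
    ("max_pooling2d_3", (73, 54, 8)),
    ("conv2d_5", (70, 51, 16)),
    ("max_pooling2d_4", (17, 12, 16)),
    ("conv2d_6", (14, 9, 32)),
    ("max_pooling2d_5", (3, 2, 32)),
]

BYTES_PER_VALUE = 4  # assumption: float32, the default Keras float type


def activation_bytes(shape, batch_size=1, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to hold one layer's output for `batch_size` samples."""
    count = batch_size
    for dim in shape:
        count *= dim
    return count * bytes_per_value


def conv_output_size(input_size, kernel_size, stride=1):
    """Output length along one axis of an unpadded ('valid') convolution."""
    return (input_size - kernel_size) // stride + 1


# Forward-pass activation memory per layer, for a single sample.
for name, shape in LAYER_OUTPUTS:
    print(f"{name:<16} {activation_bytes(shape) / 1e9:8.4f} GB per sample")

total = sum(activation_bytes(shape) for _, shape in LAYER_OUTPUTS)
print(f"conv/pool activations in total: {total / 1e9:.2f} GB per sample")

# Effect of using strides=(2, 2) in the first 3x3 convolution.
strided_shape = (
    conv_output_size(4740, 3, stride=2),
    conv_output_size(3540, 3, stride=2),
    32,
)
print(f"conv2d_1 with stride 2: shape {strided_shape}, "
      f"{activation_bytes(strided_shape) / 1e9:.2f} GB per sample")
```

The first layer alone comes to 2,145,669,632 bytes per sample, which matches the 2.15 GB per batch element reported in the question. With a stride of 2 the same layer's output drops to 2369 x 1769 x 32, a quarter of the size. Actual usage during model.fit() will be higher than these numbers because of the gradients and temporary buffers that the sketch leaves out.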