TensorFlow RNN cell weight sharing

Posted by Guillaume Chevalier in Dev

Guillaume Chevalier

I would like to know whether the following code shares weights between the two stacked cells:

cell = tf.nn.rnn_cell.GRUCell(hidden_dim)
stacked_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)

If they are not shared, how can weight sharing be forced in any RNN?

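Note: what I most likely want is to share weights in a nested configuration of RNNs connected input-to-output, where the first layer is cloned once for every input of the second layer (for example, sentences where the first layer reads letters and the second layer reads words gathered by iterating over the first layer's outputs).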

Ishamael

You can see that the weights are not shared by running the following script:

import numpy as np
import tensorflow as tf

with tf.variable_scope("scope1") as vs:
  cell = tf.nn.rnn_cell.GRUCell(10)
  stacked_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
  # Call the stacked cell once on dummy inputs and state so that its variables are created.
  stacked_cell(tf.Variable(np.zeros((100, 100), dtype=np.float32), name="moo"),
               tf.Variable(np.zeros((100, 100), dtype=np.float32), name="bla"))
  # Retrieve the names of the variables created under this scope
  # (the GRU weights plus the two dummy variables).
  var_names = [v.name for v in tf.all_variables()
               if v.name.startswith(vs.name)]
  print(var_names)

You will see that, besides the dummy variables, it returns two sets of GRU weights: one with "Cell1" in the names and one with "Cell0".
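To see why repeating the same Python object with [cell] * 2 does not by itself share weights, here is a framework-free toy model. It assumes, as the "Cell0"/"Cell1" names above suggest, that the stacking wrapper calls each layer under its own scope and that variables are looked up by their full scoped name. ToyVariableStore, ToyCell, and the variable name "weights" are hypothetical and not part of TensorFlow.

```python
class ToyVariableStore:
    """Toy stand-in for a name-keyed variable store (not TensorFlow code)."""

    def __init__(self):
        self.variables = {}

    def get_variable(self, scope, name):
        # A variable is identified by its full scoped name, not by the
        # Python object that asked for it.
        full_name = scope + "/" + name
        if full_name not in self.variables:
            self.variables[full_name] = object()  # placeholder for a weight tensor
        return self.variables[full_name]


class ToyCell:
    """Hypothetical cell that looks up its weights by scope on every call."""

    def __init__(self, store):
        self.store = store

    def __call__(self, scope):
        return self.store.get_variable(scope, "weights")


store = ToyVariableStore()
cell = ToyCell(store)
layers = [cell] * 2  # the same Python object, listed twice

# Assumption: the stacking wrapper calls layer i under its own scope "Cell<i>".
weights = [layer("scope1/Cell%d" % i) for i, layer in enumerate(layers)]

print(layers[0] is layers[1])    # True: one cell object
print(sorted(store.variables))   # ['scope1/Cell0/weights', 'scope1/Cell1/weights']
print(weights[0] is weights[1])  # False: two separate sets of weights
```

In this model the only way to end up with one set of weights is to make both calls resolve to the same scoped name.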

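To make them shared, you can implement your own cell class that inherits from GRUCell and always reuses the weights by always using the same variable scope: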

import numpy as np
import tensorflow as tf

class SharedGRUCell(tf.nn.rnn_cell.GRUCell):
    def __init__(self, num_units, input_size=None, activation=tf.nn.tanh):
        tf.nn.rnn_cell.GRUCell.__init__(self, num_units, input_size, activation)
        self.my_scope = None

    def __call__(self, inputs, state):
        if self.my_scope is None:
            # First call: remember the scope in which the weights are created.
            self.my_scope = tf.get_variable_scope()
        else:
            # Later calls: reuse the variables from that first scope.
            self.my_scope.reuse_variables()
        return tf.nn.rnn_cell.GRUCell.__call__(self, inputs, state, self.my_scope)

with tf.variable_scope("scope2") as vs:
  cell = SharedGRUCell(10)
  stacked_cell = tf.nn.rnn_cell.MultiRNNCell([cell] * 2)
  # Call the stacked cell once on dummy inputs and state so that its variables are created.
  stacked_cell(tf.Variable(np.zeros((20, 10), dtype=np.float32), name="moo"),
               tf.Variable(np.zeros((20, 10), dtype=np.float32), name="bla"))
  # Retrieve the names of the variables created under this scope
  # (the GRU weights plus the two dummy variables).
  var_names = [v.name for v in tf.all_variables()
               if v.name.startswith(vs.name)]
  print(var_names)
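
For the nested letters-to-words setup mentioned in the question, the same idea applies: once a cell owns a single set of weights, sharing just means calling that one cell for every word. Below is a minimal sketch in plain Python, not TensorFlow. It swaps the GRU for a simple tanh RNN cell to stay short, and TinyRNNCell, one_hot, encode_sentence, and the layer sizes are all hypothetical names and values chosen for illustration.

```python
import math
import random


class TinyRNNCell:
    """Plain tanh RNN cell that owns one set of weights (hypothetical helper)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix applied to the concatenated [input, state] vector.
        self.w = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim + hidden_dim)]
                  for _ in range(hidden_dim)]

    def step(self, x, h):
        xh = list(x) + list(h)
        return [math.tanh(sum(w_ij * v for w_ij, v in zip(row, xh)))
                for row in self.w]

    def encode(self, sequence):
        """Run the cell over a sequence and return the final state."""
        h = [0.0] * self.hidden_dim
        for x in sequence:
            h = self.step(x, h)
        return h


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyz"
letter_cell = TinyRNNCell(len(alphabet), 8, rng)  # first layer, shared by every word
word_cell = TinyRNNCell(8, 4, rng)                # second layer, runs over word vectors


def encode_sentence(sentence):
    word_vectors = []
    for word in sentence.split():
        letters = [one_hot(alphabet.index(c), len(alphabet)) for c in word]
        # The same letter_cell object, hence the same weights, for each word.
        word_vectors.append(letter_cell.encode(letters))
    return word_cell.encode(word_vectors)


sentence_vector = encode_sentence("shared weights")
print(len(sentence_vector))  # 4
```

With scope-based variable lookup, as in the scripts above, the equivalent step would be to make every clone of the first layer resolve to the same variable scope, which is what SharedGRUCell does.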