We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of the string sequences they are able to memorize. Symbolic sequences of varying complexity are generated to train RNNs and to study how parameter configurations affect the networks' capacity for learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs). We find that increasing RNN depth does not necessarily yield better memorization when training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to tune. In general, GRUs outperform LSTM networks on low-complexity sequences, while LSTMs perform better on high-complexity sequences.
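To make the compared hyper-parameters concrete, the sketch below shows one way to build stacked LSTM or GRU models with configurable depth, units per layer, and learning rate, in the spirit of the setup described above. It is a minimal illustration assuming a Keras-style workflow; the function name, default values, and vocabulary size are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch (not the paper's code) of building a depth-d LSTM or GRU
# model for next-symbol prediction, exposing the hyper-parameters studied:
# cell type, depth, units per layer, and learning rate. Values are illustrative.
import tensorflow as tf

def build_rnn(cell_type="LSTM", depth=2, units=64,
              vocab_size=8, learning_rate=1e-3):
    """Stack `depth` recurrent layers of `cell_type` over a symbol embedding."""
    layer_cls = tf.keras.layers.LSTM if cell_type == "LSTM" else tf.keras.layers.GRU
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Embedding(vocab_size, units))
    for i in range(depth):
        # All but the last recurrent layer return full sequences so they can be stacked.
        model.add(layer_cls(units, return_sequences=(i < depth - 1)))
    model.add(tf.keras.layers.Dense(vocab_size, activation="softmax"))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Example: a 3-layer GRU with 128 units per layer.
model = build_rnn(cell_type="GRU", depth=3, units=128)
```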