We explore relations between the hyper-parameters of a recurrent neural network (RNN) and the complexity of string sequences it is able to memorize. We compare long short-term memory (LSTM) networks and gated recurrent units (GRUs). We find that an increase in RNN depth does not necessarily result in better memorization capability when the training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to tune. Generally, GRUs outperform LSTM networks on low-complexity sequences, while LSTM networks perform better on high-complexity sequences.