As a replacement for data augmentation, this paper proposes SLAP, a method that intensifies experience to speed up machine learning and reduce the required sample size. SLAP is a model-independent protocol/function that produces the same output for different transformation variants of an input. In experiments with Gomoku game states, SLAP improved the convergence speed of convolutional neural network learning by 83% while using only one eighth of the sample size required with data augmentation. In reinforcement learning for Gomoku, with the AlphaGo Zero/AlphaZero algorithm plus data augmentation as the baseline, SLAP reduced the number of training samples by a factor of 8 and achieved a similar winning rate against the same evaluator, but it was not yet evident that it could speed up reinforcement learning. The benefits should apply at least to domains that are invariant to symmetry or to certain transformations. As future work, SLAP may aid more explainable learning and transfer learning for domains that are not invariant to symmetry, as a small step towards artificial general intelligence.
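The idea of producing the same output for every transformation variant can be illustrated with a small sketch. The abstract does not say how SLAP selects its output, so the rule used here is an assumption made only for illustration: among the 8 rotations and reflections of a square board, keep the one whose flattened cells are lexicographically largest. The function names `symmetry_variants` and `canonicalize` are hypothetical and not taken from the paper.

```python
import numpy as np


def symmetry_variants(board):
    """Return the 8 rotation/reflection variants of a square board."""
    variants = []
    for k in range(4):
        rotated = np.rot90(board, k)          # rotate by k * 90 degrees
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirrored copy of that rotation
    return variants


def canonicalize(board):
    """Map all symmetric variants of `board` to one shared representative.

    Assumed rule (not from the paper): pick the variant whose flattened
    cells form the lexicographically largest tuple.
    """
    return max(symmetry_variants(board), key=lambda v: tuple(v.ravel().tolist()))


# Usage: a 5x5 board with two stones (1 = first player, 2 = second player).
board = np.zeros((5, 5), dtype=int)
board[0, 1] = 1
board[3, 4] = 2

canonical = canonicalize(board)
# Every rotated or mirrored copy collapses to the same canonical board,
# so a learner sees one sample where augmentation would supply eight.
same = all(np.array_equal(canonicalize(v), canonical) for v in symmetry_variants(board))
```

Under this sketch, each state is stored once in canonical form instead of eight times, which matches the factor of 8 in sample size mentioned above. In a policy-learning setting, any move targets attached to a state would have to go through the same transformation as the board; that step is left out here.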