Deep Reinforcement Learning (DRL) often requires a large amount of data to converge during training, and in some settings every action the agent takes may incur regret. This barrier naturally motivates the owners of different data sets or environments to cooperate, sharing their knowledge so that their agents train more efficiently. However, directly merging the raw data from different owners raises privacy concerns. To solve this problem, we propose a new Deep Neural Network (DNN) architecture consisting of both a global NN and a local NN, together with a distributed training framework. The global weights are updated by all collaborating agents, while the local weights are updated only by the agent they belong to. In this way, the global weights capture the knowledge common to all collaborators, while the local NN preserves the specialized properties that keep each agent compatible with its specific environment. Experiments show that the framework efficiently helps agents in the same or similar environments collaborate during training, achieving a faster convergence rate and better performance.
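The global/local weight split described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class names, network shapes, and the use of simple parameter averaging for the global synchronization step are all illustrative assumptions.

```python
import numpy as np

class Agent:
    """Hypothetical agent with a shared 'global' feature extractor and a
    private 'local' head, mirroring the split described in the abstract."""

    def __init__(self, rng, obs_dim=4, hidden=8, n_actions=2):
        # Global weights: updated collectively by all collaborators.
        self.W_global = rng.normal(size=(obs_dim, hidden))
        # Local weights: updated only by this agent, never shared.
        self.W_local = rng.normal(size=(hidden, n_actions))

    def q_values(self, obs):
        # Global layer extracts features common across environments;
        # the local head maps them to environment-specific action values.
        h = np.tanh(obs @ self.W_global)
        return h @ self.W_local

def sync_global(agents):
    """Average only the global weights across all collaborators
    (an assumed aggregation rule); local weights stay untouched,
    so no agent's environment-specific knowledge is overwritten."""
    mean_g = np.mean([a.W_global for a in agents], axis=0)
    for a in agents:
        a.W_global = mean_g.copy()

rng = np.random.default_rng(0)
agents = [Agent(rng) for _ in range(3)]
sync_global(agents)
```

After a synchronization round, every agent holds identical global weights while its local weights remain distinct, which is the property the framework relies on to share common knowledge without erasing per-environment specialization.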