Questions remain about the robustness of data-driven learning methods when crossing the gap from simulation to reality. We use weight anchoring, a method known from continual learning, to cultivate and fix desired behavior in neural networks. Weight anchoring can be used to find a solution to one learning problem that lies near the solution of another learning problem. Learning can thereby be carried out in optimal environments without neglecting or unlearning desired behavior. We demonstrate this approach on the example of learning mixed QoS-efficient discrete resource scheduling with infrequent priority messages. Results show that this method achieves performance comparable to the state-of-the-art approach of augmenting the simulation environment, while significantly increasing robustness and steerability.
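To make the idea of finding a solution "near" another solution concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes a toy quadratic task loss and a quadratic penalty that pulls the parameters toward a stored anchor, in the spirit of penalties used in continual learning. The names `anchored_loss_grad`, `train`, `importance`, and `strength` are hypothetical and chosen for illustration only.

```python
import numpy as np


def anchored_loss_grad(theta, target, anchor, importance, strength):
    """Gradient of an assumed toy objective:
    0.5 * ||theta - target||^2
    + 0.5 * strength * sum(importance * (theta - anchor)^2).
    """
    task_grad = theta - target  # pulls toward the new task optimum
    anchor_grad = strength * importance * (theta - anchor)  # pulls toward the anchor
    return task_grad + anchor_grad


def train(target, anchor, importance, strength, lr=0.1, steps=500):
    """Plain gradient descent, started from the anchor parameters."""
    theta = anchor.copy()
    for _ in range(steps):
        theta -= lr * anchored_loss_grad(theta, target, anchor, importance, strength)
    return theta


# Hypothetical values: parameters learned on a first problem (the anchor)
# and the unconstrained optimum of a second problem (the target).
anchor = np.array([1.0, 1.0])
target = np.array([3.0, -1.0])
# Assumption: only the first parameter matters for the anchored behavior.
importance = np.array([1.0, 0.0])

free = train(target, anchor, importance, strength=0.0)
anchored = train(target, anchor, importance, strength=9.0)
print("without anchoring:", free)
print("with anchoring:   ", anchored)
```

In this toy setting, the parameter marked as important stays close to its anchor value, while the unimportant parameter is free to move to the new task's optimum. How the importance weights and the penalty strength are chosen in practice is beyond what this sketch covers.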