Machine learning algorithms have gained increasing relevance in addressing the growing modelling complexity of manufacturing decision-making problems. Reinforcement learning is a methodology with great potential due to its reduced need for prior training data: the system learns over time through actual operation. This study focuses on the implementation of a reinforcement learning algorithm for the assembly of a given object, aiming to assess the effectiveness of the proposed approach in optimising the assembly process time. A model-free Q-Learning algorithm is applied, which learns a matrix of Q-values (Q-table) through successive interactions with the environment and suggests an assembly sequence solution. The implementation explores three scenarios of increasing complexity, so that the impact of the Q-Learning parameters and rewards can be assessed to improve the reinforcement learning agent's performance. The optimisation approach achieved very promising results, learning the optimal assembly sequence 98.3% of the time.
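To make the described approach concrete, the following is a minimal sketch of tabular, model-free Q-Learning on a toy assembly-sequencing task. The environment, rewards, goal sequence, and parameter values below are illustrative assumptions for a three-part object, not the paper's actual setup.

```python
import random

random.seed(0)  # for reproducibility of this sketch

PARTS = (0, 1, 2)            # parts to assemble (hypothetical)
GOAL = (0, 1, 2)             # assumed optimal assembly order
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration
Q = {}                       # Q-table: (state, action) -> Q-value

def step(state, action):
    """Append a part to the partial assembly; illustrative rewards:
    -10 for an invalid move, +10 for completing the goal sequence,
    -1 per step otherwise (encouraging short, correct sequences)."""
    if action in state:
        return state, -10.0, True
    nxt = state + (action,)
    if len(nxt) == len(PARTS):
        return nxt, (10.0 if nxt == GOAL else -1.0), True
    return nxt, -1.0, False

for episode in range(2000):
    state, done = (), False
    while not done:
        acts = [a for a in PARTS if a not in state]
        if random.random() < EPS:            # epsilon-greedy exploration
            action = random.choice(acts)
        else:
            action = max(acts, key=lambda a: Q.get((state, a), 0.0))
        nxt, reward, done = step(state, action)
        nxt_acts = [a for a in PARTS if a not in nxt]
        best_next = max((Q.get((nxt, a), 0.0) for a in nxt_acts), default=0.0)
        old = Q.get((state, action), 0.0)
        # Standard Q-Learning update from each environment interaction
        Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
        state = nxt

# Greedy rollout over the learned Q-table yields the suggested sequence
learned = ()
while len(learned) < len(PARTS):
    acts = [a for a in PARTS if a not in learned]
    learned += (max(acts, key=lambda a: Q.get((learned, a), 0.0)),)
print(learned)
```

In the paper's larger scenarios the same update rule applies; only the state and action spaces (and hence the Q-table size) grow, which is why tuning the parameters and reward shaping matters for agent performance.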