Highly automated assembly lines enable significant productivity gains in the manufacturing industry, particularly under mass production conditions. Nonetheless, challenges persist in job scheduling for make-to-job and mass customization, necessitating further investigation to improve efficiency, reduce tardiness, and promote safety and reliability. In this contribution, an advantage actor-critic based reinforcement learning method is proposed to address scheduling problems of distributed flexible assembly lines in real time. To enhance performance, a more condensed environment representation is proposed, which is designed to work with masks produced by priority dispatching rules to generate a fixed and advantageous action space. Moreover, a Monte-Carlo tree search based soft shielding component is developed to help address unsafe behaviors that depend on long action sequences and to monitor the risk of overdue scheduling. Finally, the proposed algorithm and its soft shielding component are validated through performance evaluation.
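
As a rough illustration of two ingredients named above, the following minimal sketch shows how a mask over a fixed-size action vector can zero out disallowed actions before sampling, and how a one-step advantage estimate is commonly computed in actor-critic methods. It is not the authors' implementation: the function names `masked_softmax` and `advantage`, the discount factor, and the example numbers are all hypothetical, and the abstract does not specify how the masks are derived from the priority dispatching rules.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over a fixed-size action vector; masked-out actions get probability 0.

    `mask[i]` is True when action i is allowed. In this sketch the mask is
    assumed to come from priority dispatching rules (a hypothetical input).
    """
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    allowed = [l for l, m in zip(logits, mask) if m]
    if not allowed:
        raise ValueError("mask must allow at least one action")
    peak = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def advantage(reward, value, next_value, gamma=0.99):
    """One-step advantage estimate: r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


# Example: three candidate actions, the second one masked out.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
adv = advantage(reward=1.0, value=0.5, next_value=0.0, gamma=0.9)
```

Because the action vector keeps a fixed length and only the mask changes, the policy network's output layer does not need to be resized as the set of schedulable jobs changes, which is one plausible reading of the "fixed action space" mentioned above.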