NICE: Robust Scheduling through Reinforcement Learning-Guided Integer Programming

Integer programs provide a powerful abstraction for representing a wide range of real-world scheduling problems. Despite their ability to model general scheduling problems, solving large-scale integer programs (IPs) remains a computational challenge in practice. The incorporation of more complex objectives, such as robustness to disruptions, further exacerbates the computational challenge. We present NICE (Neural network IP Coefficient Extraction), a novel technique that combines reinforcement learning and integer programming to tackle the problem of robust scheduling. More specifically, NICE uses reinforcement learning to approximately represent complex objectives in an integer programming formulation. We use NICE to determine assignments of pilots to a flight crew schedule so as to reduce the impact of disruptions. We compare NICE with (1) a baseline integer programming formulation that produces a feasible crew schedule, and (2) a robust integer programming formulation that explicitly tries to minimize the impact of disruptions. Our experiments show that, across a variety of scenarios, NICE produces schedules resulting in 33% to 48% fewer disruptions than the baseline formulation. Moreover, in more severely constrained scheduling scenarios in which the robust integer program fails to produce a schedule within 90 minutes, NICE is able to build robust schedules in less than 2 seconds on average.
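To make the idea of placing learned coefficients in an integer programming objective concrete, the following is a minimal sketch and not the authors' formulation. It assumes a toy problem in which each of three pilots is assigned to exactly one of three flight slots, and it assumes that a lower coefficient means a more robust pairing. The matrix `learned_coeff` is a hypothetical stand-in for values that would be extracted from a trained network; the numbers are invented for illustration. A brute-force search replaces a real IP solver so that the example stays self-contained.

```python
from itertools import permutations


def solve_assignment(coeff):
    """Solve a tiny assignment IP by enumeration.

    coeff[p][s] is the objective coefficient for giving pilot p slot s.
    Each pilot receives exactly one slot and each slot exactly one pilot,
    which is the feasible set of the classic assignment problem.
    """
    n = len(coeff)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[p] is the slot given to pilot p in this candidate schedule.
        cost = sum(coeff[p][perm[p]] for p in range(n))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm, best_cost


# Hypothetical coefficients (assumption: integer-scaled scores where
# lower is better). In a NICE-style approach these would come from a
# trained network rather than being written by hand.
learned_coeff = [
    [9, 2, 5],
    [4, 7, 1],
    [3, 6, 8],
]

assignment, cost = solve_assignment(learned_coeff)
print(assignment, cost)
```

The point of the sketch is the division of labor: the constraints stay a plain assignment problem, and the hard-to-model objective (robustness) enters only through the coefficient matrix. Enumeration is practical only for toy sizes; a real instance would hand the same coefficients to an IP solver.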
