We investigate a learning-based decision support system for vehicle routing, where the routing engine learns the implicit preferences that human planners have when manually creating route plans (or routings). The goal is to use these learned subjective preferences on top of the distance-based objective criterion of vehicle routing systems. This offers an alternative to the practice of formulating a distinct custom VRP for every company with its own routing requirements. Instead, we assume the presence of past vehicle routing solutions over similar sets of customers, and learn to make similar choices. The learning approach is based on the concept of learning a Markov model, which corresponds to a probabilistic transition matrix rather than a deterministic distance matrix. This nevertheless allows us to use existing arc-routing VRP software to create the actual routings, and to optimize over both distances and preferences at the same time. For the learning, we explore different schemes for constructing the probabilistic transition matrix so that it can co-evolve with preferences that change over time. Our results on a use case with a small transportation company show that our method generates solutions close to the manually created ones, without needing to characterize all constraints and sub-objectives explicitly. Even when the customer sets change, our method finds solutions that are closer to the actual routings than those obtained using only distances, and hence solutions that require fewer manual changes when transformed into practical routings.
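The core idea above can be illustrated with a minimal sketch: estimate a Markov transition matrix from past routings via arc counts (with smoothing, so unseen arcs remain possible), then blend its negative log-probabilities with normalized distances into a single arc-cost matrix that any distance-based VRP solver can consume. All function names, the smoothing scheme, and the blending weight here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def transition_matrix(routes, n_stops, alpha=1.0):
    """Estimate a Markov transition matrix from historical routes.

    routes: list of routes, each a sequence of stop indices (0 = depot).
    alpha: Laplace smoothing count so unseen arcs keep nonzero probability.
    (Illustrative sketch; the paper explores several construction schemes.)
    """
    counts = np.full((n_stops, n_stops), alpha)
    np.fill_diagonal(counts, 0.0)          # no self-transitions
    for route in routes:
        for a, b in zip(route, route[1:]):  # count each observed arc a -> b
            counts[a, b] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def preference_cost(P, dist, weight=0.5):
    """Blend -log transition probabilities with normalized distances into an
    arc-cost matrix usable by a standard distance-based VRP solver.
    `weight` trades off distance vs. learned preference (assumed parameter)."""
    with np.errstate(divide="ignore"):
        pref = -np.log(P)                   # low cost for frequently used arcs
    finite_max = pref[np.isfinite(pref)].max()
    pref[np.isinf(pref)] = finite_max       # zero-probability arcs -> max cost
    return weight * (dist / dist.max()) + (1.0 - weight) * (pref / finite_max)
```

The blended matrix replaces the plain distance matrix in the solver call, which is what lets existing routing software optimize distances and learned preferences simultaneously.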