规划在多波层风险环境中的一揽子交付规划 (Planning for Package Deliveries in Risky Environments Over Multiple Epochs)

We study a risk-aware robot planning problem where a dispatcher must construct a package delivery plan that maximizes the expected reward for a robot delivering packages across multiple epochs. Each package has an associated reward for delivery and a risk of failure. If the robot fails while delivering a package, no future packages can be delivered and the cost of replacing the robot is incurred. The package delivery plan takes place over the course of either a finite or an infinite number of epochs, denoted as the finite horizon problem and infinite horizon problem, respectively. The dispatcher has to weigh the risk and reward of delivering packages during any given epoch against the potential loss of any future epoch's reward. By using the ratio between a package's reward and its risk of failure, we prove an optimal, greedy solution to both the infinite and finite horizon problems. The finite horizon problem can be solved optimally in $O(K n\log n)$ time where $K$ is the number of epochs and $n$ is the number of packages. We show an isomorphism between the infinite horizon problem and Markov Decision Processes to prove an optimal $O(n)$ time algorithm for the infinite horizon problem.

翻译：我们研究一个有风险意识的机器人规划问题, 调度员必须建造一个包件交付计划, 使在多个时代交付包件的预期回报最大化。每个包件都有相应的交付奖励和失败风险。如果机器人在交付包件时失败, 未来包件无法交付, 替换机器人的成本就会发生。包件交付计划可以在一个有限或无限数目的时代进行, 分别以有限地平线问题和无限地平线问题为标志。调度员必须权衡在任何特定时代交付包件的风险和奖励, 以防范未来包件奖励可能的损失。通过使用包件奖励与失败风险之间的比率, 我们证明对无限和有限地平面问题都是最佳的、贪婪的解决办法。有限地平线问题可以用美元( k n\log n) 来最佳地平线问题解决, 其中美元是局部地平线问题的数量, 美元是局部地平线问题的数量, 美元是包件的数量。我们在无限地平面问题和Markov决定进程之间展示了无限地平面时程问题。