Before delegating a task to an autonomous system, a human operator may want a guarantee about the behavior of the system. This paper extends previous work on conformal prediction for functional data and conformalized quantile regression to provide conformal prediction intervals over the future behavior of an autonomous system executing a fixed control policy on a Markov Decision Process (MDP). The prediction intervals are constructed by applying conformal corrections to prediction intervals computed by quantile regression. The resulting intervals guarantee that with probability $1-\delta$ the observed trajectory will lie inside the prediction interval, where the probability is computed with respect to the starting state distribution and the stochasticity of the MDP. The method is illustrated on MDPs for invasive species management and StarCraft2 battles.
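The conformal correction step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a split-conformal setup with exchangeable calibration trajectories, a one-dimensional observed quantity, and an additive margin applied uniformly over the horizon. The names `conformal_margin`, `widen`, and `demo` are hypothetical, and the square-root band in `demo` is only a placeholder for the output of a fitted quantile regressor.

```python
import math
import random


def conformal_margin(calibration, delta):
    """Additive margin q for widening trajectory bands.

    calibration: list of (lower, upper, observed) triples, each a sequence
    of per-time-step values for one held-out trajectory (hypothetical layout).
    Returns the ceil((n + 1) * (1 - delta))-th smallest score, or infinity
    when the calibration set is too small for the requested level.
    """
    scores = []
    for lower, upper, observed in calibration:
        # Score: worst violation of the band over the whole horizon
        # (negative when the trajectory stays strictly inside the band).
        scores.append(max(max(lo - y, y - hi)
                          for lo, hi, y in zip(lower, upper, observed)))
    scores.sort()
    n = len(scores)
    k = math.ceil((n + 1) * (1 - delta))
    return math.inf if k > n else scores[k - 1]


def widen(lower, upper, q):
    """Apply the margin to a predicted band."""
    return [lo - q for lo in lower], [hi + q for hi in upper]


def demo(seed=0, n_cal=200, horizon=10, delta=0.1):
    """Toy check on a Gaussian random walk (an assumed stand-in for an MDP rollout)."""
    rng = random.Random(seed)

    def rollout():
        x, traj = 0.0, []
        for _ in range(horizon):
            x += rng.gauss(0.0, 1.0)
            traj.append(x)
        return traj

    # Placeholder band; in practice this would come from quantile regression.
    lower = [-math.sqrt(t + 1) for t in range(horizon)]
    upper = [math.sqrt(t + 1) for t in range(horizon)]

    calibration = [(lower, upper, rollout()) for _ in range(n_cal)]
    q = conformal_margin(calibration, delta)
    lo_q, hi_q = widen(lower, upper, q)

    # Empirical fraction of fresh trajectories lying entirely inside the band.
    test = [rollout() for _ in range(1000)]
    covered = sum(all(l <= y <= h for l, h, y in zip(lo_q, hi_q, traj))
                  for traj in test)
    return q, covered / len(test)


if __name__ == "__main__":
    margin, coverage = demo()
    print(f"margin={margin:.3f}, empirical coverage={coverage:.3f}")
```

Taking the maximum violation over time steps as the score makes the coverage statement apply to the whole trajectory at once rather than to each step separately. Under exchangeability the guarantee is marginal, that is, averaged over the draw of the calibration set and the new trajectory.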