Conformal Prediction (CP) is a statistical machine learning tool for constructing uncertainty sets with coverage guarantees, and it has been widely adopted to generate prediction regions for decision-making tasks such as Trajectory Optimization (TO) in uncertain environments. However, existing methods predominantly employ a sequential scheme in which decisions depend unidirectionally on the prediction regions, so information from decision-making is never fed back to guide CP. In this paper, we propose a novel Feedback-Based CP (Fb-CP) framework for shrinking-horizon TO with a joint risk constraint over the entire mission time. Specifically, we develop a CP-based posterior risk calculation method that uses the realized trajectories to adjust the posterior allowable risk, which is then allocated to future time steps to update the prediction regions. In this way, the information in the realized trajectories is continuously fed back to CP, enabling feedback-based adjustment of the prediction regions and a provable online improvement in trajectory performance. We further prove that these adjustments always preserve the coverage guarantees of the prediction regions, thereby ensuring provable safety. To allocate the posterior allowable risk, we develop a decision-focused iterative risk allocation algorithm that aligns closely with Fb-CP, and we provide a theoretical convergence analysis for it. We also extend the proposed method to handle distribution shift. Benchmark experiments demonstrate the effectiveness and superiority of the proposed method.
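As background for the link between allowable risk and prediction-region size, the following is a minimal sketch of standard split conformal prediction combined with a uniform union-bound split of a joint risk budget over time steps. It is not the Fb-CP method: the posterior risk calculation and the decision-focused allocation are not reproduced here, and the function names `conformal_radius` and `allocate_uniform`, as well as the synthetic calibration scores, are assumptions made purely for illustration.

```python
import math
import numpy as np


def conformal_radius(cal_scores, delta):
    """Split-conformal radius: the ceil((n + 1) * (1 - delta))-th smallest
    calibration score. Under exchangeability, a fresh score is at most this
    value with probability at least 1 - delta."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - delta))
    if k > n:
        # Too few calibration samples to certify this risk level.
        return math.inf
    return float(np.sort(cal_scores)[k - 1])


def allocate_uniform(total_risk, horizon):
    """Union-bound (Bonferroni) split of a joint risk budget: equal
    per-step risk levels that sum to total_risk (illustrative baseline)."""
    return [total_risk / horizon] * horizon


# Hypothetical calibration scores, e.g. past prediction errors of an obstacle.
rng = np.random.default_rng(0)
scores = np.abs(rng.normal(size=999))

# Joint risk 0.1 over 5 time steps gives per-step risk 0.02.
per_step = allocate_uniform(0.1, 5)
radii = [conformal_radius(scores, d) for d in per_step]

# Granting a step a larger share of the risk budget shrinks its region.
looser_radius = conformal_radius(scores, 0.05)
```

In this sketch a larger per-step risk level never yields a larger radius, which is why re-allocating allowable risk to future time steps can tighten their prediction regions.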