In the expeditionary sciences, spatiotemporally varying environments, such as hydrothermal plumes, algal blooms, lava flows, and animal migrations, are ubiquitous. Mobile robots are uniquely well suited to studying these dynamic, mesoscale natural environments. We formalize expeditionary science as a sequential decision-making problem, modeled in the language of partially observable Markov decision processes (POMDPs). Solving the expeditionary science POMDP under real-world constraints requires efficient probabilistic modeling and decision-making in problems with complex dynamics and observation models. Previous work in informative path planning, adaptive sampling, and experimental design has shown compelling results, largely in static environments, using data-driven models and information-based rewards. However, these methodologies do not trivially extend to expeditionary science in spatiotemporal environments, for three reasons: they generally do not make use of scientific knowledge such as equations of state dynamics, they focus on information gathering rather than scientific task execution, and they rely on decision-making approaches that scale poorly to large, continuous problems with long planning horizons and real-time operational constraints. In this work, we discuss these and other challenges related to probabilistic modeling and decision-making in expeditionary science, and we present some of our preliminary work that addresses these gaps. We ground our results in a real expeditionary science deployment of an autonomous underwater vehicle (AUV) in the deep ocean for hydrothermal vent discovery and characterization. Our concluding thoughts highlight the work that remains and the challenges that merit consideration by the reinforcement learning and decision-making community.