Computational-level explanations based on optimal feedback control with signal-dependent noise account for a wide range of phenomena in human sensorimotor behavior. However, a cost function typically has to be assumed for each task, and the optimality of human behavior is then evaluated by comparing observed and predicted trajectories. Here, we introduce inverse optimal control with signal-dependent noise, which allows the cost function to be inferred from observed behavior. To do so, we formalize the problem as a partially observable Markov decision process and distinguish between the agent's and the experimenter's inference problems. Specifically, we derive a probabilistic formulation of the evolution of states and belief states and an approximation to the propagation equation in the linear-quadratic Gaussian problem with signal-dependent noise. We further extend the model to the case in which the state variables are only partially observable to the experimenter. We show the feasibility of the approach through validation on synthetic data and application to experimental data. Our approach makes it possible to recover the costs and benefits implicit in human sequential sensorimotor behavior, thereby reconciling normative and descriptive approaches in a computational framework.
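For orientation, the linear-quadratic Gaussian problem with signal-dependent noise is often written in the following form in the optimal feedback control literature. This is a sketch under assumed notation, not the formulation derived in the paper: the symbols $A$, $B$, $H$, $C_i$, $D_j$, $Q_t$, $R_t$, the parameter vector $\theta$, and the observed data $\mathcal{D}$ are all introduced here for illustration only.

```latex
% Assumed notation (not taken from the abstract):
%   x_t state, u_t control, y_t the agent's sensory observation,
%   \xi_t, \omega_t additive Gaussian noise,
%   \varepsilon_t^i, \epsilon_t^j zero-mean, unit-variance scalar noise terms
\begin{align}
  x_{t+1} &= A x_t + B u_t + \xi_t + \sum_{i} \varepsilon_t^{i} C_i u_t
      && \text{(dynamics with control-dependent noise)} \\
  y_t &= H x_t + \omega_t + \sum_{j} \epsilon_t^{j} D_j x_t
      && \text{(observation with state-dependent noise)} \\
  J(\theta) &= \mathbb{E}\left[ \sum_{t=1}^{T} \left( x_t^\top Q_t x_t + u_t^\top R_t u_t \right) \right]
      && \text{(quadratic cost, parameterized by } \theta \text{)} \\
  \hat{\theta} &= \arg\max_{\theta} \; p(\mathcal{D} \mid \theta)
      && \text{(experimenter's inference from observed behavior)}
\end{align}
```

In this reading, the agent's problem is to choose controls $u_t$ that minimize $J(\theta)$ given only its noisy observations $y_t$, while the experimenter's problem runs in the opposite direction: given recorded behavior $\mathcal{D}$, find the cost parameters $\theta$ under which that behavior is most probable.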