Describing systems in terms of choices and their resulting costs and rewards offers the promise of freeing algorithm designers and programmers from specifying how those choices should be made; in implementations, the choices can be realized by optimization techniques and, increasingly, by machine-learning methods. We study this approach from a programming-language perspective. We define two small languages that support decision-making abstractions: one with choices and rewards, and the other additionally with probabilities. We give both operational and denotational semantics. In the case of the second language we consider three denotational semantics, with varying degrees of correlation between possible program values and expected rewards. The operational semantics combine the usual semantics of standard constructs with optimization over spaces of possible execution strategies. The denotational semantics, which are compositional, rely on the selection monad, to handle choice, augmented with an auxiliary monad to handle other effects, such as rewards or probability. We establish adequacy theorems that the two semantics coincide in all cases. We also prove full abstraction at base types, with varying notions of observation in the probabilistic case corresponding to the various degrees of correlation. We present axioms for choice combined with rewards and probability, establishing completeness at base types for the case of rewards without probability.
翻译:通过选项及其成本和收益的结果来描述系统,可以使算法设计人员和程序员从指定如何进行选择操作的过程中解放出来;在实现中,选择可以通过优化技术和机器学习方法实现。本文从编程语言的角度对这种方法进行了研究。我们定义了两种支持决策抽象的小语言:一种具有选项和奖励,另一种还具有概率。我们提供了操作和指示意义语义。在第二种语言的情况下,我们考虑了三种指示意义,它们之间的相关性程度不同。操作语义将标准构造的常规语义与优化相结合,以处理可能的执行策略空间。指示意义通过使用选择monad来处理选择并附加辅助monad来处理其他效果(如奖励或概率)。我们建立了两个语义的充分性定理,在所有情况下两个语义都是一致的。在基本类型上,我们还证明了完全抽象,各种观测方式对应于概率情况下的各种相关性程度。我们提出了选择与奖励和概率的公理,对于不包含概率的奖励的基本类型而言,建立了完整性。