Describing systems in terms of choices and their resulting costs and rewards offers the promise of freeing algorithm designers and programmers from specifying how those choices should be made; in implementations, the choices can be realized by optimization techniques and, increasingly, by machine-learning methods. We study this approach from a programming-language perspective. We define two small languages that support decision-making abstractions: one with choices and rewards, and the other additionally with probabilities. We give both operational and denotational semantics. In the case of the second language we consider three denotational semantics, with varying degrees of correlation between possible program values and expected rewards. The operational semantics combine the usual semantics of standard constructs with optimization over spaces of possible execution strategies. The denotational semantics, which are compositional, rely on the selection monad to handle choice, augmented with an auxiliary monad to handle other effects, such as rewards or probability. We establish adequacy theorems stating that the two semantics coincide in all cases. We also prove full abstraction at base types, with varying notions of observation in the probabilistic case corresponding to the various degrees of correlation. We present axioms for choice combined with rewards and probability, establishing completeness at base types for the case of rewards without probability.
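To illustrate how a selection monad can resolve choices by optimization, the following is a minimal sketch of the plain selection monad, in which a computation of type A is a function from a reward continuation (A -> R) to a selected value of type A. It is not the semantics developed in the paper: it omits the auxiliary monad for rewards and probability, takes the reward as a single final function, and uses the hypothetical names `unit`, `bind`, `choose`, `program`, and `reward` purely for illustration.

```python
from typing import Callable, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
Reward = float

# A selection computation: given a reward continuation, return the selected value.
Selection = Callable[[Callable[[A], Reward]], A]


def unit(x: A) -> "Selection[A]":
    """Computation with no choices: always yields x, ignoring the continuation."""
    return lambda k: x


def bind(e: "Selection[A]", f: Callable[[A], "Selection[B]"]) -> "Selection[B]":
    """Sequence e with f; e selects with knowledge of how f will later select."""
    def selected(k: Callable[[B], Reward]) -> B:
        def rest(a: A) -> B:
            # Final value produced if the first computation yields a.
            return f(a)(k)
        # The first computation optimizes the reward of the eventual result.
        best_a = e(lambda a: k(rest(a)))
        return rest(best_a)
    return selected


def choose(options: Sequence[A]) -> "Selection[A]":
    """Choice realized as argmax over a finite set of options (an assumption)."""
    return lambda k: max(options, key=k)


# Hypothetical program: choose x, then y, and return the pair.
program = bind(
    choose([0, 1, 2]),
    lambda x: bind(choose([0, 1, 2]), lambda y: unit((x, y))),
)


def reward(p: Tuple[int, int]) -> Reward:
    # Illustrative reward, maximal at x = 2 and y = 1.
    x, y = p
    return -float((x - 2) ** 2 + (y - 1) ** 2)


best = program(reward)
```

In this sketch the program text only states that choices exist; `bind` threads the reward continuation backwards so that each choice is made in light of the choices that follow it.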