We study a recommender system for quantum data using the linear contextual bandit framework. In each round, a learner receives an observable (the context) and must recommend which state to measure from a finite set of unknown quantum states (the actions). The learner's goal is to maximize the reward in each round, which is the outcome of the measurement on the unknown state. Using this model, we formulate the low-energy quantum state recommendation problem, where the context is a Hamiltonian and the goal is to recommend the state with the lowest energy. For this task, we study two families of contexts: the Ising model and a generalized cluster model. We observe that if the actions are interpreted as different phases of the models, then recommendation amounts to classifying the correct phase of the given Hamiltonian, and the strategy can be interpreted as an online quantum phase classifier.
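To make the setting concrete, the following is a minimal single-qubit sketch of the recommendation loop, not the strategy analyzed in this work. It relies on the fact that the expected measurement outcome Tr(Hρ) is linear in the Pauli coefficients of H, so each unknown state can be treated as an unknown parameter vector. Everything else is an assumption made for illustration: the toy context family H(φ) = cos(φ) Z + sin(φ) X (in place of the Ising and cluster models), the four candidate states, the LinUCB-style selection rule, and all names (`pauli_vector`, `measure`, `run_bandit`, `recommend`, `alpha`, `lam`).

```python
import numpy as np

# Pauli matrices: an operator basis for single-qubit observables.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULIS = [I2, X, Y, Z]


def pauli_vector(op):
    """Real coefficients c_i = Tr(P_i op) / 2, so that op = sum_i c_i P_i."""
    return np.array([np.trace(P @ op).real / 2 for P in PAULIS])


def ket_to_rho(ket):
    """Density matrix of a (normalized) pure state."""
    ket = np.asarray(ket, dtype=complex)
    ket = ket / np.linalg.norm(ket)
    return np.outer(ket, ket.conj())


def measure(obs, rho, rng):
    """Sample one projective measurement outcome of obs on state rho."""
    evals, evecs = np.linalg.eigh(obs)
    # Born probabilities: diagonal of V^dagger rho V.
    probs = np.real(np.einsum("ij,jk,ki->i", evecs.conj().T, rho, evecs))
    probs = np.clip(probs, 0.0, None)
    probs /= probs.sum()
    return evals[rng.choice(len(evals), p=probs)]


# Hypothetical action set: |0>, |1>, |+>, |-> (unknown to the learner).
STATES = [ket_to_rho(k) for k in ([1, 0], [0, 1], [1, 1], [1, -1])]


def run_bandit(states, n_rounds=2000, alpha=1.0, lam=1.0, seed=0):
    """LinUCB-style loop (an assumed stand-in strategy); returns per-action estimates."""
    rng = np.random.default_rng(seed)
    d = len(PAULIS)
    A = [lam * np.eye(d) for _ in states]  # regularized Gram matrix per action
    b = [np.zeros(d) for _ in states]      # reward-weighted feature sum per action
    for _ in range(n_rounds):
        phi = rng.uniform(0, 2 * np.pi)
        H = np.cos(phi) * Z + np.sin(phi) * X  # toy context family (assumption)
        x = pauli_vector(H)
        scores = []
        for A_a, b_a in zip(A, b):
            A_inv = np.linalg.inv(A_a)
            theta_hat = A_inv @ b_a
            # Estimated reward plus an exploration bonus.
            scores.append(theta_hat @ x + alpha * np.sqrt(x @ A_inv @ x))
        a = int(np.argmax(scores))
        reward = -measure(H, states[a], rng)  # low energy means high reward
        A[a] += np.outer(x, x)
        b[a] += reward * x
    return [np.linalg.solve(A_a, b_a) for A_a, b_a in zip(A, b)]


def recommend(theta_hats, H):
    """Greedy recommendation: index of the action with the lowest estimated energy."""
    x = pauli_vector(H)
    return int(np.argmax([th @ x for th in theta_hats]))


if __name__ == "__main__":
    theta_hats = run_bandit(STATES)
    print("recommended action for H = Z:", recommend(theta_hats, Z))
```

In this toy family each of the four states has the lowest energy on one quarter of the angles φ, so the greedy recommendation effectively labels which region a given Hamiltonian belongs to. This is a small analogue of the phase-classification reading described above.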