Humans have come to rely on machines for reducing excessive information to manageable representations. But this reliance can be abused -- strategic machines might craft representations that manipulate their users. How can a user make good choices based on strategic representations? We formalize this as a learning problem, and pursue algorithms for decision-making that are robust to manipulation. In our main setting of interest, the system represents attributes of an item to the user, who then decides whether or not to consume. We model this interaction through the lens of strategic classification (Hardt et al. 2016), reversed: the user, who learns, plays first; and the system, which responds, plays second. The system must respond with representations that reveal `nothing but the truth' but need not reveal the entire truth. Thus, the user faces the problem of learning set functions under strategic subset selection, which presents distinct algorithmic and statistical challenges. Our main result is a learning algorithm that minimizes error despite strategic representations, and our theoretical analysis sheds light on the trade-off between learning effort and susceptibility to manipulation.
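To make the interaction described above concrete, here is a minimal toy sketch (not the paper's algorithm). All names are hypothetical: `user_rule` stands in for a decision rule the user commits to, `k` is an assumed bound on how many attributes the system may reveal, and the attribute strings are illustrative. The sketch shows the reversed order of play: the user's rule is fixed first, and the system then best-responds with a truthful but selective subset of the item's attributes.

```python
# Toy model of strategic representation (illustrative sketch, not the
# paper's method): the system may reveal only TRUE attributes of the
# item ("nothing but the truth"), but it chooses WHICH ones to reveal
# so as to make the user consume.
from itertools import chain, combinations


def subsets(attrs, k):
    """All subsets of `attrs` of size at most k (an assumed budget on the representation)."""
    return chain.from_iterable(combinations(sorted(attrs), r) for r in range(k + 1))


def user_rule(revealed, liked):
    """Hypothetical committed user rule: consume iff the revealed subset
    contains at least two attributes the user likes."""
    return len(set(revealed) & liked) >= 2


def system_best_response(true_attrs, k, rule, liked):
    """Strategic system: among truthful subsets of size <= k, return one
    that the user's committed rule accepts, if any exists."""
    for rep in subsets(true_attrs, k):
        if rule(rep, liked):
            return rep                     # truthful, but not the whole truth
    return tuple(sorted(true_attrs))[:k]   # no accepting subset exists


if __name__ == "__main__":
    liked = {"organic", "local", "cheap"}
    item = {"organic", "cheap", "expired"}   # the full truth includes a bad attribute
    rep = system_best_response(item, k=2, rule=user_rule, liked=liked)
    print("revealed:", rep, "-> consume?", user_rule(rep, liked))
```

In this toy run the system hides "expired" and reveals only "cheap" and "organic", so the naive rule accepts an item the user would likely reject given full information; a manipulation-robust rule would have to anticipate this subset selection.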