Solving Dynamic Principal-Agent Problems with a Rationally Inattentive Principal

Principal-Agent (PA) problems describe a broad class of economic relationships characterized by misaligned incentives and asymmetric information. The Principal's problem is to find optimal incentives given the available information, e.g., a manager setting optimal wages for their employees. Whereas the Principal is often assumed to be rational, comparatively little is known about solutions when the Principal is boundedly rational, especially in the sequential setting, with multiple Agents, and with multiple information channels. Here, we develop RIRL, a deep reinforcement learning framework that solves such complex PA problems with a rationally inattentive Principal. Such a Principal incurs a cost for paying attention to information, which can model forms of bounded rationality. We use RIRL to analyze rich economic phenomena in manager-employee relationships. In the single-step setting, 1) RIRL yields wages that are consistent with theoretical predictions; and 2) non-zero attention costs lead to simpler but less profitable wage structures, and increased Agent welfare. In a sequential setting with multiple Agents, RIRL shows opposing consequences of the Principal's inattention to different information channels: 1) inattention to Agents' outputs closes wage gaps based on ability differences; and 2) inattention to Agents' efforts induces a social dilemma dynamic in which Agents work harder, but essentially for free. Moreover, RIRL reveals non-trivial relationships between the Principal's inattention and Agent types, e.g., if Agents are prone to sub-optimal effort choices, payment schedules are more sensitive to the Principal's attention cost. As such, RIRL can reveal novel economic relationships and enables progress towards understanding the effects of bounded rationality in dynamic settings.
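The abstract does not state the functional form of the attention cost. A common choice in the rational inattention literature is to charge the decision maker in proportion to the mutual information between the true state and the signal it observes. The following is a minimal sketch under that assumption; the function names, the discrete channel representation, and the `attention_cost` parameter are hypothetical illustrations, not RIRL's actual implementation.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mutual_information(prior, channel):
    """I(X; S) in nats.

    prior[x]      : probability of hidden state x (e.g., an Agent's effort level)
    channel[x][s] : probability that the Principal observes signal s given x
    """
    n_states = len(prior)
    n_signals = len(channel[0])
    # Marginal distribution of the signal, p(s) = sum_x p(x) p(s | x).
    marginal = [
        sum(prior[x] * channel[x][s] for x in range(n_states))
        for s in range(n_signals)
    ]
    # Conditional entropy H(S | X) = sum_x p(x) H(S | X = x).
    conditional = sum(prior[x] * entropy(channel[x]) for x in range(n_states))
    return entropy(marginal) - conditional


def net_utility(expected_profit, prior, channel, attention_cost):
    """Assumed objective: expected profit minus a mutual-information penalty."""
    return expected_profit - attention_cost * mutual_information(prior, channel)


# Two illustrative channels over a binary state with a uniform prior.
prior = [0.5, 0.5]
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # signal carries no information
perfect = [[1.0, 0.0], [0.0, 1.0]]        # signal reveals the state exactly
```

Under this assumed cost, the uninformative channel is free (zero mutual information), while the perfect channel over a uniform binary state costs `attention_cost * math.log(2)`. A higher `attention_cost` therefore pushes the Principal towards coarser signals, which is one way to read the abstract's claim that non-zero attention costs lead to simpler wage structures.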
