On-line firms deploy suites of software platforms, where each platform is designed to interact with users during a certain activity, such as browsing, chatting, socializing, emailing, driving, etc. The economic and incentive structure of this exchange, as well as its algorithmic nature, have not been explored to our knowledge. We model this interaction as a Stackelberg game between a Designer and one or more Agents. We model an Agent as a Markov chain whose states are activities; we assume that the Agent's utility is a linear function of the steady-state distribution of this chain. The Designer may design a platform for each of these activities/states; if a platform is adopted by the Agent, the transition probabilities of the Markov chain are affected, and so is the objective of the Agent. The Designer's utility is a linear function of the steady state probabilities of the accessible states minus the development cost of the platforms. The underlying optimization problem of the Agent -- how to choose the states for which to adopt the platform -- is an MDP. If this MDP has a simple yet plausible structure (the transition probabilities from one state to another only depend on the target state and the recurrent probability of the current state) the Agent's problem can be solved by a greedy algorithm. The Designer's optimization problem (designing a custom suite for the Agent so as to optimize, through the Agent's optimum reaction, the Designer's revenue), is NP-hard to approximate within any finite ratio; however, the special case, while still NP-hard, has an FPTAS. These results generalize from a single Agent to a distribution of Agents with finite support, as well as to the setting where the Designer must find the best response to the existing strategies of other Designers. We discuss other implications of our results and directions of future research.
翻译:在线公司部署软件平台套件, 每个平台的设计是用来与用户在某一活动期间进行互动的, 例如浏览、聊天、社交、电子邮件、驱动等。 我们的知识尚未探索这种交换的经济和激励结构以及其算法性质。 我们将这种互动模式建为设计师和一个或多个代理商之间的Stackelberg游戏。 我们将一个代理商建为Markov 链的代理商, 其状态是活动; 我们假设该代理商的效用是该链中稳定状态分布的线性功能。 设计师可能为其中每一项活动/状态设计一个平台; 如果该代理商采用一个平台, 马尔科夫链的过渡概率性结构以及其算法性性质, 设计师的效用是固定状态的概率的直线性功能, 其它平台的开发成本。 代理商的潜在优化问题 -- 如何选择采用该平台的州 -- 是一个 MDP 。 如果这个 MDP 设计师拥有一个简单但可信的设计结构, 马尔科夫的过渡性结构, 其当前汇率的稳定性的稳定性的分布必须是从一个稳定状态向另一个方向。