We study online dynamic pricing under two types of fairness constraints: "procedural fairness", which requires the proposed prices to be equal in expectation across groups, and "substantive fairness", which requires the accepted prices to be equal in expectation across groups. A policy that is both procedurally and substantively fair is called "doubly fair". We show that a doubly fair policy must be random to earn higher revenue than the best trivial policy, which assigns the same price to every group. In a two-group setting, we propose an online learning algorithm that achieves $\tilde{O}(\sqrt{T})$ regret, zero procedural unfairness, and $\tilde{O}(\sqrt{T})$ substantive unfairness over $T$ rounds of learning. We also prove two lower bounds showing that these results on regret and unfairness are both information-theoretically optimal up to iterated logarithmic factors. To the best of our knowledge, this is the first dynamic pricing algorithm that learns to price while satisfying two fairness constraints at the same time.
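The two fairness notions can be made concrete with a small sketch. It is illustrative only and is not the algorithm proposed in this work. It assumes a pricing policy is a finite distribution over prices for each group, that each group has a known acceptance probability as a function of price, and that "accepted prices equal in expectation" means the expected price conditional on a sale. All names (`Policy`, `expected_proposed_price`, `expected_accepted_price`, `unfairness`) are hypothetical.

```python
from typing import Callable, Dict, Tuple

# Hypothetical representation: a random pricing policy for one group,
# mapping each candidate price to the probability of proposing it.
Policy = Dict[float, float]


def expected_proposed_price(policy: Policy) -> float:
    """Expected price offered to a group (used for procedural fairness)."""
    return sum(price * prob for price, prob in policy.items())


def expected_accepted_price(policy: Policy,
                            accept_prob: Callable[[float], float]) -> float:
    """Expected price conditional on the customer accepting it
    (assumed reading of substantive fairness)."""
    sale_prob = sum(prob * accept_prob(price) for price, prob in policy.items())
    if sale_prob == 0:
        raise ValueError("policy never results in a sale")
    paid = sum(price * prob * accept_prob(price) for price, prob in policy.items())
    return paid / sale_prob


def unfairness(policy_1: Policy, policy_2: Policy,
               accept_1: Callable[[float], float],
               accept_2: Callable[[float], float]) -> Tuple[float, float]:
    """Return (procedural gap, substantive gap) between two groups."""
    procedural = abs(expected_proposed_price(policy_1)
                     - expected_proposed_price(policy_2))
    substantive = abs(expected_accepted_price(policy_1, accept_1)
                      - expected_accepted_price(policy_2, accept_2))
    return procedural, substantive


# Assumed linear demand curves, chosen only for illustration.
def accept_group_1(price: float) -> float:
    return 1.0 - price


def accept_group_2(price: float) -> float:
    return 1.0 - 0.5 * price
```

Under these assumptions, the trivial policy that proposes one shared price to both groups, for example `{0.5: 1.0}`, has zero gap on both measures. Giving the groups different deterministic prices breaks procedural fairness immediately, which is consistent with the claim that only random policies can be doubly fair while earning more than the trivial policy.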