In this paper, we consider a Linear Program (LP)-based online resource allocation problem where a decision maker accepts or rejects incoming customer requests irrevocably in order to maximize expected revenue given limited resources. At each time, a new order/customer/bid is revealed with a request of some resource(s) and a reward. We consider a stochastic setting where all the orders are i.i.d. sampled from an unknown distribution. Such formulation gives rise to many classic applications such as the canonical (quantity-based) network revenue management problem and the Adwords problem. Instead of focusing only on regret minimization, this paper aims to provide fairness guarantees while maintaining low regret. Our definition of fairness is that a fair online algorithm should treat similar agents/customers similarly and the decision made for similar individuals should be consistent over time. We define the fair offline solution as the analytic center of the offline optimal solution set, and define \textit{cumulative unfairness} as the cumulative deviation from the online solutions to the fair offline solution. We propose a fair algorithm that uses an interior-point LP solver and dynamically detects unfair resource spending. Our algorithm can control cumulative unfairness on the scale of order $O(\log(T))$, while maintaining the regret to be bounded without dependency on $T$. Moreover, we partially remove the nondegeneracy assumptions used in early results in the literature. This paper only requires the nondegeneracy condition for the binding constraints, and allows the existence of multiple optimal solutions.
翻译:在本文中, 我们考虑一个基于线性程序( LP) 的在线资源分配问题, 即决策者接受或拒绝来港客户的要求不可撤销地接受或拒绝, 以便根据有限的资源实现预期收入最大化。 每次, 都会披露一个新的订单/ 客户/ 投标, 并有某些资源的要求和奖赏。 我们考虑一个随机的设置, 所有的订单都来自未知的分布样本。 这种配方产生了许多经典应用程序, 如 Canical( 数量基) 网络收入管理问题和 Adwords 问题。 本文的目的不是仅仅侧重于最小化, 而是提供公平性保证, 同时保持低度的遗憾。 我们的公平定义是, 公平在线算法应该同样对待类似的代理商/ 客户, 为类似个人做出的决定应该随着时间的推移一致。 我们定义的公平离线性解决方案是离线最佳解决方案的解析中心, 并且将 纯度 { 累积性 { 不公平 { { 不公平 } 定义为从在线解决方案到公平离线性解决方案的累积性 。 我们提议一个公平的算法, 使用不拘谨度 。