We study a dynamic allocation problem in which $T$ sequentially arriving divisible resources are to be allocated to a number of agents with linear utilities. The marginal utilities of each resource to the agents are drawn stochastically from a known joint distribution, independently and identically across time, and the central planner makes immediate and irrevocable allocation decisions. Most works on dynamic resource allocation aim to maximize the utilitarian welfare, i.e., the efficiency of the allocation, which may result in unfair concentration of resources on certain high-utility agents while leaving others' demands under-fulfilled. In this paper, aiming at balancing efficiency and fairness, we instead consider a broad collection of welfare metrics, the H\"older means, which includes the Nash social welfare and the egalitarian welfare. To this end, we first study a fluid-based policy derived from a deterministic surrogate to the underlying problem and show that for all smooth H\"older mean welfare metrics it attains an $O(1)$ regret over the time horizon length $T$ against the hindsight optimum, i.e., the optimal welfare if all utilities were known in advance of deciding on allocations. However, when evaluated under the non-smooth egalitarian welfare, the fluid-based policy attains a regret of order $\Theta(\sqrt{T})$. We then propose a new policy built thereupon, called Backward Infrequent Re-solving with Thresholding ($\mathsf{BIRT}$), which consists of re-solving the deterministic surrogate problem at most $O(\log\log T)$ times. We prove the $\mathsf{BIRT}$ policy attains an $O(1)$ regret against the hindsight optimal egalitarian welfare, independently of the time horizon length $T$. We conclude by presenting numerical experiments to corroborate our theoretical claims and to illustrate the significant performance improvement against several benchmark policies.
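For concreteness, the family of welfare metrics can be sketched with the standard definition of the H\"older (power) mean. The notation here is assumed for illustration and may differ from the paper's normalization: $n$ is the number of agents, $u_i > 0$ is the total utility accrued by agent $i$ over the horizon, and $p \le 1$ is the mean's exponent.

\[
M_p(u) = \Big( \frac{1}{n} \sum_{i=1}^{n} u_i^{\,p} \Big)^{1/p} \quad (p \neq 0), \qquad
\lim_{p \to 0} M_p(u) = \Big( \prod_{i=1}^{n} u_i \Big)^{1/n}, \qquad
\lim_{p \to -\infty} M_p(u) = \min_{1 \le i \le n} u_i .
\]

Under this convention, $p = 1$ gives the utilitarian welfare (up to the factor $1/n$), the limit $p \to 0$ gives the Nash social welfare, and the limit $p \to -\infty$ gives the egalitarian welfare. The last is a pointwise minimum and therefore non-smooth, which is the case the abstract singles out.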