In large-scale multi-agent systems such as taxi fleets, individual agents (taxi drivers) are self-interested, each maximizing their own profit, and this can introduce inefficiencies into the system. One such inefficiency concerns the "required" availability of taxis at different times of the day. Since a taxi driver can work only a limited number of hours per day (e.g., 8-10 hours in a city like Singapore), the specific working hours need to be optimized so as to maximize both individual and social welfare. Technically, this corresponds to solving a large-scale multi-stage selfish routing game with transition uncertainty. Existing work on this problem is either unable to handle "driver" constraints (e.g., breaks during work hours) or not scalable. To that end, we provide a novel mechanism that builds on replicator dynamics through ideas from behavior cloning. We demonstrate that our methods provide significantly better policies than the existing approach in terms of improving individual agent revenue and overall agent availability.