Team formation is ubiquitous in many sectors: education, labor markets, sports, etc. A team's success depends on its members' latent types, which are not directly observable but can be (partially) inferred from past performances. From the viewpoint of a principal trying to select teams, this leads to a natural exploration-exploitation trade-off: retain successful teams that are discovered early, or reassign agents to learn more about their types? We study a natural model for online team formation, where a principal repeatedly partitions a group of agents into teams. Agents have binary latent types, each team comprises two members, and a team's performance is a symmetric function of its members' types. Over multiple rounds, the principal selects matchings over agents and incurs regret equal to the deficit in the number of successful teams versus the optimal matching for the given function. Our work provides a complete characterization of the regret landscape for all symmetric functions of two binary inputs. In particular, we develop team-selection policies that, despite being agnostic of model parameters, achieve optimal or near-optimal regret against an adaptive adversary.
翻译:团队的组建在许多部门是无处不在的: 教育、劳动力市场、体育等。 团队的成功取决于其成员的潜在类型,这些类型不是直接可见的,但可以从过去的表现中(部分)推断出来。 从一位主要尝试选择团队的角度来看,这会导致自然的探索-开发交易: 保留早期发现的成功团队, 或者重新指派代理商来了解其类型; 我们研究在线团队组建的自然模式, 其中主要的代理商反复将一组代理商分成团队。 代理商具有双向潜在类型, 每个团队由两名成员组成, 团队的绩效是其成员类型的一个对称功能。 在多轮中, 主要选择匹配代理商, 并产生与成功团队数量不足和给定功能的最佳匹配相等的遗憾。 我们的工作为两种二进制投入的所有对称功能提供了完整的遗憾地貌特征描述。 特别是, 我们开发团队选择政策, 尽管对模型参数有分辨, 但要达到最佳或接近最佳的遗憾, 而不是适应性。