Choices made by individuals have widespread impacts: for instance, people choose between political candidates to vote for, between social media posts to share, and between brands to purchase. Moreover, data on these choices are increasingly abundant. Discrete choice models are a key tool for learning individual preferences from such data. Additionally, social factors like conformity and contagion influence individual choice. Existing methods for incorporating these factors into choice models do not account for the entire social network and require hand-crafted features. To overcome these limitations, we use graph learning to study choice in networked contexts. We identify three ways in which graph learning techniques can be used for discrete choice: learning chooser representations, regularizing choice model parameters, and directly constructing predictions from a network. We design methods in each category and test them on real-world choice datasets, including county-level 2016 US election results and Android app installation and usage data. We show that incorporating social network structure can improve the predictions of the standard econometric choice model, the multinomial logit. We provide evidence that app installations are influenced by social context, but we find no such effect on app usage among the same participants, which is instead habit-driven. In the election data, we highlight the additional insights that a discrete choice framework provides over the typical approaches, classification and regression. On synthetic data, we demonstrate the sample complexity benefit of using social information in choice models.
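To make the second category (regularizing choice model parameters) concrete, the following is a minimal sketch of a multinomial logit with per-chooser utilities and a graph Laplacian penalty that pulls the parameters of neighboring choosers together. The abstract does not specify the regularizer, so the Laplacian penalty, the function names (`mnl_probs`, `laplacian_penalty`, `regularized_nll`), the toy three-chooser path network, and the weight `lam` are all illustrative assumptions rather than the method evaluated here.

```python
import numpy as np

def mnl_probs(utilities, choice_set):
    """Multinomial logit: softmax of utilities restricted to the offered items."""
    u = utilities[np.asarray(choice_set)]
    u = u - u.max()  # shift for numerical stability; probabilities are unchanged
    expu = np.exp(u)
    return expu / expu.sum()

def laplacian_penalty(theta, adjacency):
    """trace(theta^T L theta): squared parameter differences summed over edges."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    return float(np.trace(theta.T @ laplacian @ theta))

def regularized_nll(theta, adjacency, observations, lam):
    """Negative log-likelihood of per-chooser MNL plus Laplacian smoothing.

    observations: list of (chooser index, offered item indices, chosen item).
    """
    nll = 0.0
    for chooser, choice_set, chosen in observations:
        probs = mnl_probs(theta[chooser], choice_set)
        nll -= np.log(probs[choice_set.index(chosen)])
    return nll + lam * laplacian_penalty(theta, adjacency)

# Hypothetical toy data: three choosers on a path network 0 - 1 - 2, two items.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]], dtype=float)
theta = np.array([[1.0, 0.0],   # chooser 0 prefers item 0
                  [1.0, 0.0],   # chooser 1 agrees with chooser 0
                  [0.0, 1.0]])  # chooser 2 prefers item 1
observations = [(0, [0, 1], 0), (2, [0, 1], 1)]

loss = regularized_nll(theta, adjacency, observations, lam=0.5)
```

In this toy setup chooser 1 has no observed choices, so only the penalty term ties its utilities to those of its neighbors. That is the intuition behind the sample complexity benefit of social information mentioned above, under the assumption that connected choosers have similar preferences.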