Learning neural set functions is becoming increasingly important in applications such as product recommendation and compound selection in AI-aided drug discovery. Most existing work studies set function learning under the function value oracle, which requires expensive supervision signals. This makes such methods impractical for applications where only weak supervision is available under the Optimal Subset (OS) oracle, a setting that has been surprisingly overlooked. In this work, we present a principled yet practical maximum likelihood learning framework, termed EquiVSet, that simultaneously meets the following desiderata of learning set functions under the OS oracle: i) permutation invariance of the set mass function being modeled; ii) support for varying ground sets; iii) minimum prior; and iv) scalability. The main components of our framework are an energy-based treatment of the set mass function, DeepSet-style architectures to handle permutation invariance, mean-field variational inference, and its amortized variants. By combining these components, EquiVSet outperforms the baselines by a large margin in empirical studies on three real-world applications: Amazon product recommendation, set anomaly detection, and compound selection for virtual screening.
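To make the first two components concrete, the following is a minimal sketch, not the EquiVSet implementation. It assumes a toy DeepSet-style scorer F(S) = rho(sum of phi(x_i) over i in S) with random, untrained weights, and an energy-based set mass function p(S) proportional to exp(F(S)), normalized by brute-force enumeration over a tiny ground set. The names `make_deepset`, `set_mass`, and `ground_set`, the layer sizes, and the feature values are all hypothetical.

```python
import itertools
import math
import random


def make_deepset(dim_in, dim_hidden, seed=0):
    """Build a toy scorer F(S) = rho(sum_{i in S} phi(x_i)).

    Weights are random placeholders (an assumption for illustration);
    a real model would learn them.
    """
    rng = random.Random(seed)
    w_phi = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(dim_hidden)]
    w_rho = [rng.gauss(0, 1) for _ in range(dim_hidden)]

    def phi(x):
        # Per-element embedding: one tanh unit per hidden dimension.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w_phi]

    def score(subset_features):
        # Sum pooling makes the score independent of element order.
        pooled = [0.0] * dim_hidden
        for x in subset_features:
            for k, v in enumerate(phi(x)):
                pooled[k] += v
        # rho: a linear readout of the squashed pooled embedding.
        return sum(w * math.tanh(p) for w, p in zip(w_rho, pooled))

    return score


def set_mass(score, ground_set):
    """Energy-based mass p(S) = exp(F(S)) / Z over all subsets of ground_set.

    Enumerates all 2^n subsets, so it is only usable for very small n.
    """
    n = len(ground_set)
    subsets = [c for r in range(n + 1) for c in itertools.combinations(range(n), r)]
    logits = [score([ground_set[i] for i in s]) for s in subsets]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return {s: w / z for s, w in zip(subsets, weights)}


# Hypothetical ground set of four items with 2-d features.
ground_set = [[0.1, 0.2], [0.5, -0.3], [-0.7, 0.9], [0.4, 0.4]]
score = make_deepset(dim_in=2, dim_hidden=3)
probs = set_mass(score, ground_set)
```

Because the normalizer sums over every subset, its cost doubles with each item added to the ground set. Exact enumeration is therefore feasible only for toy sizes, which is presumably why the framework relies on mean-field variational inference and its amortized variants rather than exact normalization.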