On e-commerce platforms, predicting whether two products are compatible with each other is an important capability for delivering trustworthy product recommendation and search experiences to consumers. However, accurately predicting product compatibility is difficult because product data is heterogeneous and manually curated training data is lacking. We study the problem of discovering effective labeling rules that enable weakly-supervised product compatibility prediction. We develop AMRule, a multi-view rule discovery framework that can (1) adaptively and iteratively discover novel rules that complement the current weakly-supervised model to improve compatibility prediction; and (2) discover interpretable rules from both structured attribute tables and unstructured product descriptions. AMRule adaptively discovers labeling rules from large-error instances via a boosting-style strategy; the resulting high-quality rules remedy the current model's weak spots and refine the model iteratively. For rule discovery from structured product attributes, we generate composable high-order rules from decision trees; for rule discovery from unstructured product descriptions, we generate prompt-based rules from a pre-trained language model. Experiments on 4 real-world datasets show that AMRule outperforms the baselines by 5.98% on average and improves rule quality and rule proposal efficiency.
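To make the boosting-style idea concrete, the following is a minimal sketch, not AMRule's actual implementation. It assumes a small labeled set of product pairs, boolean attribute-match features (the names `same_brand` and `same_connector` are hypothetical), and an exponential weighting of the current model's error. In place of the decision trees used for structured attributes, it simply enumerates conjunctions of attribute predicates taken from the large-error instances, and it keeps a candidate rule only if it is precise enough on the labeled set.

```python
import math
from itertools import combinations


def boosting_weights(labels, probs):
    """Weight each instance by how wrong the current model is on it.

    labels are 0/1 (incompatible/compatible); probs are the model's
    predicted probabilities of "compatible". The exponential form is an
    assumption made for this sketch.
    """
    errors = [abs(y - p) for y, p in zip(labels, probs)]
    raw = [math.exp(e) for e in errors]
    total = sum(raw)
    return [w / total for w in raw]


def large_error_instances(weights, k):
    """Return the indices of the k highest-weight (largest-error) instances."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return ranked[:k]


def propose_rules(features, labels, hard, max_order=2,
                  min_support=2, min_precision=0.9):
    """Propose conjunctive rules seeded by the large-error instances.

    Each rule maps a tuple of (attribute, value) predicates to a label.
    Conjunctions of up to max_order predicates stand in for the
    high-order rules that a decision tree path would provide.
    """
    candidates = set()
    for i in hard:
        predicates = sorted(features[i].items())
        for order in range(1, max_order + 1):
            candidates.update(combinations(predicates, order))

    rules = {}
    for condition in candidates:
        covered = [y for x, y in zip(features, labels)
                   if all(x.get(name) == value for name, value in condition)]
        if len(covered) < min_support:
            continue
        positives = sum(covered)
        label = 1 if 2 * positives >= len(covered) else 0
        precision = max(positives, len(covered) - positives) / len(covered)
        if precision >= min_precision:
            rules[condition] = label
    return rules


# Toy data (hypothetical): six product pairs with two match features.
features = [
    {"same_brand": True, "same_connector": True},
    {"same_brand": True, "same_connector": False},
    {"same_brand": False, "same_connector": True},
    {"same_brand": False, "same_connector": False},
    {"same_brand": True, "same_connector": True},
    {"same_brand": False, "same_connector": False},
]
labels = [1, 0, 1, 0, 1, 0]
# Current model output: badly wrong on pairs 1 and 2.
probs = [0.9, 0.8, 0.2, 0.1, 0.7, 0.3]

weights = boosting_weights(labels, probs)
hard = large_error_instances(weights, k=2)
rules = propose_rules(features, labels, hard)
print(hard)
print(rules)
```

In an iterative setting, the accepted rules would be used to label more pairs, the model would be retrained, and the weights recomputed, so that each round targets whatever the current model still gets wrong.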