Semantic search has recently been applied successfully to e-commerce product search, and the learned semantic space(s) for encoding queries and products are expected to generalize to unseen queries or products. Yet whether such generalization emerges readily has not been thoroughly studied in this domain. In this paper, we examine several general-domain and domain-specific pre-trained RoBERTa variants and find that general-domain fine-tuning does not help generalization, which aligns with the findings of prior work. Proper domain-specific fine-tuning with clickstream data, by contrast, can lead to better model generalization, based on a bucketed analysis of publicly available, manually annotated query-product pair data.