Entity Set Expansion is an important NLP task that aims to expand a small seed set of entities into a larger set using items drawn from a large pool of candidates. In this paper, we propose GausSetExpander, an unsupervised approach based on optimal transport techniques. We re-frame the problem as choosing the entity that best completes the seed set. To this end, we interpret a set as an elliptical distribution whose centroid is given by the mean and whose spread is given by the scale parameter. The best entity is the one that increases the spread of the set the least. We demonstrate the validity of our approach by comparing it to state-of-the-art approaches.
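The selection rule above (pick the candidate that increases the spread of the set the least) can be illustrated with a minimal sketch. This is not the optimal-transport formulation of GausSetExpander. It assumes each entity is a plain point embedding, and it uses the trace of the sample covariance as a simple stand-in for the scale parameter. The names `spread` and `best_completion` and the toy vectors are hypothetical.

```python
import numpy as np


def spread(points):
    """Total variance around the centroid (trace of the sample covariance).

    Assumption: this scalar stands in for the scale parameter of the
    elliptical distribution; the paper's actual measure may differ.
    """
    centered = points - points.mean(axis=0)
    return float(np.sum(centered ** 2) / len(points))


def best_completion(seed, candidates):
    """Index of the candidate whose addition increases the spread the least."""
    base = spread(seed)
    increases = [spread(np.vstack([seed, c])) - base for c in candidates]
    return int(np.argmin(increases))


# Toy 2-D embeddings (hypothetical data, for illustration only).
seed = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
candidates = np.array([[5.0, 5.0], [0.5, 0.5], [-3.0, 2.0]])

best_index = best_completion(seed, candidates)
print(best_index, candidates[best_index])
```

In this toy case the candidate lying inside the seed cluster is selected, because adding it does not enlarge the spread, whereas the two distant candidates inflate it considerably. Repeating the step with the chosen entity appended to the seed set would expand the set one entity at a time.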