In this paper, we propose a sequence-to-set method that transforms any sequence generative model trained by maximum likelihood into a set generative model under which the utility/probability of any set can be evaluated. We devise an efficient importance sampling algorithm to tackle the computational challenge of learning our sequence-to-set model. We present GRU2Set, an instance of our sequence-to-set method that employs the widely used GRU model as the sequence generative model. To further obtain permutation-invariant representations of sets, we devise the SetNN model, which is also an instance of the sequence-to-set method. A direct application of our models is learning an order/set distribution from a collection of e-commerce orders, an essential step in many important operational decisions such as inventory arrangement for fast delivery. Based on the intuition that small sets are usually easier to learn than large sets, we propose a size-bias trick that helps learn better set distributions with respect to the $\ell_1$-distance evaluation metric. We conduct extensive experiments on two e-commerce order datasets, TMALL and HKTVMALL, to show the effectiveness of our models. The experimental results demonstrate that our models learn better set/order distributions from order data than the baselines. Moreover, regardless of the model used, applying the size-bias trick improves the quality of the set distribution learned from data.
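To make the sequence-to-set idea concrete, the following is a minimal sketch, not the method of the paper. It assumes that the probability of a set is the sum of the sequence probabilities of all its orderings, which the abstract does not state explicitly. The transition table `TRANS` is a hypothetical toy stand-in for a trained sequence model such as a GRU, and the uniform proposal over permutations is only the simplest possible importance sampler; the proposal used by the actual algorithm is not specified here.

```python
import itertools
import math
import random

# Hypothetical toy sequence model: a first-order Markov chain over three items.
# It stands in for a trained sequence model (e.g. a GRU); the numbers are made up.
START, END = "<s>", "</s>"
TRANS = {
    START: {"a": 0.5, "b": 0.3, "c": 0.2, END: 0.0},
    "a": {"a": 0.1, "b": 0.4, "c": 0.2, END: 0.3},
    "b": {"a": 0.2, "b": 0.1, "c": 0.3, END: 0.4},
    "c": {"a": 0.3, "b": 0.2, "c": 0.1, END: 0.4},
}


def sequence_prob(seq):
    """Probability that the toy model generates exactly `seq`, then stops."""
    prob, prev = 1.0, START
    for token in list(seq) + [END]:
        prob *= TRANS[prev][token]
        prev = token
    return prob


def set_prob_exact(items):
    """Assumed set probability: sum of sequence probabilities over all orderings."""
    return sum(sequence_prob(p) for p in itertools.permutations(sorted(items)))


def set_prob_sampled(items, n_samples, rng):
    """Importance-sampling estimate with a uniform proposal q(perm) = 1 / n!.

    Each sampled ordering contributes sequence_prob(perm) / q(perm).
    """
    items = sorted(items)
    n_perms = math.factorial(len(items))
    total = 0.0
    for _ in range(n_samples):
        perm = rng.sample(items, len(items))  # uniform random ordering
        total += sequence_prob(perm) * n_perms
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    order = {"a", "b"}
    print("exact  :", set_prob_exact(order))
    print("sampled:", set_prob_sampled(order, 20000, rng))
```

Exact enumeration costs $n!$ model evaluations for a set of size $n$, which is why a sampling estimate becomes necessary for larger sets. By construction, the result does not depend on the order in which the items of the set are supplied.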