Softmax is typically used in the final layer of a neural network to produce a probability distribution over the output classes. Its main drawback is that it is computationally expensive on large-scale datasets with a very large number of possible outputs. To approximate class probabilities efficiently on such datasets, Hierarchical Softmax can be used. We study the performance of Hierarchical Softmax on the LSHTC datasets, which contain a large number of categories. In this paper we evaluate and report the performance of standard Softmax versus Hierarchical Softmax on the LSHTC datasets, using the macro F1 score as the performance measure. We observe that the performance of Hierarchical Softmax degrades as the number of classes increases.
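To make the cost comparison concrete, the following is a minimal sketch of the two formulations; the notation (logits $z_j$, node parameters $v_n$, branch signs $d_n$) is introduced here for illustration and is not taken from the paper. Standard Softmax computes
$$ P(y = i \mid x) = \frac{\exp(z_i)}{\sum_{j=1}^{K} \exp(z_j)}, $$
which requires evaluating all $K$ logits, i.e. $O(K)$ work per example. Hierarchical Softmax instead places the $K$ classes at the leaves of a binary tree and factorizes the class probability along the root-to-leaf path,
$$ P(y = i \mid x) = \prod_{n \in \mathrm{path}(i)} \sigma\!\left(d_n \, v_n^{\top} x\right), $$
where $\mathrm{path}(i)$ is the set of internal nodes on the path to class $i$, $d_n \in \{+1, -1\}$ encodes the branch taken at node $n$, $v_n$ is that node's parameter vector, and $\sigma$ is the logistic sigmoid. For a balanced tree this reduces the per-example cost to $O(\log K)$.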