In industry, feature selection is a standard and necessary step in searching for an optimal set of informative feature fields for efficient and effective training of deep Click-Through Rate (CTR) models. Most previous works measure the importance of feature fields by their corresponding continuous weights in the model, and then remove the feature fields with small weights. However, removing many features whose weights are small but not exactly zero inevitably hurts model performance and is unfriendly to hot-start model training. There is also no theoretical guarantee that the magnitude of a weight represents its importance, so these methods may lead to sub-optimal results. To tackle this problem, we propose a novel Learnable Polarizing Feature Selection (LPFS) method based on a smoothed-$\ell^0$ function from the literature. We further extend LPFS to LPFS++ with a newly designed smoothed-$\ell^0$-like function that selects a more informative subset of features. LPFS and LPFS++ act as gates inserted at the input of the deep network to control the active or inactive state of each feature. When training finishes, some gates are exactly zero while the others are around one; this property is particularly favored by practical hot-start training in industry, since removing the features whose gates are exactly zero does not change model performance. Experiments show that our methods outperform others by a clear margin, and they have achieved strong A/B test results at KuaiShou Technology.
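To make the gating idea concrete, the following is a minimal sketch of a smoothed-$\ell^0$-style gate. The specific form $g(x) = x^2 / (x^2 + \varepsilon)$ and the threshold used here are illustrative assumptions, not necessarily the exact functions proposed in the paper; the sketch only shows how such a gate polarizes toward exactly zero for $x = 0$ and toward one for large $|x|$, so that zero-gated feature fields can be dropped without changing the model's output.

```python
import numpy as np

def smoothed_l0_gate(x, eps=1e-2):
    """Smoothed-l0-style gate: exactly 0 at x = 0, approaching 1 as |x| grows.
    Illustrative form x^2 / (x^2 + eps); the exact function in LPFS may differ."""
    x = np.asarray(x, dtype=float)
    return x ** 2 / (x ** 2 + eps)

# One learnable scalar per feature field (values here are hypothetical).
x = np.array([0.0, 0.05, 1.0, -2.0])
gates = smoothed_l0_gate(x)

# Each field's embedding would be multiplied by its gate at the network input.
# Fields whose gate is exactly zero can be removed with no performance change;
# the 0.5 cut-off below is an arbitrary choice for this sketch.
active = gates > 0.5
```

For example, the first field (gate parameter 0) receives a gate of exactly 0 and is inactive, while the last two fields receive gates close to 1 and remain active.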