Factorization Machines (FM), a general predictor that can efficiently model feature interactions in linear time, was originally proposed for collaborative recommendation and has been broadly used for regression, classification, and ranking tasks. Subspace Encoding Factorization Machine (SEFM) has been proposed recently to overcome the expressiveness limitation of FM by applying an explicit nonlinear feature mapping to both individual features and feature interactions through one-hot encoding of each input feature. Despite the effectiveness of SEFM, it increases the memory cost of FM by $b$ times, where $b$ is the number of bins used when applying one-hot encoding to each input feature. To reduce the memory cost of SEFM, we propose a new method called Binarized FM, which constrains the model parameters to be binary values (i.e., $1$ or $-1$), so that each parameter can be efficiently stored in a single bit. Our proposed method can significantly reduce the memory cost of the SEFM model. In addition, we propose a new algorithm to effectively and efficiently learn the proposed FM with binary constraints using the Straight-Through Estimator (STE) with Adaptive Gradient Descent (Adagrad). Finally, we evaluate the performance of our proposed method on eight different classification datasets. Our experimental results demonstrate that the proposed method achieves accuracy comparable to SEFM at a much lower memory cost.
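To make the training idea concrete, below is a minimal sketch (not the authors' code) of learning binary-constrained weights with a Straight-Through Estimator and Adagrad. The toy linear model, the squared-hinge objective, and all variable names (`w_real`, `grad_sq`, etc.) are illustrative assumptions standing in for the binarized FM parameters described above.

```python
import numpy as np

# Sketch: binary weights in the forward pass, real-valued latent weights
# updated by Adagrad, with the gradient passed "straight through" sign().
rng = np.random.default_rng(0)
n, d = 200, 16
X = rng.standard_normal((n, d))
y = np.sign(X @ np.sign(rng.standard_normal(d)))   # toy +/-1 labels

w_real = rng.standard_normal(d) * 0.01             # latent real-valued weights
grad_sq = np.zeros(d)                               # Adagrad accumulator
lr, eps = 0.1, 1e-8

for step in range(200):
    w_bin = np.sign(w_real)                         # forward pass uses binary weights
    w_bin[w_bin == 0] = 1.0
    margin = y * (X @ w_bin)
    mask = margin < 1
    # gradient of a squared-hinge loss w.r.t. the binary weights (assumed objective)
    grad = -(2.0 / n) * X[mask].T @ (y[mask] * (1 - margin[mask]))
    # STE: treat d(sign)/dw as identity and apply the gradient to w_real via Adagrad
    grad_sq += grad ** 2
    w_real -= lr * grad / (np.sqrt(grad_sq) + eps)

print("train accuracy:", np.mean(np.sign(X @ np.sign(w_real)) == y))
```

The key point the sketch illustrates is that only the sign of each latent weight is used at prediction time, which is what allows each parameter to be stored in one bit.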