Many signal processing and machine learning applications rely on evaluating a kernel on pairs of signals, e.g. to assess the similarity of an incoming query to a database of known signals. This nonlinear evaluation can be simplified to a linear inner product of the random Fourier features of those signals: random projections followed by a periodic map, the complex exponential. It is known that a simple quantization of those features (corresponding to replacing the complex exponential by a different periodic map that takes binary values, which is appealing for their transmission and storage) distorts the approximated kernel, which may be undesirable in practice. Our take-home message is that when the features of only one of the two signals are quantized, the original kernel is recovered without distortion; this is of practical interest in several cases where the kernel evaluations are asymmetric by nature, such as a client-server scheme. Concretely, we introduce the general framework of asymmetric random periodic features, where the two signals of interest are observed through random periodic features: random projections followed by a general periodic map, which may differ between the two signals. We derive the influence of those periodic maps on the approximated kernel, and prove uniform probabilistic error bounds holding for all signal pairs from an infinite low-complexity set. Interestingly, our results allow the periodic maps to be discontinuous, thanks to a new mathematical tool, the mean Lipschitz smoothness. We then apply this generic framework to semi-quantized kernel machines (where only one signal has quantized features and the other has classical random Fourier features), for which we show theoretically that the approximated kernel remains unchanged (with the associated error bound), and confirm the power of the approach with numerical simulations.
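The semi-quantized idea can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration, not the paper's implementation: a Gaussian kernel, real-valued cosine features with a uniform random dither `xi`, a one-bit map `sign(cos(t))`, and a rescaling by `pi/2` that compensates the first Fourier coefficient of that square wave. Under these assumptions, the product of one full-precision feature vector with one quantized feature vector should approach the exact kernel, while quantizing both sides should not.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2)))

def one_bit(t):
    """Binary periodic map (assumed): square wave sign(cos(t)) in {-1, +1}."""
    return np.sign(np.cos(t))

rng = np.random.default_rng(0)
d, m, sigma = 5, 100_000, 1.0              # dimension, number of features, bandwidth

# Random projections shared by both signals: Gaussian frequencies, uniform dither.
omega = rng.normal(scale=1.0 / sigma, size=(m, d))
xi = rng.uniform(0.0, 2.0 * np.pi, size=m)

# Two test signals at Euclidean distance 1.
x = np.zeros(d)
y = np.zeros(d)
y[0] = 1.0

px = omega @ x + xi                        # projections of x
py = omega @ y + xi                        # projections of y

k_true = gaussian_kernel(x, y, sigma)

# Classical random Fourier features (both sides full precision).
k_rff = 2.0 * np.mean(np.cos(px) * np.cos(py))

# Asymmetric / semi-quantized: x keeps cosine features, y is one-bit.
# The factor pi/2 undoes the square wave's first harmonic amplitude 4/pi.
k_semi = (np.pi / 2.0) * np.mean(np.cos(px) * one_bit(py))

# Fully quantized: both sides one-bit; the higher harmonics distort the kernel.
k_quant = np.mean(one_bit(px) * one_bit(py))

print(f"exact kernel    : {k_true:.4f}")
print(f"full RFF        : {k_rff:.4f}")
print(f"semi-quantized  : {k_semi:.4f}")
print(f"fully quantized : {k_quant:.4f}")
```

In this sketch the server could store only the one-bit features of its database signals, while a client sends full-precision features of its query; the estimate then still targets the original kernel, up to an error that shrinks as `m` grows.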