This work demonstrates a hardware-efficient support vector machine (SVM) training algorithm based on the alternating direction method of multipliers (ADMM) optimizer. Low-rank approximation via the Nystr\"{o}m method is exploited to reduce the dimension of the kernel matrix. Verified on four datasets, the proposed ADMM-based training algorithm with rank approximation reduces the matrix dimension by 32$\times$ with only a 2% drop in inference accuracy. Compared to the conventional sequential minimal optimization (SMO) algorithm, the ADMM-based training algorithm achieves a 9.8$\times$10$^7\times$ shorter latency for training 2048 samples. Hardware design techniques, including pre-computation and memory sharing, are proposed to reduce the computational complexity by 62% and the memory usage by 60%. As a proof of concept, an epileptic seizure detector chip is designed to demonstrate the effectiveness of the proposed hardware-efficient training algorithm. The chip achieves a 153,310$\times$ higher energy efficiency and a 364$\times$ higher throughput-to-area ratio for SVM training than a high-end CPU. This work provides a promising solution for edge devices that require low-power, real-time training.
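The Nystr\"{o}m idea referenced above can be sketched as follows: sample $m$ landmark columns of the $n \times n$ kernel matrix and approximate $K \approx C W^{+} C^{T}$, where $C$ is the kernel between all samples and the landmarks and $W$ is the kernel among the landmarks. A minimal NumPy illustration, with the RBF kernel, landmark count, and synthetic data chosen here purely for demonstration (the paper's actual kernel, datasets, and rank settings are not reproduced):

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Pairwise RBF kernel between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
n, m = 256, 8                      # n samples, rank-m approximation (32x dimension reduction)
X = rng.standard_normal((n, 4))    # synthetic data for illustration only

# Nystrom approximation: K ~= C W^+ C^T
idx = rng.choice(n, size=m, replace=False)  # landmark sample indices
L = X[idx]
C = rbf_kernel(X, L)               # (n, m) cross-kernel
W = rbf_kernel(L, L)               # (m, m) landmark kernel
K_approx = C @ np.linalg.pinv(W) @ C.T

# Relative error against the full kernel (only for checking the sketch;
# a hardware implementation would never form the full n x n matrix).
K_full = rbf_kernel(X, X)
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)
```

The point of the factorization is that downstream computations can work with the $n \times m$ factor $C$ instead of the full $n \times n$ kernel, which is where the claimed 32$\times$ reduction in matrix dimension comes from.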