Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this paper, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network's skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware-efficient implementation with minimal to no loss in accuracy. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network's skip connections to lower their hardware cost. The optimized hardware designs improve resource utilization by up to 34% for BRAMs, 13% for FFs, and 16% for LUTs.
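To illustrate the idea of gradually removing a trained network's skip connections, the sketch below scales each residual block's skip branch by a factor that is annealed from 1 to 0 during fine-tuning. This is a minimal, hypothetical illustration of the general principle, not Tailor's actual algorithm or API; the class and function names (`ResidualBlockWithDecayingSkip`, `anneal_skips`, `alpha`) are assumptions introduced here for exposition.

```python
import torch
import torch.nn as nn

class ResidualBlockWithDecayingSkip(nn.Module):
    """Residual block whose skip connection is scaled by a factor alpha.

    Annealing alpha from 1.0 to 0.0 during fine-tuning gradually removes
    the skip connection, so the final model no longer needs the extra
    skip buffer in hardware. (Illustrative sketch, not Tailor's method.)
    """
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # alpha = 1.0 keeps the standard skip connection; 0.0 removes it.
        self.register_buffer("alpha", torch.tensor(1.0))

    def set_alpha(self, value: float) -> None:
        self.alpha.fill_(value)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.alpha * x)


def anneal_skips(model: nn.Module, epoch: int, total_epochs: int) -> None:
    """Linearly fade the skip connections out over the fine-tuning run."""
    alpha = max(0.0, 1.0 - epoch / total_epochs)
    for m in model.modules():
        if isinstance(m, ResidualBlockWithDecayingSkip):
            m.set_alpha(alpha)
```

In such a scheme, `anneal_skips` would be called once per fine-tuning epoch; after the final epoch the skip term contributes nothing and can be dropped from the hardware datapath entirely.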