Much research has addressed implementing the Viterbi decoding algorithm on GPUs rather than FPGAs, because GPUs offer considerable flexibility in addition to high performance. The Tensor cores recently introduced in modern GPU architectures provide substantial additional computing capability. This paper proposes a novel parallel implementation of the Viterbi decoding algorithm based on these Tensor cores. The proposed parallel algorithm is optimized to use the computing power of Tensor cores efficiently. Experiments show considerable throughput improvements over previous work.
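For readers unfamiliar with the decoder being accelerated, the following is a minimal sequential reference sketch of hard-decision Viterbi decoding. It is not the Tensor-core formulation proposed in the paper. The code parameters (constraint length 3, generators (7, 5) in octal, rate 1/2) and all function names are assumptions chosen for illustration only. The inner loop is the add-compare-select step that parallel implementations typically target.

```python
from typing import List, Sequence, Tuple

K = 3                          # constraint length (assumed for illustration)
GENERATORS = (0b111, 0b101)    # (7, 5) octal generator polynomials (assumed)
N_STATES = 1 << (K - 1)        # 4 trellis states


def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def branch_output(state: int, bit: int) -> List[int]:
    """Coded bits emitted when `bit` enters the encoder in `state`."""
    reg = (bit << (K - 1)) | state   # newest bit sits in the top position
    return [parity(reg & g) for g in GENERATORS]


def next_state(state: int, bit: int) -> int:
    """Shift the new bit in and drop the oldest one."""
    return ((bit << (K - 1)) | state) >> 1


def encode(bits: Sequence[int]) -> List[int]:
    """Rate-1/2 convolutional encoder starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        out.extend(branch_output(state, b))
        state = next_state(state, b)
    return out


def viterbi_decode(received: Sequence[int], n_bits: int) -> List[int]:
    """Hard-decision Viterbi decoding with Hamming branch metrics."""
    inf = float("inf")
    metrics = [0.0] + [inf] * (N_STATES - 1)   # encoder starts in state 0
    history: List[List[Tuple[int, int]]] = []
    n_out = len(GENERATORS)

    for t in range(n_bits):
        r = received[n_out * t: n_out * (t + 1)]
        new_metrics = [inf] * N_STATES
        survivors = [(0, 0)] * N_STATES        # (previous state, input bit)
        for s in range(N_STATES):
            if metrics[s] == inf:
                continue                       # state not reachable yet
            for b in (0, 1):
                # Add: path metric plus Hamming distance of this branch.
                out = branch_output(s, b)
                cand = metrics[s] + sum(o != x for o, x in zip(out, r))
                ns = next_state(s, b)
                # Compare and select: keep the better path into ns.
                if cand < new_metrics[ns]:
                    new_metrics[ns] = cand
                    survivors[ns] = (s, b)
        metrics = new_metrics
        history.append(survivors)

    # Trace back from the best final state.
    state = min(range(N_STATES), key=lambda s: metrics[s])
    decoded: List[int] = []
    for survivors in reversed(history):
        state, bit = survivors[state]
        decoded.append(bit)
    decoded.reverse()
    return decoded


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]   # last two zeros flush the encoder
    noisy = encode(message)
    noisy[3] ^= 1                              # inject a single bit error
    print(viterbi_decode(noisy, len(message)) == message)
```

The per-step update over all states is independent across destination states, which is the property that makes the algorithm a candidate for parallel hardware in general; how the paper maps it onto Tensor cores is not shown here.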