In software, text is often represented using Unicode formats (UTF-8 and UTF-16). We frequently have to convert text from one format to the other, a process called transcoding. Popular transcoding functions are slower than state-of-the-art disks and networks. These transcoding functions make little use of the single-instruction-multiple-data (SIMD) instructions available on commodity processors. By designing transcoding algorithms for SIMD instructions, we multiply the speed of transcoding on current systems (x64 and ARM). To ensure reproducibility, we make our software freely available as an open-source library.