This paper focuses on ParallelFor, one of the most frequently used multithreading library interfaces. The study infers that ParallelFor's end-to-end latency is noticeably affected by how often the atomic fetch-and-add (FAA) operation is invoked during program execution. This can be explained by ParallelFor's uniform semantics and its reliance on atomic FAA. To test this hypothesis, a battery of tests was designed and run on diverse platforms. From the collected performance statistics and overall trends, several conclusions are drawn, and a cost model is proposed to improve performance by mitigating the influence of FAA.
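To make the relationship between ParallelFor and FAA frequency concrete, the following is a minimal sketch, not the paper's implementation or its cost model. It assumes a common design in which worker threads claim blocks of iterations from a shared counter using an atomic fetch-and-add. The names `ParallelForSketch` and `block_size` are hypothetical and chosen only for illustration. Under this assumption, the number of FAA calls on the shared counter is the number of blocks plus one terminating claim per thread, so a larger block size means fewer FAA calls.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical ParallelFor sketch (an assumption, not the paper's code):
// workers claim `block_size` iterations at a time from a shared counter
// using an atomic fetch-and-add (FAA).
// Returns how many FAA calls were issued on the shared work counter.
std::size_t ParallelForSketch(std::size_t n, std::size_t block_size,
                              unsigned num_threads,
                              const std::function<void(std::size_t)>& body) {
  if (block_size == 0) block_size = 1;   // guard against a zero stride
  std::atomic<std::size_t> next{0};      // shared iteration counter
  std::atomic<std::size_t> faa_calls{0}; // instrumentation only

  auto worker = [&]() {
    std::size_t local_calls = 0;  // counted locally to avoid extra contention
    for (;;) {
      // One FAA per claimed block; this is the operation whose frequency
      // the abstract relates to end-to-end latency.
      std::size_t begin = next.fetch_add(block_size, std::memory_order_relaxed);
      ++local_calls;
      if (begin >= n) break;  // the terminating claim also costs one FAA
      std::size_t end = std::min(begin + block_size, n);
      for (std::size_t i = begin; i < end; ++i) body(i);
    }
    faa_calls.fetch_add(local_calls, std::memory_order_relaxed);
  };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < num_threads; ++t) threads.emplace_back(worker);
  for (auto& th : threads) th.join();
  return faa_calls.load();
}
```

With `n = 1000` iterations and 4 threads, a block size of 1 issues 1000 + 4 FAA calls on the work counter in this sketch, while a block size of 64 issues 16 + 4. Choosing the block size therefore trades FAA frequency against load balance, which is the kind of trade-off a cost model could capture.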