The problem is to evaluate a polynomial in several variables and its gradient at a power series truncated to some finite degree, using multiple double precision arithmetic. To offset the overhead of multiple double precision and power series arithmetic, data-parallel algorithms for general-purpose graphics processing units are presented. The reverse mode of algorithmic differentiation is organized into a massively parallel computation of many convolutions and additions of truncated power series. Experimental results demonstrate that teraflop performance is obtained in deca double precision with power series truncated at degree 152. The algorithms scale well with increasing precision and increasing degree.
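To make the role of convolutions concrete, the following is a minimal sequential sketch, not the data-parallel GPU algorithm itself. It evaluates a single monomial x_1 x_2 ... x_n and its gradient at truncated power series, using forward and backward products in the spirit of the reverse mode. The function names `convolve` and `monomial_value_and_gradient` are hypothetical, ordinary Python numbers stand in for multiple double precision coefficients, and the product ordering is an assumption chosen for clarity rather than for a minimal operation count.

```python
def convolve(x, y):
    """Multiply two power series given as coefficient lists of equal length,
    truncating the product at the same degree as the inputs."""
    size = len(x)
    z = [0] * size
    for k in range(size):
        for i in range(k + 1):
            z[k] += x[i] * y[k - i]
    return z


def monomial_value_and_gradient(series):
    """Value and gradient of x_1 * x_2 * ... * x_n, where each x_i is a
    truncated power series (hypothetical helper for illustration)."""
    n = len(series)
    size = len(series[0])

    # forward[i] holds the product x_1 * ... * x_(i+1)
    forward = [series[0]]
    for i in range(1, n):
        forward.append(convolve(forward[-1], series[i]))

    # backward[i] holds the product x_(i+1) * ... * x_n
    backward = [None] * n
    backward[n - 1] = series[n - 1]
    for i in range(n - 2, -1, -1):
        backward[i] = convolve(series[i], backward[i + 1])

    # the constant series 1, used when a variable has no left or right factors
    one = [1] + [0] * (size - 1)

    # the derivative with respect to x_(i+1) is the product of all other
    # variables: (x_1 ... x_i) * (x_(i+2) ... x_n)
    gradient = []
    for i in range(n):
        left = forward[i - 1] if i > 0 else one
        right = backward[i + 1] if i < n - 1 else one
        gradient.append(convolve(left, right))

    return forward[-1], gradient


# Three series truncated at degree 2: 1 + t, 1 + 2t, and 2 + t^2.
value, gradient = monomial_value_and_gradient([[1, 1, 0], [1, 2, 0], [2, 0, 1]])
print(value)
print(gradient)
```

Each call to `convolve` is independent of the other coefficients of the same product, and the additions that combine monomials into a polynomial are independent across coefficients as well, which is what makes such a computation a candidate for data-parallel execution.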