The eigenvalue decomposition (EVD) of (a batch of) Hermitian matrices of order two has a role in many numerical algorithms, of which the one-sided Jacobi method for the singular value decomposition (SVD) is the prime example. In this paper the batched EVD is vectorized, with a vector-friendly data layout and the AVX-512 SIMD instructions of Intel CPUs, alongside other key components of a real and a complex OpenMP-parallel Jacobi-type SVD method, inspired by the sequential xGESVJ routines from LAPACK. These vectorized building blocks should be portable to other platforms, with unconditional reproducibility guaranteed for the batched EVD and several others. No avoidable overflow of the results can occur with the proposed EVD or SVD. The measured accuracy of the proposed EVD often surpasses that of the xLAEV2 routines from LAPACK. While the batched EVD outperforms the matching sequence of xLAEV2 calls, speedup of the parallel SVD is modest but can be improved and is already beneficial with enough threads. Regardless of their number, the proposed SVD method gives identical results, but of somewhat lower accuracy than xGESVJ.
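As a rough illustration of the kind of computation being batched, the sketch below diagonalizes Hermitian matrices of order two with a phase factor followed by a real Jacobi rotation. It is a textbook formulation, not the paper's algorithm: it does no scaling to prevent overflow, uses no AVX-512 instructions, and makes no reproducibility claim. The function names `evd2` and `batched_evd2`, and the layout with separate sequences for the (1,1), (2,2) and (1,2) entries, are assumptions made for this example only.

```python
import math


def evd2(a11, a22, a12):
    """EVD of one Hermitian matrix A = [[a11, a12], [conj(a12), a22]].

    Returns (lam1, lam2, U) with A = U * diag(lam1, lam2) * U^H.
    Illustrative only: no scaling is applied, so large inputs can overflow.
    """
    beta = abs(a12)
    # e^{-i*phi}, where a12 = beta * e^{i*phi}; taken as 1 when a12 == 0.
    phase = (a12.conjugate() / beta) if beta > 0.0 else 1.0 + 0.0j
    # Rotation angle that zeroes the off-diagonal of [[a11, beta], [beta, a22]].
    theta = 0.5 * math.atan2(2.0 * beta, a11 - a22)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = cs * cs * a11 + 2.0 * cs * sn * beta + sn * sn * a22
    lam2 = sn * sn * a11 - 2.0 * cs * sn * beta + cs * cs * a22
    # U = diag(1, e^{-i*phi}) * [[cs, -sn], [sn, cs]]
    u = ((complex(cs), complex(-sn)), (phase * sn, phase * cs))
    return lam1, lam2, u


def batched_evd2(a11s, a22s, a12s):
    """Apply evd2 to a batch stored as three parallel sequences.

    The Python loop stands in for what a SIMD version would do across
    vector lanes; each matrix is processed independently of the others.
    """
    return [evd2(a, c, complex(b)) for a, c, b in zip(a11s, a22s, a12s)]
```

For instance, `batched_evd2([2.0], [2.0], [1j])` gives eigenvalues close to 3 and 1 for the matrix [[2, i], [-i, 2]]. Keeping each matrix entry in its own contiguous sequence is one plausible reading of a "vector-friendly" layout, since one vector load would then fetch the same entry from several matrices at once.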