The Discrete Periodic Radon Transform (DPRT) has been extensively used in applications that involve image reconstruction from projections. This manuscript introduces a fast and scalable approach for computing the forward and inverse DPRT that is based on: (i) a parallel array of fixed-point adder trees, (ii) circular shift registers that remove the need to access external memory components when selecting the input data for the adder trees, (iii) an image block-based approach to DPRT computation that can fit the proposed architecture to the available resources, and (iv) fast transpositions that are computed in one or a few clock cycles, independent of the size of the input image. As a result, for an $N\times N$ image ($N$ prime), the proposed approach can compute up to $N^{2}$ additions per clock cycle. Compared to previous approaches, the scalable approach provides the fastest known implementations for different amounts of computational resources. For example, for a $251\times 251$ image, the scalable DPRT is computed 36 times faster than a systolic implementation while using approximately $25\%$ fewer flip-flops. For the fastest case, we introduce optimized architectures that can compute the DPRT and its inverse in just $2N+\left\lceil \log_{2}N\right\rceil+1$ and $2N+3\left\lceil \log_{2}N\right\rceil+B+2$ cycles, respectively, where $B$ is the number of bits used to represent each input pixel. On the other hand, the scalable DPRT approach requires more 1-bit additions than the systolic implementation, and thus provides a trade-off between speed and additional 1-bit additions. All of the proposed DPRT architectures were implemented in VHDL and validated using an FPGA implementation.
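For readers unfamiliar with the transform, the following is a minimal software sketch of the forward and inverse DPRT for an $N\times N$ image with $N$ prime. It assumes the commonly used definition, in which projection $m$ sums pixels along wrapped lines $j = \langle d + m\,i\rangle_N$ and the last projection holds the row sums. It is a plain reference model, not the adder-tree architecture described above. The function names `dprt`, `idprt`, and `fastest_cycle_counts` are hypothetical and chosen here for illustration only; the last one simply evaluates the two cycle-count formulas quoted in the abstract.

```python
import math


def dprt(f):
    """Forward DPRT of an N x N integer image f (N prime), as (N+1) x N sums.

    Assumed definition: R[m][d] = sum_i f[i][(d + m*i) mod N] for m < N,
    and R[N][d] = sum_j f[d][j].
    """
    n = len(f)
    # Projections m = 0..N-1: sums along wrapped lines of "slope" m.
    r = [[sum(f[i][(d + m * i) % n] for i in range(n)) for d in range(n)]
         for m in range(n)]
    # Projection m = N: plain row sums.
    r.append([sum(f[d]) for d in range(n)])
    return r


def idprt(r):
    """Inverse DPRT: recover the N x N image from its N+1 projections."""
    n = len(r) - 1
    s = sum(r[0])  # total pixel sum; every projection has the same total
    # For prime N the bracketed term equals N * f[i][j], so // is exact.
    return [[(sum(r[m][(j - m * i) % n] for m in range(n)) - s + r[n][i]) // n
             for j in range(n)]
            for i in range(n)]


def fastest_cycle_counts(n, b):
    """Evaluate the abstract's cycle-count formulas for the fastest case.

    Returns (forward, inverse) = (2N + ceil(log2 N) + 1,
                                  2N + 3*ceil(log2 N) + B + 2).
    """
    log_n = math.ceil(math.log2(n))
    return 2 * n + log_n + 1, 2 * n + 3 * log_n + b + 2
```

Under these assumptions, `idprt(dprt(f))` returns `f` for any integer image with prime side length. For the $251\times 251$ example, and assuming $B = 8$ bits per pixel (a value not stated in the abstract), the two formulas evaluate to 511 and 536 cycles.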