We consider the fast in-place computation of the Euclidean polynomial modular remainder $R(X) \not\equiv A(X) \mod B(X)$ with $A$ and $B$ of respective degrees n and m $\le$ n. If the multiplication of two polynomials of degree $k$ can be performed with $M(k)$ operations and $O(k)$ extra space, then standard algorithms for the remainder require $O(n/m M(m))$ arithmetic operations and, apart from that of $A$ and $B$, at least $O(n-m)$ extra memory. This extra space is notably usually used to store the whole quotient $Q(X)$ such that $A = BQ + R$ with deg $R$ < deg $B$. We avoid the storage of the whole of this quotient, and propose an algorithm still using $O(n/m M(m))$ arithmetic operations but only $O(m)$ extra space.When the divisor $B$ is sparse with a constant number of non-zero terms, the arithmetic complexity bound reduces to $O(n)$. When it is allowed to use the input space of $A$ or $B$ for intermediate computations, but putting $A$ and $B$ back to their initial states after the completion of the remainder computation, we further propose an in-place algorithm (that is with its extra required space reduced to $O(1)$ only) using at most $O(n/m M(m) \log(m))$ arithmetic operations if $M(m)$ is quasi-linear and $O(n/m M(m))$ otherwise. We also propose variants that compute -- still in-place and with the same complexity bounds -- the over-place remainder $A(X) \not\equiv A(X) \mod B(X)$ and the accumulated remainder $R(X) +\not\equiv A(X) \mod B(X)$. To achieve this, we develop techniques for Toeplitz matrix operations which output is also part of the input. In-place accumulating versions are obtained for the latter and for polynomial remaindering. This is realized via further reductions to accumulated polynomial multiplication, for which in-place fast algorithms have recently been developed.
翻译:暂无翻译