We consider the algorithm by Ferson et al. (Reliable Computing 11(3), pp. 207–233, 2005) designed for solving the NP-hard problem of computing the maximal sample variance over interval data, which is motivated by robust statistics (in fact, the formulation can be written as a nonconvex quadratic program with a specific structure). First, we propose a new version of the algorithm that improves the original time bound $O(n^2 2^\omega)$ to $O(n \log n + n\cdot 2^\omega)$, where $n$ is the number of input data and $\omega$ is the clique number of a certain intersection graph. Then we treat the input data as random variables (as is usual in statistics) and introduce a natural probabilistic data-generating model. Under this model we get $\omega = O(\log n / \log\log n)$ and thus $2^\omega = O(n^{1/\log\log n})$ on average. This yields an average computing time of $O(n^{1+\epsilon})$ for arbitrarily small $\epsilon > 0$, which may be considered "surprisingly good" average-case complexity for solving an NP-hard problem. Moreover, we prove the following tail bound on the distribution of computation time: hard instances, which force the algorithm to run in time $2^{\Omega(n)}$, occur rarely, with probability tending to zero at the rate $e^{-n\log\log n}$.
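As a minimal illustration of the underlying problem (not of the Ferson et al. algorithm itself), the following Python sketch computes the maximal sample variance over interval data by brute force: since the sample variance is a convex function of the data vector, its maximum over the box of intervals is attained at a vertex, i.e. with every $x_i$ at an interval endpoint, so enumerating all $2^n$ endpoint choices is exhaustive (and exponential, reflecting the NP-hardness of the general problem). The function name and example data are hypothetical.

    # Brute-force baseline, not the improved algorithm discussed in the paper.
    from itertools import product
    from statistics import variance  # sample variance with denominator n - 1

    def max_sample_variance_bruteforce(intervals):
        """intervals: list of (lo, hi) pairs; returns the maximal sample variance."""
        best = 0.0
        for endpoints in product(*intervals):  # all 2^n vertex assignments of the box
            best = max(best, variance(endpoints))
        return best

    # Usage: three interval-valued observations
    print(max_sample_variance_bruteforce([(0.0, 1.0), (0.4, 0.6), (2.0, 2.5)]))

The quoted complexity results concern replacing this exponential enumeration by a search over a much smaller set of candidate endpoint assignments governed by the clique number $\omega$.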