We provide algorithms for isotonic regression that minimize $L_0$ error (Hamming distance). This problem is also known as monotonic relabeling, and it applies when the labels have a linear ordering but not necessarily a metric. There may be exponentially many optimal relabelings, so we use secondary criteria to decide which are best. For arbitrary ordinal labels, the criterion is to maximize the number of labels that are changed only to an adjacent label, with the same criterion then applied recursively. For real-valued labels, we minimize the $L_p$ error. For linearly ordered sets, we also give algorithms that minimize the sum of the $L_p$ and weighted $L_0$ errors, a form of penalized (regularized) regression. We also examine $L_0$ isotonic regression on multidimensional coordinate-wise orderings. Previous algorithms took $\Theta(n^3)$ time, but we reduce this to $o(n^{3/2})$.
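To make the $L_0$ objective concrete, here is a minimal sketch for the simplest case only: unweighted labels on a linear order. It is not one of the algorithms of this paper. It relies on the standard observation that the labels left unchanged must form a nondecreasing subsequence, so the minimum number of changes is $n$ minus the length of a longest nondecreasing subsequence. The function name `l0_isotonic` and the fill rule (copy the nearest kept label to the left) are assumptions made for illustration; the sketch returns one optimal relabeling and ignores the secondary criteria described above.

```python
from bisect import bisect_right


def l0_isotonic(labels):
    """Return (relabeled, n_changed) for unweighted labels on a linear order.

    Illustrative sketch: keeps one longest nondecreasing subsequence and
    overwrites every other label with a neighboring kept label.
    """
    n = len(labels)
    if n == 0:
        return [], 0

    tail_vals = []   # tail_vals[k]: smallest last value of a run of length k+1
    tail_idx = []    # tail_idx[k]: index in labels of that last value
    prev = [-1] * n  # prev[i]: predecessor of i in its best subsequence

    for i, y in enumerate(labels):
        # bisect_right allows equal values, giving a nondecreasing subsequence.
        k = bisect_right(tail_vals, y)
        if k == len(tail_vals):
            tail_vals.append(y)
            tail_idx.append(i)
        else:
            tail_vals[k] = y
            tail_idx[k] = i
        prev[i] = tail_idx[k - 1] if k > 0 else -1

    # Walk back from the end of a longest subsequence to collect kept indices.
    keep = set()
    i = tail_idx[-1]
    while i != -1:
        keep.add(i)
        i = prev[i]

    # Positions before the first kept index take the first kept label;
    # every other changed position copies the nearest kept label to its left.
    current = labels[min(keep)]
    out = []
    for i, y in enumerate(labels):
        if i in keep:
            current = y
        out.append(current)

    n_changed = sum(a != b for a, b in zip(labels, out))
    return out, n_changed


if __name__ == "__main__":
    print(l0_isotonic([1, 3, 2, 2, 5, 4]))
```

Because the sketch only compares labels, it works for any linearly ordered label type, not just numbers. Other optimal relabelings of the same input exist (for example, keeping 5 rather than 4 at the end), which is where the secondary criteria come in.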