We study the problem of identifying anomalies in a low-rank matrix observed with sub-exponential noise, motivated by applications in retail and inventory management. State of the art approaches to anomaly detection in low-rank matrices apparently fall short, since they require that non-anomalous entries be observed with vanishingly small noise (which is not the case in our problem, and indeed in many applications). So motivated, we propose a conceptually simple entrywise approach to anomaly detection in low-rank matrices. Our approach accommodates a general class of probabilistic anomaly models. We extend recent work on entrywise error guarantees for matrix completion, establishing such guarantees for sub-exponential matrices, where in addition to missing entries, a fraction of entries are corrupted by (an also unknown) anomaly model. Viewing the anomaly detection as a classification task, to the best of our knowledge, we are the first to achieve the min-max optimal detection rate (up to log factors). Using data from a massive consumer goods retailer, we show that our approach provides significant improvements over incumbent approaches to anomaly detection.
翻译:我们研究了在零售和库存管理应用中发现低级别矩阵中的异常现象的问题。在零售和库存管理应用中,发现低级别矩阵中的异常现象的先进方法显然不尽人意,因为这些方法要求用消失的小噪音(在我们的问题中不是这种情况,实际上在许多应用中也是这样)来观察非异常的条目。因此,我们提出了一个概念上简单的入门方法来发现低级别矩阵中的异常现象。我们的方法包括了一般的概率异常模型。我们扩展了最近为完成矩阵的入门错误保证工作,为次富裕矩阵建立了这种保证,除了缺失的条目之外,一部分条目被(同样未知的)异常模型腐蚀了。根据我们的知识,将异常现象检测视为一种分类任务,我们是第一个达到微量最大最佳检测率(直至日志因素 ) 。我们利用一个大型消费品零售商的数据,表明我们的方法为异常现象检测提供了显著改进。