Inaccurate records of inventory occur frequently, and by some measures cost retailers approximately 4% in annual sales. Detecting inventory inaccuracies manually is cost-prohibitive, and existing algorithmic solutions rely almost exclusively on learning from longitudinal data, which is insufficient in the dynamic environment induced by modern retail operations. Instead, we propose a solution based on cross-sectional data over stores and SKUs, observing that detecting inventory inaccuracies can be viewed as a problem of identifying anomalies in a (low-rank) Poisson matrix. State-of-the-art approaches to anomaly detection in low-rank matrices apparently fall short. Specifically, from a theoretical perspective, recovery guarantees for these approaches require that non-anomalous entries be observed with vanishingly small noise (which is not the case in our problem, and indeed in many applications). So motivated, we propose a conceptually simple entry-wise approach to anomaly detection in low-rank Poisson matrices. Our approach accommodates a general class of probabilistic anomaly models. We show that the cost incurred by our algorithm approaches that of an optimal algorithm at a min-max optimal rate. Using synthetic data and real data from a consumer goods retailer, we show that our approach provides up to a 10x cost reduction over incumbent approaches to anomaly detection. Along the way, we build on recent work that seeks entry-wise error guarantees for matrix completion, establishing such guarantees for sub-exponential matrices, a result of independent interest.
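To make the entry-wise idea concrete, the following is a minimal illustrative sketch, not the algorithm analyzed in this work. It assumes a stores-by-SKUs matrix of sales counts, uses a rank-one fit from row and column sums as a simple stand-in for a general low-rank estimator, and assumes an anomaly shows up as a count that is improbably low under the fitted Poisson mean. The function names, the threshold `alpha`, and the toy data are all hypothetical.

```python
import math


def poisson_cdf(x, mean):
    """P(Y <= x) for Y ~ Poisson(mean), by direct summation (adequate for small means)."""
    term = math.exp(-mean)
    total = term
    for k in range(1, x + 1):
        term *= mean / k
        total += term
    return min(total, 1.0)


def rank_one_means(counts):
    """Rank-one estimate of the Poisson means: row_sum * col_sum / grand_total.

    Assumption: a rank-one model stands in for a general low-rank estimate,
    and the matrix has at least one positive entry.
    """
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_sums)
    return [[r * c / grand_total for c in col_sums] for r in row_sums]


def flag_low_entries(counts, alpha=1e-3):
    """Flag entries whose lower-tail probability under the fitted mean is below alpha.

    Assumption: anomalies depress the observed count, so a one-sided test is used.
    """
    means = rank_one_means(counts)
    flagged = []
    for i, row in enumerate(counts):
        for j, x in enumerate(row):
            if poisson_cdf(x, means[i][j]) < alpha:
                flagged.append((i, j))
    return flagged


# Toy data (hypothetical): an exactly rank-one matrix with one injected anomaly.
store_scale = [5, 10, 20]
sku_scale = [1, 2, 3, 4]
counts = [[a * b for b in sku_scale] for a in store_scale]
counts[1][2] = 0  # expected count 30, observed 0

flagged = flag_low_entries(counts)
print(flagged)
```

On this toy input the only flagged entry is the injected one at store 1, SKU 2. The fit uses only a single cross-section of stores and SKUs, with no history, which is the point of the cross-sectional framing. A realistic setting would replace the rank-one fit with a higher-rank estimator and choose the threshold according to the costs of false alarms and missed anomalies.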