Fixing Inventory Inaccuracies At Scale

From arXiv. A preliminary version, titled "Near-Optimal Entrywise Anomaly Detection for Low-Rank Matrices with Sub-Exponential Noise", appeared in the Proceedings of the 38th International Conference on Machine Learning (ICML 2021).

Inaccurate records of inventory occur frequently, and by some measures cost retailers approximately 4% in annual sales. Detecting inventory inaccuracies manually is cost-prohibitive, and existing algorithmic solutions rely almost exclusively on learning from longitudinal data, which is insufficient in the dynamic environment induced by modern retail operations. Instead, we propose a solution based on cross-sectional data over stores and SKUs, observing that detecting inventory inaccuracies can be viewed as a problem of identifying anomalies in a (low-rank) Poisson matrix. State-of-the-art approaches to anomaly detection in low-rank matrices apparently fall short. Specifically, from a theoretical perspective, recovery guarantees for these approaches require that non-anomalous entries be observed with vanishingly small noise (which is not the case in our problem, and indeed in many applications). So motivated, we propose a conceptually simple entry-wise approach to anomaly detection in low-rank Poisson matrices. Our approach accommodates a general class of probabilistic anomaly models. We show that the cost incurred by our algorithm approaches that of an optimal algorithm at a min-max optimal rate. Using synthetic data and real data from a consumer goods retailer, we show that our approach provides up to a 10x cost reduction over incumbent approaches to anomaly detection. Along the way, we build on recent work that seeks entry-wise error guarantees for matrix completion, establishing such guarantees for sub-exponential matrices, a result of independent interest.
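The general idea in the abstract (estimate a low-rank Poisson rate matrix from cross-sectional counts, then test each entry on its own) can be illustrated with a short sketch. This is not the paper's algorithm: it substitutes a plain truncated SVD for the rate estimate and a Pearson-residual threshold for the paper's cost-based decision rule under a probabilistic anomaly model. The function name `flag_entrywise_anomalies`, the parameters `rank` and `z_threshold`, and the synthetic stores-by-SKUs data are all assumptions made for illustration.

```python
import numpy as np


def flag_entrywise_anomalies(counts, rank, z_threshold=4.0):
    """Flag entries of a count matrix that deviate strongly from a low-rank fit.

    Illustrative stand-in only: truncated SVD + Pearson residuals,
    not the estimator or decision rule from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    # A rank-`rank` truncated SVD serves as a simple estimate of the rate matrix.
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    rate_hat = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    # Keep the estimated rates strictly positive before dividing.
    rate_hat = np.clip(rate_hat, 1e-6, None)
    # For a Poisson entry the variance equals the mean, so this is a z-like score.
    z = (counts - rate_hat) / np.sqrt(rate_hat)
    return np.abs(z) > z_threshold, rate_hat


# Synthetic example (assumed setup): 200 stores x 100 SKUs, rank-2 rates.
rng = np.random.default_rng(0)
store_factors = rng.uniform(6.0, 12.0, size=(200, 2))
sku_factors = rng.uniform(6.0, 12.0, size=(100, 2))
true_rates = store_factors @ sku_factors.T
counts = rng.poisson(true_rates)

# Inject anomalies: about 1% of entries record zero despite a positive rate.
is_anomaly = rng.random(counts.shape) < 0.01
counts[is_anomaly] = 0

flags, rate_hat = flag_entrywise_anomalies(counts, rank=2)
print("recall:", flags[is_anomaly].mean())
print("false positive rate:", flags[~is_anomaly].mean())
```

The sketch decides each entry separately, which mirrors the entry-wise framing of the abstract. The fixed threshold is a simplification. The abstract describes an algorithm whose cost approaches that of an optimal algorithm, which a hand-picked `z_threshold` does not attempt.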


Related content

Anomaly detection


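In data mining, anomaly detection is the identification of items, events, or observations that do not conform to an expected pattern or to the other items in a dataset. Anomalous items typically correspond to problems such as bank fraud, structural defects, medical problems, or errors in text. Anomalies are also called outliers, novelties, noise, deviations, and exceptions. In misuse and network intrusion detection in particular, the objects of interest are often not rare objects but unexpected bursts of activity. This pattern does not fit the usual statistical definition of an anomaly as a rare object, so many anomaly detection methods (unsupervised ones in particular) fail on such data unless it has been aggregated appropriately. A cluster analysis algorithm, by contrast, may be able to detect the micro-clusters these patterns form. There are three broad categories of anomaly detection methods [1]. Unsupervised anomaly detection assumes that most instances in the dataset are normal and detects anomalies in unlabeled test data by looking for the instances that fit the rest of the data least well. Supervised anomaly detection requires a dataset labeled "normal" and "anomalous" and involves training a classifier (the key difference from many other statistical classification problems is the inherent class imbalance of anomaly detection). Semi-supervised anomaly detection builds a model of normal behavior from a given normal training dataset, and then tests the likelihood that a test instance was generated by the learned model.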
