Fast likelihood-based change point detection

Change point detection plays a fundamental role in many real-world applications where the goal is to analyze and monitor the behaviour of a data stream. In this paper, we study change detection in binary streams. To this end, we use a likelihood ratio between two models as a measure of change. The first model is a single Bernoulli variable, while the second model divides the stored data into two segments and models each segment with its own Bernoulli variable. Finding the optimal split can be done in $O(n)$ time, where $n$ is the number of entries since the last change point. This is too expensive for large $n$. To address this, we propose an approximation scheme that yields a $(1 - \epsilon)$-approximation in $O(\epsilon^{-1} \log^2 n)$ time. The speed-up consists of several steps. First, we reduce the number of possible candidates by adopting a known result from segmentation problems. We then show that for fixed Bernoulli parameters we can find the optimal change point in logarithmic time. Finally, we show how to construct a candidate list of size $O(\epsilon^{-1} \log n)$ for the model parameters. We demonstrate empirically the approximation quality and the running time of our algorithm, showing that we can gain a significant speed-up with a minimal average loss in optimality.
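As a rough illustration of the exact $O(n)$ baseline described above (not the paper's approximation scheme), the following sketch scans every split of a binary sequence and scores it with the log-likelihood ratio between the two-segment model and the single-Bernoulli model. The function names `bernoulli_loglik` and `best_split` are hypothetical, and the sketch assumes each Bernoulli parameter is set to its maximum-likelihood estimate, which the abstract does not state explicitly.

```python
import math


def bernoulli_loglik(ones, total):
    """Log-likelihood of `ones` successes in `total` trials at the MLE p = ones / total.

    Uses the convention 0 * log(0) = 0, so an all-zero or all-one segment scores 0.
    """
    if total == 0 or ones == 0 or ones == total:
        return 0.0
    p = ones / total
    return ones * math.log(p) + (total - ones) * math.log(1.0 - p)


def best_split(bits):
    """Return (index, score) of the split maximizing the log-likelihood ratio.

    The split at index i models bits[:i] and bits[i:] with separate Bernoulli
    variables; the score is compared against one Bernoulli for all of `bits`.
    Runs in O(n) time with a running count of ones (assumed baseline, not the
    paper's faster algorithm).
    """
    n = len(bits)
    if n < 2:
        raise ValueError("need at least two entries to split")
    total_ones = sum(bits)
    base = bernoulli_loglik(total_ones, n)

    best_index, best_score = 1, -math.inf
    left_ones = 0
    for i in range(1, n):
        left_ones += bits[i - 1]  # number of ones in bits[:i]
        score = (
            bernoulli_loglik(left_ones, i)
            + bernoulli_loglik(total_ones - left_ones, n - i)
            - base
        )
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


if __name__ == "__main__":
    stream = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    print(best_split(stream))
```

In a monitoring setting, a change would be flagged when the returned score exceeds some threshold; how that threshold is chosen is outside the scope of this sketch.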

