In online applications with streaming data, knowing how far the training or test set has shifted from the original dataset can be crucial to model performance. However, historical samples in the data stream may no longer be accessible. To cope with such situations, we propose a novel method, Continual Density Ratio Estimation (CDRE), which iteratively estimates the density ratio $p/q_t$ between the initial distribution $p$ and the current distribution $q_t$ of a data stream without storing past samples, where $q_t$ shifts away from $p$ over time $t$. We demonstrate that CDRE can be more accurate than standard DRE at estimating divergences between distributions, despite not requiring samples from the original distribution. CDRE can be applied in online learning scenarios such as importance-weighted covariate shift and tracing dataset changes for better decision making. In addition, CDRE enables the evaluation of generative models in the continual learning setting. To the best of our knowledge, no existing method can evaluate generative models in continual learning without storing samples from the original distribution.
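
As a sketch of why an iterative estimate of $p/q_t$ is plausible without samples from $p$ (an illustration under assumptions, not necessarily the exact formulation used by CDRE), the ratio can be factorized through the previous distribution. The notation $r_t$ and the convention $q_0 = p$ are assumptions introduced here:

$$
r_t(x) \;=\; \frac{p(x)}{q_t(x)} \;=\; \frac{p(x)}{q_{t-1}(x)} \cdot \frac{q_{t-1}(x)}{q_t(x)} \;=\; r_{t-1}(x)\,\frac{q_{t-1}(x)}{q_t(x)}, \qquad r_0(x) = 1 .
$$

Under this identity, the only new quantity at step $t$ is the ratio $q_{t-1}/q_t$ between consecutive distributions, which involves only $q_{t-1}$ and $q_t$, while the dependence on $p$ is carried by the previous estimate $r_{t-1}$.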