Zero-Truncated Poisson Regression for Sparse Multiway Count Data Corrupted by False Zeros

Low rank · Sparsity · Low-rank tensors · Multiparameter · Parameter space

April 12, 2023

Oscar López,Daniel M. Dunlavy,Richard B. Lehoucq

From arXiv; 30 pages, 5 figures

We propose a novel statistical inference methodology for multiway count data that is corrupted by false zeros that are indistinguishable from true zero counts. Our approach consists of zero-truncating the Poisson distribution to neglect all zero values. This simple truncated approach dispenses with the need to distinguish between true and false zero counts and reduces the amount of data to be processed. Inference is accomplished via tensor completion that imposes low-rank tensor structure on the Poisson parameter space. Our main result shows that an $N$-way rank-$R$ parametric tensor $\boldsymbol{\mathscr{M}}\in(0,\infty)^{I\times \cdots\times I}$ generating Poisson observations can be accurately estimated by zero-truncated Poisson regression from approximately $IR^2\log_2^2(I)$ non-zero counts under the nonnegative canonical polyadic decomposition. Our result also quantifies the error made by zero-truncating the Poisson distribution when the parameter is uniformly bounded from below. Therefore, under a low-rank multiparameter model, we propose an implementable approach guaranteed to achieve accurate regression in under-determined scenarios with substantial corruption by false zeros. Several numerical experiments are presented to explore the theoretical results.
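To make the zero-truncation step concrete, the sketch below evaluates the zero-truncated Poisson log-likelihood under a rank-$R$ nonnegative CP model. For a count $k\geq 1$ and parameter $\lambda>0$, the truncated probability is $\lambda^k e^{-\lambda}/\big(k!\,(1-e^{-\lambda})\big)$. This is only an illustration, not the authors' implementation: the function names `cp_entries`, `ztp_logpmf`, and `ztp_nll`, the tensor sizes, and the synthetic data are all assumptions, and no optimizer or recovery guarantee is included.

```python
import math
import numpy as np


def cp_entries(factors, idx):
    """Evaluate a rank-R CP model at the given multi-indices.

    factors: list of N nonnegative arrays, each of shape (I_n, R).
    idx: integer array of shape (n_obs, N), one multi-index per row.
    """
    prod = np.ones((idx.shape[0], factors[0].shape[1]))
    for n, A in enumerate(factors):
        prod *= A[idx[:, n], :]
    return prod.sum(axis=1)


def ztp_logpmf(k, lam):
    """Log-pmf of the zero-truncated Poisson for counts k >= 1."""
    k = np.asarray(k, dtype=float)
    lam = np.asarray(lam, dtype=float)
    log_fact = np.vectorize(math.lgamma)(k + 1.0)  # log(k!)
    # log(1 - exp(-lam)), computed stably through expm1
    log_norm = np.log(-np.expm1(-lam))
    return k * np.log(lam) - lam - log_fact - log_norm


def ztp_nll(factors, idx, counts):
    """Negative log-likelihood over the observed non-zero counts only."""
    lam = cp_entries(factors, idx)
    return -ztp_logpmf(counts, lam).sum()


# Hypothetical synthetic example: a 3-way, rank-2 parameter tensor.
rng = np.random.default_rng(0)
I, N, R = 6, 3, 2
true_factors = [rng.uniform(0.5, 1.5, size=(I, R)) for _ in range(N)]
shape = (I,) * N
all_idx = np.array(list(np.ndindex(*shape)))
counts = rng.poisson(cp_entries(true_factors, all_idx))

# Zeros are dropped without asking whether they are true or false zeros.
keep = counts > 0
print(ztp_nll(true_factors, all_idx[keep], counts[keep]))
```

Because the objective only touches entries with `counts > 0`, it never needs a label saying which zeros are genuine, which is the point of the truncation. Minimizing `ztp_nll` over nonnegative factors (for example with projected gradient steps) would be one way to turn this into an estimator, but that choice is an assumption here.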


