Probability density estimation from observed data constitutes a central task in statistics. Recent advancements in machine learning offer new tools but also pose new challenges. The big data era demands analysis of long-range spatial and long-term temporal dependencies in large collections of raw data, rendering neural networks an attractive solution for density estimation. In this paper, we exploit the concept of a copula to explicitly build an estimate of the probability density function associated with any observed data. In particular, we separate the univariate marginal distributions from the joint dependence structure in the data, that is, the copula itself, and we model the latter with a neural network-based method referred to as copula density neural estimation (CODINE). Results show that the novel learning approach is capable of modeling complex distributions and can be applied to mutual information estimation and data generation.
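The separation of marginals from the dependence structure rests on Sklar's theorem: for continuous marginals, the joint density factorizes as the product of the marginal densities and a copula density evaluated at the marginal CDF values. The sketch below illustrates only the first step of that separation, mapping data to pseudo-observations with approximately uniform marginals through a rank-based empirical CDF. It is not the CODINE estimator itself. The function name `pseudo_observations`, the toy Gaussian data, and the `n + 1` rescaling are assumptions chosen for illustration.

```python
import numpy as np


def pseudo_observations(x):
    """Map each column of x into (0, 1) via its empirical CDF (rank transform).

    Hypothetical helper for illustration; assumes continuous data without ties.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # argsort of argsort gives the 0-based rank of every entry within its column
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    # Dividing by n + 1 keeps all values strictly inside (0, 1)
    return ranks / (n + 1.0)


# Toy data: correlated Gaussians pushed through strictly increasing maps.
# The maps change the marginals but leave the copula untouched.
rng = np.random.default_rng(0)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=2000)
x = np.column_stack([np.exp(z[:, 0]), z[:, 1] ** 3])

u = pseudo_observations(x)    # samples with approximately uniform marginals
u_z = pseudo_observations(z)  # same ranks, since the maps are increasing
```

Because ranks are invariant under strictly increasing transformations, `u` and `u_z` coincide here, which is the sense in which the copula carries the dependence structure independently of the marginals. A copula density model, neural or otherwise, would then be fitted to samples such as `u`.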