Conditional density estimation (CDE) is the task of estimating the probability of an event conditioned on some inputs. A neural network (NN) can also be used to compute the output distribution over a continuous domain, which can be viewed as an extension of the regression task. Nevertheless, it is difficult to explicitly approximate a distribution without knowing its general form a priori. To fit an arbitrary conditional distribution, discretizing the continuous domain into bins is an effective strategy, as long as the bins are sufficiently narrow and the dataset is very large. However, collecting enough data is often difficult, and in many circumstances the available data fall far short of that ideal, especially in multivariate CDE, owing to the curse of dimensionality. In this paper, we demonstrate the benefits of modeling free-form conditional distributions with a deconvolution-based neural network framework that copes with the data-deficiency problem introduced by discretization. The model is flexible, yet also benefits from the hierarchical smoothness offered by the deconvolution layers. We compare our method against a number of other density-estimation approaches and show that our Deconvolutional Density Network (DDN) outperforms the competing methods on many univariate and multivariate tasks. The code for the DDN is available at https://github.com/NBICLAB/DDN.
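As a rough illustration of the binned, deconvolution-based idea described above (not the authors' DDN implementation; see the linked repository for that), the following PyTorch sketch maps a conditioning input to a discretized distribution over bins of the target domain, using 1-D transposed convolutions to upsample a small latent vector into bin logits. The layer sizes, bin count, and loss are illustrative assumptions.

```python
# Minimal sketch, assuming a 1-D target discretized into n_bins equal-width bins.
import torch
import torch.nn as nn


class DeconvDensityHead(nn.Module):
    def __init__(self, in_dim: int, n_bins: int = 64):
        super().__init__()
        # Encode the conditioning input into a small "seed" feature map.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, 8 * 8), nn.ReLU(),   # 8 channels x length 8
        )
        # Upsample 8 -> 16 -> 32 -> 64 bins with transposed convolutions;
        # the shared kernels impose local smoothness across neighboring bins.
        self.deconv = nn.Sequential(
            nn.ConvTranspose1d(8, 8, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 4, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(4, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        h = self.encoder(x).view(-1, 8, 8)        # (batch, channels, length)
        logits = self.deconv(h).squeeze(1)        # (batch, n_bins)
        return torch.log_softmax(logits, dim=-1)  # log-probability per bin


def binned_nll(log_probs, y, y_min, y_max, n_bins=64):
    # Negative log-likelihood of targets y under the binned distribution:
    # assign each target to its bin and pick that bin's log-probability.
    idx = ((y - y_min) / (y_max - y_min) * n_bins).long().clamp(0, n_bins - 1)
    return -log_probs.gather(1, idx.unsqueeze(1)).mean()
```

Training such a head with `binned_nll` fits a free-form histogram-like conditional density; the deconvolution layers are what distinguish it from an unconstrained softmax over bins, since their shared, upsampling kernels encourage smooth densities even when few samples fall in each bin.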