The Poisson multinomial distribution (PMD) describes the distribution of the sum of $n$ independent but non-identically distributed random vectors, where each random vector has length $m$ with 0/1-valued elements, exactly one of which takes the value 1, each with its own probability. These probabilities may differ across the $m$ elements and the $n$ random vectors, and they form an $n \times m$ matrix in which each row sums to 1. We call this $n \times m$ matrix the success probability matrix (SPM). Each SPM uniquely defines a PMD. The PMD is useful in many areas, such as voting theory, ecological inference, and machine learning. The distribution functions of the PMD, however, are usually difficult to compute. In this paper, we develop efficient methods to compute the probability mass function (pmf) of the PMD using the multivariate Fourier transform, normal approximation, and simulation. We study the accuracy and efficiency of these methods and give recommendations on which method to use under various scenarios. We also illustrate the use of the PMD via three applications: voting probability calculation, aggregated data inference, and uncertainty quantification in classification. We build an R package that implements the proposed methods, and illustrate the package with examples.
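To make the definition concrete, the following is a minimal Python sketch that computes the exact pmf of a small PMD by convolving the rows of the SPM one at a time. It is a brute-force illustration only, not the Fourier-transform, normal-approximation, or simulation methods developed in the paper, and it is unrelated to the R package. The function name `pmd_pmf` and the example matrix are hypothetical.

```python
from collections import defaultdict


def pmd_pmf(spm):
    """Exact pmf of a PMD by sequential convolution over the SPM rows.

    spm: list of n rows, each a list of m probabilities summing to 1.
    Returns a dict mapping count vectors (tuples of length m) to probabilities.
    The cost grows quickly with n and m, so this suits only small cases.
    """
    m = len(spm[0])
    # Before any random vector is added, the sum is the zero vector.
    pmf = {(0,) * m: 1.0}
    for row in spm:
        if len(row) != m or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each SPM row must have length m and sum to 1")
        nxt = defaultdict(float)
        for counts, prob in pmf.items():
            for j, p in enumerate(row):
                if p == 0.0:
                    continue
                # The current random vector has its single 1 in position j.
                bumped = counts[:j] + (counts[j] + 1,) + counts[j + 1:]
                nxt[bumped] += prob * p
        pmf = dict(nxt)
    return pmf


# Hypothetical 3 x 3 SPM (n = 3 random vectors, m = 3 categories).
spm = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
pmf = pmd_pmf(spm)
for counts in sorted(pmf):
    print(counts, round(pmf[counts], 4))
```

Every count vector in the result sums to $n$, and the probabilities sum to 1. The number of support points grows combinatorially in $n$ and $m$, which is one way to see why faster methods are needed for larger problems.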