A Markov tree is a probabilistic graphical model for a random vector indexed by the nodes of an undirected tree, which encodes conditional independence relations between the variables. One possible limit distribution of partial maxima of samples from such a Markov tree is a max-stable Hüsler-Reiss distribution whose parameter matrix inherits its structure from the tree, each edge contributing one free dependence parameter. Our central assumption is that, upon marginal standardization, the data-generating distribution is in the max-domain of attraction of this Hüsler-Reiss distribution, an assumption much weaker than requiring that the data be generated according to a graphical model. Even if some of the variables are unobservable (latent), we show that the underlying model parameters are still identifiable if and only if every node corresponding to a latent variable has degree at least three. Three estimation procedures, based on the method of moments, maximum composite likelihood, and pairwise extremal coefficients, are proposed for use on multivariate peaks-over-thresholds data when some variables are latent. A typical application is a river network in the form of a tree where, at some locations, no data are available. We illustrate the model and the identifiability criterion on a data set of high water levels on the Seine, France, with two latent variables. The structured Hüsler-Reiss distribution is found to fit the observed extremal dependence patterns well. Because the parameters are identifiable, we are able to quantify tail dependence between locations for which there are no data.
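The identifiability criterion stated above (every latent node must have degree at least three) is purely graph-theoretic, so it can be checked before any estimation is attempted. The following is a minimal sketch in Python, not the authors' implementation. The function names, the edge-list representation, and the five-node example tree are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def latent_nodes_violating_criterion(edges, latent):
    """Return the latent nodes whose degree in the tree is below three.

    `edges` is an iterable of (u, v) pairs describing an undirected tree;
    `latent` is the set of nodes whose variables are unobserved.
    Both argument conventions are assumptions made for this sketch.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node in latent if degree[node] < 3)


def is_identifiable(edges, latent):
    """Check the degree criterion: every latent node has degree >= 3."""
    return not latent_nodes_violating_criterion(edges, latent)


# Hypothetical tree: node "h" joins three branches, node "c" lies on a path.
example_edges = [("a", "h"), ("b", "h"), ("h", "c"), ("c", "d")]

print(is_identifiable(example_edges, {"h"}))       # "h" has degree 3
print(is_identifiable(example_edges, {"h", "c"}))  # "c" has degree 2
print(latent_nodes_violating_criterion(example_edges, {"h", "c"}))
```

In this illustrative tree, treating only the junction node as latent satisfies the criterion, whereas also hiding a node of degree two on a path does not. For a river network, this suggests that unobserved junctions where at least three river segments meet are admissible under the criterion, while unobserved locations along a single stretch are not.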