In this paper, we study the identifiability and estimation of the parameters of a copula-based multivariate model when the margins are unknown and arbitrary, meaning that they can be continuous, discrete, or mixtures of continuous and discrete. When at least one margin is not continuous, the range of values determining the copula is not the entire unit square, and this situation can lead to identifiability issues, which are discussed here. Next, we propose estimation methods for unknown and arbitrary margins, using a pseudo log-likelihood adapted to the case of discontinuities. In view of applications to large data sets, we also propose a pairwise composite pseudo log-likelihood. These methodologies can also be easily modified to cover the case of parametric margins. One of the main theoretical results is an extension to arbitrary distributions of known convergence results for rank-based statistics when the margins are continuous. As a by-product, under smoothness assumptions, we obtain that the asymptotic distribution of the estimation errors of our estimators is Gaussian. Finally, numerical experiments are presented to assess the finite-sample performance of the estimators, and the usefulness of the proposed methodologies is illustrated with a copula-based regression model for hydrological data.
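To make the idea of a pseudo log-likelihood adapted to discontinuities concrete, the following is a minimal sketch for the simplest case only: two purely discrete margins. It is not the estimator defined in the paper. The Clayton family, the unscaled empirical distribution function, and all function names (`ecdf_pairs`, `clayton_cdf`, `pseudo_loglik`) are assumptions chosen for illustration. Each observation contributes the copula mass of the rectangle spanned by the left limit and the value of each empirical margin at that point, so the copula is evaluated only on the range of the empirical margins, which is the source of the identifiability question raised above.

```python
import math
from collections import Counter


def ecdf_pairs(xs):
    """For each observation x, return (F_n(x-), F_n(x)) under the empirical cdf."""
    n = len(xs)
    counts = Counter(xs)
    cum = 0
    left, right = {}, {}
    for value in sorted(counts):
        left[value] = cum / n        # left limit F_n(x-)
        cum += counts[value]
        right[value] = cum / n       # value F_n(x)
    return [(left[x], right[x]) for x in xs]


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0 (an illustrative choice of family)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def pseudo_loglik(xs, ys, theta):
    """Sum of log rectangle masses; assumes both margins are purely discrete."""
    total = 0.0
    for (u0, u1), (v0, v1) in zip(ecdf_pairs(xs), ecdf_pairs(ys)):
        mass = (clayton_cdf(u1, v1, theta) - clayton_cdf(u0, v1, theta)
                - clayton_cdf(u1, v0, theta) + clayton_cdf(u0, v0, theta))
        total += math.log(mass)
    return total


if __name__ == "__main__":
    # Hypothetical toy count data, for illustration only.
    xs = [0, 1, 1, 2, 0, 2, 1, 0]
    ys = [0, 1, 2, 2, 0, 1, 1, 0]
    # Crude grid search over theta; a numerical optimizer would normally be used.
    grid = [0.1 * k for k in range(1, 101)]
    best = max(grid, key=lambda t: pseudo_loglik(xs, ys, t))
    print("theta maximizing the pseudo log-likelihood on the grid:", best)
```

With a continuous margin the corresponding factor would instead involve a partial derivative of the copula, and the mixed case combines both; those cases are omitted from this sketch.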