In many applications, we collect independent samples from interconnected populations. These population distributions share some latent structure, so it is advantageous to analyze the samples jointly. One effective way to connect the distributions is the semiparametric density ratio model (DRM). A key ingredient of the DRM is that the log density ratios are linear combinations of prespecified functions; the vector formed by these functions is called the basis function. A sensible basis function can often be chosen based on knowledge of the context, and DRM-based inference remains effective even if the basis function is imperfect. However, a data-adaptive approach to the choice of basis function remains an interesting and important research problem. We propose an approach based on classical functional principal component analysis (FPCA). Under some conditions, we show that this approach leads to consistent estimation of the basis function. Our simulation results show that the proposed adaptive choice leads to an efficiency gain. We use a real-data example to demonstrate the efficiency gain and the ease of use of our approach.
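For readers unfamiliar with the DRM, the structure described above is commonly written as follows. The notation is assumed here for illustration and does not come from the abstract: $g_0$ denotes a baseline density, $g_k$ the density of the $k$-th of $m$ further populations, $\mathbf{q}(x)$ the basis function, and $\alpha_k$, $\boldsymbol{\beta}_k$ unknown parameters.

```latex
\log \frac{g_k(x)}{g_0(x)} = \alpha_k + \boldsymbol{\beta}_k^{\top} \mathbf{q}(x),
\qquad k = 1, \ldots, m.
```

As a simple instance, any collection of normal densities satisfies this relation with $\mathbf{q}(x) = (x, x^2)^{\top}$, since the logarithm of a ratio of two normal densities is a quadratic polynomial in $x$.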