The question of association between an outcome and a feature is generally framed in the context of a model that assumes particular functional and distributional forms. Our motivating application is identifying serum biomarkers of angiogenesis, energy metabolism, apoptosis, and inflammation that are predictive of recurrence after lung resection in node-negative non-small cell lung cancer patients with tumor stage T2a or less. We propose an omnibus approach for testing association that is free of assumptions on functional forms and distributions and can be used as a black-box method. The proposed maximal permutation test is based on the idea of thresholding, is readily implementable, and is computationally efficient. We illustrate that the proposed omnibus tests maintain their levels and have strong power as black-box tests for detecting linear, nonlinear, and quantile-based associations, even with outlier-prone and heavy-tailed error distributions and in nonparametric settings. We additionally illustrate the use of this approach in model-free feature screening and further examine the level and power of these tests for binary outcomes. In our motivating application, we compare the performance of the proposed omnibus tests with comparator methods to identify preoperative serum biomarkers associated with non-small cell lung cancer recurrence in early-stage patients.
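To make the idea of a thresholding-based maximal permutation test concrete, the following is a minimal sketch under stated assumptions, not the authors' exact procedure. It assumes the feature is dichotomized at each candidate cut point, that the two resulting groups are compared with a standardized difference in mean outcome, and that the maximum of these statistics is referred to its permutation distribution. The function names `max_threshold_stat` and `maximal_permutation_test`, the `min_group` parameter, and the choice of statistic are all illustrative and do not come from the paper.

```python
import numpy as np


def max_threshold_stat(x, y, min_group=5):
    """Maximum absolute standardized mean difference in y over cut points of x.

    Assumption: this statistic is chosen for illustration only.
    """
    order = np.argsort(x, kind="stable")
    x_sorted, y_sorted = x[order], y[order]
    n = len(y_sorted)
    sd = y_sorted.std(ddof=1)
    if sd == 0:
        return 0.0  # constant outcome carries no association signal
    csum = np.cumsum(y_sorted)
    total = csum[-1]
    best = 0.0
    # k is the size of the "low" group; keep at least min_group points per side.
    for k in range(min_group, n - min_group + 1):
        if x_sorted[k - 1] == x_sorted[k]:
            continue  # a cut point must separate distinct feature values
        mean_lo = csum[k - 1] / k
        mean_hi = (total - csum[k - 1]) / (n - k)
        stat = abs(mean_hi - mean_lo) / (sd * np.sqrt(1.0 / k + 1.0 / (n - k)))
        best = max(best, stat)
    return best


def maximal_permutation_test(x, y, n_perm=999, min_group=5, seed=0):
    """Permutation p-value for the maximal thresholded statistic."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = max_threshold_stat(x, y, min_group)
    exceed = 0
    for _ in range(n_perm):
        # Permuting y breaks any association with x while keeping both margins.
        if max_threshold_stat(x, rng.permutation(y), min_group) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the permutations.
    return observed, (exceed + 1) / (n_perm + 1)
```

Because the maximum is recomputed inside every permutation, the p-value accounts for the search over cut points, so no separate multiplicity correction is applied in this sketch. A binary outcome coded as 0/1 can be passed as `y` unchanged, in which case the group means are proportions.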