Compositional data arise in many real-life applications, and versatile methods for properly analyzing this type of data in the regression context are needed. Using the $\alpha$-transformation, this paper extends classical $k$-$NN$ regression to what is termed $\alpha$-$k$-$NN$ regression, yielding a highly flexible non-parametric regression model for compositional data. Unlike many of the recommended regression models for compositional data, the proposed model handles zero values (which commonly occur in practice) without modification. Extensive simulation studies and real-life data analyses highlight the advantage of $\alpha$-$k$-$NN$ regression when the relationship between the compositional response data and the Euclidean predictor variables is complex. Both suggest that $\alpha$-$k$-$NN$ regression can yield more accurate predictions than current regression models, which assume a parametric, and sometimes restrictive, relationship with the predictor variables. In addition, the proposed regression is computationally efficient, making it attractive for use with large-scale data.
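The abstract does not spell out the estimator, so the following is only a minimal sketch of how a $k$-$NN$ regression with a compositional response and Euclidean predictors might be combined with a power ($\alpha$) transformation. It assumes that the prediction is the closed $\alpha$-power mean of the responses of the $k$ nearest neighbours, with the closed geometric mean as the $\alpha = 0$ limit; that form, the function names `alpha_mean` and `alpha_knn_predict`, and the use of Euclidean distance are illustrative assumptions, not the paper's confirmed implementation.

```python
import numpy as np


def alpha_mean(Y, alpha):
    """Closed power mean of the compositions stored in the rows of Y.

    Assumed form: raise each composition to the power alpha, re-close it,
    average, raise to 1/alpha, and re-close. With alpha > 0, zero parts
    stay zero and cause no numerical problems.
    """
    Y = np.asarray(Y, dtype=float)
    if alpha == 0:
        # Limiting case: closed geometric mean (needs strictly positive parts).
        g = np.exp(np.log(Y).mean(axis=0))
        return g / g.sum()
    P = Y ** alpha
    P = P / P.sum(axis=1, keepdims=True)      # closure of the powered rows
    m = P.mean(axis=0) ** (1.0 / alpha)       # back-transform the average
    return m / m.sum()                        # closure: parts sum to 1


def alpha_knn_predict(X_train, Y_train, X_new, k=3, alpha=0.5):
    """Predict compositions for the rows of X_new (all inputs are 2-D arrays)."""
    X_train = np.asarray(X_train, dtype=float)
    Y_train = np.asarray(Y_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    # Pairwise Euclidean distances between new and training predictors.
    dist = np.linalg.norm(X_new[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k nearest training points for each new point.
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.vstack([alpha_mean(Y_train[idx], alpha) for idx in nearest])


# Toy data (made up for illustration): one predictor, three-part compositions,
# including zero parts in the response.
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
Y_train = np.array([
    [0.2, 0.3, 0.5],
    [0.4, 0.4, 0.2],
    [0.0, 0.5, 0.5],
    [0.9, 0.1, 0.0],
])
pred = alpha_knn_predict(X_train, Y_train, np.array([[0.9]]), k=3, alpha=0.5)
```

In this sketch, $\alpha = 1$ reduces the prediction to the plain arithmetic mean of the neighbours' compositions, while other values of $\alpha$ change how strongly small or zero parts influence the average. In practice both $k$ and $\alpha$ would presumably be chosen by cross-validation.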