Distribution-as-response regression problems are gaining wider attention, especially in biomedical settings where observation-rich, patient-specific data sets are available, such as feature densities in CT scans (Petersen et al., 2021), actigraphy (Ghosal et al., 2023), and continuous glucose monitoring (Coulter et al., 2024; Matabuena et al., 2021). To accommodate the complex structure of such problems, Petersen and M\"uller (2019) proposed a regression framework called Fr\'echet regression, which allows non-Euclidean responses, including distributional responses. Tucker et al. (2023) extended this framework to variable selection, and Coulter et al. (2024) (arXiv:2403.00922 [stat.AP]) developed a fast variable selection algorithm for the specific setting of univariate distributional responses equipped with the 2-Wasserstein metric (2-Wasserstein space). We present "fastfrechet", an R package providing fast implementations of these Fr\'echet regression and variable selection methods in 2-Wasserstein space, with resampling tools for automatic variable selection. "fastfrechet" makes distribution-based Fr\'echet regression with resampling-supplemented variable selection readily available and highly scalable to large data sets, such as the UK Biobank (Doherty et al., 2017).