High-throughput pheno-, geno-, and envirotyping allows routine characterization of plant varieties and of the trials in which they are evaluated. The resulting datasets can be integrated into statistical models for genomic prediction in several ways. One approach is to construct linear or non-linear kernels, which are subsequently used in reproducing kernel Hilbert space (RKHS) regression. These RKHS models are typically fitted with software packages that implement a Bayesian approach. However, such packages often lack some of the flexibility offered by dedicated linear mixed model software such as ASReml-R. Furthermore, a Bayesian approach is often computationally more demanding than a frequentist one. Here, we show how frequentist RKHS models can be implemented in ASReml-R and extend these models to allow for heterogeneous (i.e., trial-specific) genetic variances. We also show how an alternative to the typically Bayesian kernel averaging approach can be implemented by treating the bandwidth of the non-linear kernel as a parameter to be estimated by restricted maximum likelihood (REML). We show that these REML implementations, with homogeneous or heterogeneous variances, perform similarly to or better than the Bayesian models. We also show that the REML implementation is considerably more computationally efficient, being up to 12 times faster than the Bayesian models while using less memory. Finally, we discuss the flexibility provided by this approach and the options for further customization of the variance models.
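To make the two kernel types concrete, the following is a minimal illustrative sketch in plain Python, not the implementation described here. It assumes a small matrix `X` of already centred and scaled marker or environmental covariates (rows are varieties or trials), a linear kernel of the form XX'/p, and a Gaussian kernel exp(-h d²/mean(d²)) in which `bandwidth` plays the role of h. The function names, the toy data, and the scaling by the mean squared distance are assumptions made for illustration only.

```python
import math


def linear_kernel(X):
    """Linear kernel K = X X^T / p for an n x p matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(a * b for a, b in zip(xi, xj)) / p for xj in X] for xi in X]


def gaussian_kernel(X, bandwidth):
    """Gaussian kernel K_ij = exp(-bandwidth * d_ij^2 / mean(d^2)).

    d_ij^2 is the squared Euclidean distance between rows i and j. Scaling by
    the mean off-diagonal squared distance is an illustrative assumption; it
    requires at least two rows that are not all identical.
    """
    n = len(X)
    D = [[sum((a - b) ** 2 for a, b in zip(xi, xj)) for xj in X] for xi in X]
    off_diag = [D[i][j] for i in range(n) for j in range(n) if i != j]
    scale = sum(off_diag) / len(off_diag)
    return [[math.exp(-bandwidth * D[i][j] / scale) for j in range(n)]
            for i in range(n)]


# Hypothetical toy covariate matrix: 3 individuals, 3 covariates.
X = [[0.0, 1.0, 2.0],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 1.0]]

K_lin = linear_kernel(X)
K_gau = gaussian_kernel(X, bandwidth=1.0)
K_gau_sharp = gaussian_kernel(X, bandwidth=5.0)  # larger h: faster decay with distance
```

In a kernel averaging setting, several such Gaussian kernels with different fixed bandwidths would each be assigned their own variance component; the alternative described above instead treats the bandwidth as a single parameter to be estimated by REML.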