We explore the application of machine learning algorithms to predict the suitability of Russet potato clones for advancement in breeding trials. Leveraging data from manually collected trials in the state of Oregon, we investigate the potential of a wide variety of state-of-the-art binary classification models. We conduct a comprehensive analysis of the dataset that includes preprocessing, feature engineering, and imputation to address missing values. We focus on several key metrics such as accuracy, F1-score, and Matthews correlation coefficient (MCC) for model evaluation. The top-performing models, namely the multi-layer perceptron (MLPC), histogram-based gradient boosting classifier (HGBC), and a support vector machine (SVC), demonstrate consistent and significant results. Variable selection further enhances model performance and identifies influential features in predicting trial outcomes. The findings emphasize the potential of machine learning in streamlining the selection process for potato varieties, offering benefits such as increased efficiency, substantial cost savings, and judicious resource utilization. Our study contributes insights into precision agriculture and showcases the relevance of advanced technologies for informed decision-making in breeding programs.
翻译:暂无翻译