Alfalfa is a globally important livestock feed crop, so more efficient planting and harvesting could benefit many industries, especially as the global climate changes and traditional methods become less accurate. Recent work using machine learning (ML) to predict yields for alfalfa and other crops has shown promise. Previous efforts used remote sensing, weather, planting, and soil data to train ML models for yield prediction. However, while remote sensing works well, these models require large amounts of data and cannot make predictions until the harvesting season begins. Using weather and planting data from alfalfa variety trials in Kentucky and Georgia, our previous work compared feature selection techniques to find the best technique and the best feature set. In this work, we trained a variety of ML models, using cross-validation for hyperparameter optimization, to predict biomass yields, and we showed better accuracy than similar work that employed more complex techniques. Our best individual model was a random forest with a mean absolute error of 0.081 tons/acre and an $R^2$ of 0.941. Next, we expanded this dataset to include Wisconsin and Mississippi and repeated our experiments, obtaining a higher best $R^2$ of 0.982 with a regression tree. We then isolated our testing datasets by state to explore this problem's eligibility for domain adaptation (DA), training on multiple source states and testing on one target state. This Trivial DA (TDA) approach leaves plenty of room for improvement, which we will explore with more complex DA techniques in forthcoming work.
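The evaluation described above (mean absolute error, $R^2$, and training on several source states while testing on one held-out target state) can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the field names `state` and `yield_tons_per_acre`, the helper names, and the toy values are all hypothetical, and a constant mean predictor stands in for the random forest and regression tree models named in the abstract.

```python
from collections import defaultdict


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted yields."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def leave_one_state_out(records):
    """Yield (target_state, train_rows, test_rows), holding out one state at a time."""
    by_state = defaultdict(list)
    for row in records:
        by_state[row["state"]].append(row)
    for target in sorted(by_state):
        # Source domain: every row from every state except the target.
        train = [r for s, rows in by_state.items() if s != target for r in rows]
        yield target, train, by_state[target]


def fit_mean_baseline(train_rows):
    """Stand-in model (assumption): always predicts the mean training yield."""
    mean_yield = sum(r["yield_tons_per_acre"] for r in train_rows) / len(train_rows)
    return lambda rows: [mean_yield for _ in rows]


if __name__ == "__main__":
    # Synthetic placeholder values, not data from the paper.
    toy = [
        {"state": "KY", "yield_tons_per_acre": 1.2},
        {"state": "KY", "yield_tons_per_acre": 1.6},
        {"state": "GA", "yield_tons_per_acre": 0.9},
        {"state": "GA", "yield_tons_per_acre": 1.1},
        {"state": "WI", "yield_tons_per_acre": 2.0},
        {"state": "WI", "yield_tons_per_acre": 1.8},
    ]
    for target, train, test in leave_one_state_out(toy):
        predict = fit_mean_baseline(train)
        y_true = [r["yield_tons_per_acre"] for r in test]
        y_pred = predict(test)
        print(target, mean_absolute_error(y_true, y_pred), r2_score(y_true, y_pred))
```

Grouping the split by state, rather than shuffling rows at random, keeps every observation from the target state out of training, which is what makes the held-out score a measure of transfer across states.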