The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite increased access to earth observation data for agriculture, curated, labelled datasets remain scarce, which limits the use of such data in training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset, SICKLE, containing time-series images at different spatial resolutions from 3 different satellites, annotated with multiple key cropping parameters for paddy cultivation in the Cauvery Delta region of Tamil Nadu, India. The dataset comprises 2,398 season-wise samples from 388 unique plots distributed across 4 districts of the Delta. It covers multi-spectral, thermal and microwave data from January 2018 to March 2021. The paddy samples are annotated with 4 key cropping parameters: sowing date, transplanting date, harvesting date and crop yield. This is one of the first studies to consider the growing season (using sowing and harvesting dates) as part of a dataset. We also propose a yield prediction strategy that uses time-series data generated from the observed growing season and from the standard seasonal information obtained from Tamil Nadu Agricultural University for the region. The resulting performance improvement highlights the impact of ML techniques that leverage domain knowledge consistent with the standard practices followed by farmers in a specific region. We benchmark the dataset on 3 separate tasks, namely crop type, phenology date (sowing, transplanting, harvesting) and yield prediction, and develop an end-to-end framework for predicting key crop parameters in a real-world setting.
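As an illustration of restricting a time series to a growing season, the following is a minimal sketch and not the SICKLE implementation. The function name `clip_to_season`, the observation values, and all dates are hypothetical; the sketch only assumes that each sample carries dated observations plus either annotated sowing and harvesting dates or a regional standard season window.

```python
from datetime import date

def clip_to_season(observations, start, end):
    """Keep only (acquisition_date, value) pairs with start <= date <= end."""
    return [(d, v) for d, v in observations if start <= d <= end]

# Hypothetical per-plot time series: (acquisition date, band or index value).
observations = [
    (date(2019, 9, 1), 0.21),
    (date(2019, 10, 1), 0.35),
    (date(2019, 11, 1), 0.62),
    (date(2019, 12, 1), 0.71),
    (date(2020, 1, 1), 0.48),
    (date(2020, 2, 1), 0.25),
]

# Hypothetical annotated sowing and harvesting dates for one sample.
sowing, harvesting = date(2019, 9, 20), date(2020, 1, 15)
observed = clip_to_season(observations, sowing, harvesting)

# Hypothetical regional standard season window (illustrative dates only).
standard = clip_to_season(observations, date(2019, 9, 1), date(2020, 1, 31))
```

Either clipped series could then be passed to a yield model in place of the full series; which window works better is an empirical question that this sketch does not answer.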