One of the critical biotic stress factors that paddy farmers face is disease caused by bacteria, fungi, and other organisms. These diseases severely affect plant health and lead to significant crop loss. Most of them can be identified by regularly observing the leaves and stems under expert supervision. In a country with vast agricultural regions and few crop protection experts, manual identification of paddy diseases is challenging. To address this problem, it is necessary to automate the disease identification process and provide easily accessible decision support tools that enable effective crop protection measures. However, the lack of public datasets with detailed disease information limits the practical implementation of accurate disease detection systems. This paper presents \emph{Paddy Doctor}, a visual image dataset for identifying paddy diseases. Our dataset contains 16,225 annotated paddy leaf images across 13 classes (12 diseases and normal leaf). We benchmarked the \emph{Paddy Doctor} dataset using a Convolutional Neural Network (CNN) and four transfer-learning-based models (VGG16, MobileNet, Xception, and ResNet34). The experimental results showed that ResNet34 achieved the highest F1-score of 97.50%. We release our dataset and reproducible code as open source for community use.
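Since the benchmark is reported as an F1-score over 13 classes, a small sketch of how a multi-class F1-score can be computed may help readers interpret the number. The abstract does not say whether the reported score is macro-averaged, weighted, or micro-averaged; the sketch below assumes macro averaging. The function names `per_class_f1` and `macro_f1` and the labels `normal`, `disease_a`, and `disease_b` are hypothetical and not taken from the dataset.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its one-vs-rest F1-score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging mode, see lead-in)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Toy example with hypothetical labels (not the dataset's class names).
y_true = ["normal", "normal", "disease_a", "disease_a", "disease_b", "disease_b"]
y_pred = ["normal", "disease_a", "disease_a", "disease_a", "disease_b", "normal"]

print(per_class_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro averaging gives every class equal weight regardless of how many images it has, which matters if the 13 classes are imbalanced; a weighted average would instead scale each class's score by its number of true samples.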