In agricultural environments, viewpoint planning can be a critical capability for a robot with visual sensors, allowing it to obtain informative observations of objects of interest (e.g., fruits) within complex plant structures subject to random occlusions. Although recent studies on active vision have shown some potential for agricultural tasks, each model has been designed and validated in its own unique environment, which cannot easily be replicated to benchmark novel methods developed later. In this paper, we therefore introduce a dataset to support more extensive research on Domain-inspired Active VISion in Agriculture (DAVIS-Ag). Specifically, we used our open-source "AgML" framework and the 3D plant simulator "Helios" to produce 502K RGB images from 30K dense spatial locations in 632 realistically synthesized orchards of strawberries, tomatoes, and grapes. In addition, useful labels are provided for each image, including (1) bounding boxes and (2) pixel-wise instance segmentations for all identifiable fruits, as well as (3) pointers to the other images that are reachable by executing an action, so that active viewpoint selection can be simulated at each time step. Using DAVIS-Ag, we present motivating examples in which fruit detection performance for the same plant varies significantly with the position and orientation of the camera view, primarily due to occlusions by other plant components such as leaves. Furthermore, we develop several baseline models to showcase the use of the data on one agricultural active vision task, fruit search optimization, and provide evaluation results against which future studies can benchmark their methods. To encourage relevant research, our dataset is released online and freely available at: https://github.com/ctyeong/DAVIS-Ag