Accurate representation and localization of relevant objects are essential for robots to perform tasks. Building a generic representation that can be used across different environments and tasks is difficult, because the relevant objects vary with the environment and the task. Agro-food environments pose a further challenge because of their complexity and high levels of clutter and occlusion. In this paper, we present a method to build generic representations in highly occluded agro-food environments using multi-view perception and 3D multi-object tracking. Our representation is built upon a detection algorithm that generates a partial point cloud for each detected object. The detected objects are then passed to a 3D multi-object tracking algorithm that creates and updates the representation over time. The whole process runs at a rate of 10 Hz. We evaluated the accuracy of the representation in a real-world agro-food environment, where it successfully represented and located tomatoes on tomato plants despite a high level of occlusion. We estimated the total count of tomatoes with a maximum error of 5.08% and tracked tomatoes with a tracking accuracy of up to 71.47%. Additionally, we showed that an evaluation using tracking metrics gives more insight into the errors in localizing and representing the fruits.
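The detect-then-track loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes each detection arrives as a partial point cloud (a list of 3D points), reduces it to a centroid, and associates centroids to existing tracks with a simple greedy nearest-neighbour rule. The names `GreedyTracker3D`, `max_distance`, and `count_error_percent` are hypothetical and chosen only for illustration, and the gating threshold is an arbitrary assumption.

```python
import math
from dataclasses import dataclass
from itertools import count


def centroid(points):
    """Mean position of one partial point cloud (a list of (x, y, z) tuples)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


@dataclass
class Track:
    track_id: int
    position: tuple  # running-mean centroid of all matched detections
    hits: int = 1    # number of detections merged into this track


class GreedyTracker3D:
    """Hypothetical tracker: greedy nearest-neighbour association in 3D."""

    def __init__(self, max_distance=0.05):
        # Assumed gating threshold (metres); not a value from the paper.
        self.max_distance = max_distance
        self.tracks = []
        self._ids = count()

    def update(self, detections):
        """Merge one frame of detections (each a point cloud) into the tracks."""
        centroids = [centroid(d) for d in detections if d]
        # All track/detection pairs, closest first.
        pairs = sorted(
            (math.dist(t.position, c), ti, ci)
            for ti, t in enumerate(self.tracks)
            for ci, c in enumerate(centroids)
        )
        used_tracks, used_dets = set(), set()
        for dist, ti, ci in pairs:
            if dist > self.max_distance:
                break  # remaining pairs are even farther apart
            if ti in used_tracks or ci in used_dets:
                continue
            used_tracks.add(ti)
            used_dets.add(ci)
            track = self.tracks[ti]
            track.hits += 1
            # Incremental mean keeps the estimate stable across views.
            track.position = tuple(
                p + (c - p) / track.hits
                for p, c in zip(track.position, centroids[ci])
            )
        # Unmatched detections start new tracks (new objects).
        for ci, c in enumerate(centroids):
            if ci not in used_dets:
                self.tracks.append(Track(next(self._ids), c))


def count_error_percent(estimated, ground_truth):
    """Relative counting error in percent."""
    return abs(estimated - ground_truth) / ground_truth * 100.0
```

In this sketch the object count is simply `len(tracker.tracks)` after all viewpoints have been processed. A real system would likely replace the greedy rule with an optimal assignment, add a motion or uncertainty model, and prune tracks that are never re-observed.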