Salient object detection is the task of producing a binary mask for an image that indicates which pixels belong to the foreground object and which belong to the background. We introduce VizWiz-SalientObject, a new salient object detection dataset built from images taken by people who are visually impaired and were seeking to better understand their surroundings. Compared to seven existing datasets, VizWiz-SalientObject is the largest (i.e., 32,000 human-annotated images) and has unique characteristics, including a higher prevalence of text in the salient objects (i.e., in 68\% of images) and salient objects that occupy a larger ratio of the images (i.e., on average, $\sim$50\% coverage). We benchmarked seven modern salient object detection methods on our dataset and found that they struggle most with images featuring salient objects that are large, have less complex boundaries, and lack text, as well as with lower-quality images. The dataset is publicly shared at https://vizwiz.org/tasks-and-datasets/salient-object to invite the broader community to work on this new dataset challenge.
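To make the coverage statistic above concrete, the following is a minimal sketch of how the fraction of an image occupied by a salient object could be computed from a binary mask. It assumes the mask is stored as a list of rows with 1 for salient pixels and 0 for background. The function name `coverage_ratio` and the toy mask are hypothetical illustrations, not part of the dataset's tooling.

```python
def coverage_ratio(mask):
    """Return the fraction of pixels marked salient in a binary mask.

    `mask` is assumed to be a list of rows, each a list of 0/1 values
    (1 = salient object, 0 = background).
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    salient = sum(1 for row in mask for px in row if px)
    return salient / total


# Toy 4x4 mask for illustration only: the object fills the two middle columns.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

print(coverage_ratio(toy_mask))  # 8 of 16 pixels are salient, so this prints 0.5
```

Averaging this ratio over every mask in a dataset would give a dataset-level coverage figure comparable in spirit to the $\sim$50\% reported above, though the exact procedure used for that number is not specified here.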