According to the World Health Organization (WHO), approximately 1.3 billion people live with some form of vision impairment globally, of whom 36 million are blind. Due to their disability, integrating this minority group into society is a challenging problem. The recent rise of smartphones offers a new solution by giving blind users convenient access to the information and services they need to understand the world. Users with vision impairment can rely on the screen reader embedded in mobile operating systems to read out the content of each screen within an app, and use gestures to interact with the phone. However, a prerequisite of using screen readers is that developers add natural-language labels to image-based components when developing the app. Unfortunately, more than 77% of apps have missing-label issues, according to our analysis of 10,408 Android apps. Most of these issues are caused by developers' lack of awareness and knowledge of accessibility for this minority group. Even when developers want to add labels to UI components, they may not come up with concise and clear descriptions, as most of them have no vision impairment themselves. To overcome these challenges, we develop a deep-learning-based model, called LabelDroid, to automatically predict the labels of image-based buttons by learning from large-scale commercial apps in Google Play. The experimental results show that our model makes accurate predictions and that the generated labels are of higher quality than those written by real Android developers.
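To make concrete what "natural-language label" means here: on Android, the screen reader (e.g. TalkBack) announces a view's content description. Below is a minimal Kotlin sketch of how a developer would supply such a label for an image-based button; the button id and label text are hypothetical, chosen only for illustration.

```kotlin
import android.os.Bundle
import android.widget.ImageButton
import androidx.appcompat.app.AppCompatActivity

class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)

        // An image-based button with no visible text: without a label,
        // a screen reader announces only something like "unlabeled button".
        // R.id.btn_search is a hypothetical id for this example.
        val searchButton = findViewById<ImageButton>(R.id.btn_search)

        // The natural-language label the screen reader reads aloud.
        // This is the kind of description LabelDroid aims to predict.
        searchButton.contentDescription = "Search"
    }
}
```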