WebUI: A Dataset for Enhancing Visual UI Understanding with Web Semantics - 专知 (Zhuanzhi) Papers

January 30, 2023

WebUI: A Dataset for Enhancing Visual UI Understanding with Web Semantics

Jason Wu,Siyan Wang,Siman Shen,Yi-Hao Peng,Jeffrey Nichols,Jeffrey P. Bigham

From arXiv. Accepted to CHI 2023. Dataset, code, and models release coming soon.

Modeling user interfaces (UIs) from visual information allows systems to make inferences about the functionality and semantics needed to support use cases in accessibility, app automation, and testing. Current datasets for training machine learning models are limited in size due to the costly and time-consuming process of manually collecting and annotating UIs. We crawled the web to construct WebUI, a large dataset of 400,000 rendered web pages associated with automatically extracted metadata. We analyze the composition of WebUI and show that while automatically extracted data is noisy, most examples meet basic criteria for visual UI modeling. We applied several strategies for incorporating semantics found in web pages to increase the performance of visual UI understanding models in the mobile domain, where less labeled data is available: (i) element detection, (ii) screen classification and (iii) screen similarity.
