Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild. Despite their ubiquity, these real-world distribution shifts are under-represented in the datasets widely used in the ML community today. To address this gap, we present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts which naturally arise in real-world applications, such as shifts across hospitals for tumor identification; across camera traps for wildlife monitoring; and across time and location in satellite imaging and poverty mapping. On each dataset, we show that standard training results in substantially lower out-of-distribution than in-distribution performance, and that this gap remains even with models trained by existing methods for handling distribution shifts. This underscores the need for new training methods that produce models which are more robust to the types of distribution shifts that arise in practice. To facilitate method development, we provide an open-source package that automates dataset loading, contains default model architectures and hyperparameters, and standardizes evaluations. Code and leaderboards are available at https://wilds.stanford.edu.
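To illustrate the open-source package, the following is a minimal sketch of loading a dataset and iterating over its training split, following the interface documented in the WILDS repository; the dataset identifier, transform, and batch size here are illustrative and may differ across package versions.

```python
# Sketch of the WILDS package interface; names follow the repository's
# documented API, but details may vary by version.
import torchvision.transforms as transforms
from wilds import get_dataset
from wilds.common.data_loaders import get_train_loader

# Download (if needed) and load Camelyon17, where the distribution
# shift is across hospitals for tumor identification.
dataset = get_dataset(dataset="camelyon17", download=True)

# Get the in-distribution training split with simple preprocessing.
train_data = dataset.get_subset(
    "train",
    transform=transforms.Compose([transforms.Resize((96, 96)),
                                  transforms.ToTensor()]),
)

# Standard (i.i.d.) data loader over the training split.
train_loader = get_train_loader("standard", train_data, batch_size=32)

for x, y_true, metadata in train_loader:
    # metadata encodes each example's domain (e.g., hospital),
    # which methods for handling distribution shifts can exploit.
    ...  # training step goes here
```

The package similarly provides evaluation loaders and a standardized `dataset.eval(...)` routine for scoring predictions on the out-of-distribution test splits, so that results are comparable across methods.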