The lack of non-parametric statistical tests for confounding bias significantly hampers the development of robust, valid and generalizable predictive models in many fields of research. Here I propose the partial and full confounder tests, which, for a given confounder variable, probe the null hypotheses of unconfounded and fully confounded models, respectively. The tests provide strict control of type I errors and high statistical power, even for the non-normally and non-linearly dependent predictions often seen in machine learning. Applying the proposed tests to models trained on functional brain connectivity data from the Human Connectome Project and the Autism Brain Imaging Data Exchange dataset reveals confounders that were previously unreported or found to be hard to correct for with state-of-the-art confound mitigation approaches. The tests, implemented in the package mlconfound (https://mlconfound.readthedocs.io), can aid the assessment and improvement of the generalizability and neurobiological validity of predictive models and, thereby, foster the development of clinically useful machine learning biomarkers.
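To illustrate the idea behind the partial confounder test, the following is a minimal, simplified sketch, not the method of the paper: the partial test probes H0: ŷ ⊥ c | y (an unconfounded model). This toy version residualizes both the prediction ŷ and the confounder c on y with ordinary least squares and runs a plain permutation test on the residual correlation; the actual mlconfound package instead uses a conditional-permutation scheme that also handles non-linear and non-normal dependencies. All function names below are hypothetical, introduced only for this sketch.

```python
# Toy sketch of the partial confounder test idea (NOT the mlconfound method):
# H0 (unconfounded model): yhat is independent of the confounder c, given y.
# Linear residualization + naive permutation test; for real analyses use the
# conditional-permutation tests in the mlconfound package.
import random

def _residualize(x, y):
    # Residuals of regressing x on y (one-predictor ordinary least squares).
    n = len(x)
    my, mx = sum(y) / n, sum(x) / n
    cov = sum((yi - my) * (xi - mx) for xi, yi in zip(x, y))
    var = sum((yi - my) ** 2 for yi in y)
    b = cov / var if var else 0.0
    a = mx - b * my
    return [xi - (a + b * yi) for xi, yi in zip(x, y)]

def _corr(u, v):
    # Pearson correlation of two equal-length sequences.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0

def partial_confound_test_sketch(y, yhat, c, n_perm=1000, seed=0):
    """Permutation p-value for H0: yhat independent of c given y (linear toy)."""
    r_yhat = _residualize(yhat, y)  # yhat with the linear effect of y removed
    r_c = _residualize(c, y)        # c with the linear effect of y removed
    observed = abs(_corr(r_yhat, r_c))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = r_c[:]
        rng.shuffle(perm)  # break any remaining yhat–c association under H0
        if abs(_corr(r_yhat, perm)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # permutation p-value with +1 correction

# Synthetic example: a confounded model (yhat tracks c beyond what y explains)
# should yield a near-zero p-value; an unconfounded model, a clearly larger one.
rng = random.Random(42)
y = [rng.gauss(0, 1) for _ in range(200)]
c = [0.5 * yi + rng.gauss(0, 1) for yi in y]
yhat_conf = [0.5 * yi + 0.8 * ci + rng.gauss(0, 0.3) for yi, ci in zip(y, c)]
yhat_clean = [0.9 * yi + rng.gauss(0, 0.3) for yi in y]
print(partial_confound_test_sketch(y, yhat_conf, c))   # small p: H0 rejected
print(partial_confound_test_sketch(y, yhat_clean, c))  # larger p: H0 retained
```

The mlconfound package exposes analogous functionality with proper conditional permutations, which is what gives the tests their validity under non-linear and non-normal dependence.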