DI2: prior-free and multi-item discretization of biomedical data and its applications - 专知论文


Discretization · Statistics · Robustness · MINE · Bioinformatics

March 7, 2021


Leonardo Alexandre,Rafael S. Costa,Rui Henriques

Motivation: A considerable number of data mining approaches for biomedical data analysis, including state-of-the-art associative models, require a form of data discretization. Although diverse discretization approaches have been proposed, they generally work under a strict set of statistical assumptions which are arguably insufficient to handle the diversity and heterogeneity of clinical and molecular variables within a given dataset. In addition, although an increasing number of symbolic approaches in bioinformatics are able to assign multiple items to values occurring near discretization boundaries for superior robustness, there are no reference principles on how to perform multi-item discretizations.

Results: In this study, an unsupervised discretization method, DI2, for variables with arbitrarily skewed distributions is proposed. DI2 provides robust guarantees of generalization by placing data corrections using the Kolmogorov-Smirnov test before statistically fitting distribution candidates. DI2 further supports multi-item assignments. Results gathered from biomedical data show its relevance to improve classic discretization choices.

Software: available at https://github.com/JupitersMight/DI2
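To make the ideas in the abstract concrete, the following is a minimal sketch of distribution-based discretization with multi-item assignment. It is not the DI2 implementation or its API: it assumes a single Normal candidate (DI2 fits several candidate distributions and also applies data corrections, which are omitted here), it uses only the Python standard library, and the function names `ks_statistic`, `cut_points`, and `assign_items` as well as the `border` parameter are hypothetical, chosen for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist


def ks_statistic(values, dist):
    """One-sample Kolmogorov-Smirnov statistic of `values` against `dist.cdf`."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        c = dist.cdf(x)
        # Largest gap between the empirical CDF (just before and at x) and the fitted CDF.
        d = max(d, (i + 1) / n - c, c - i / n)
    return d


def cut_points(dist, n_bins):
    """Boundaries that split `dist` into `n_bins` equal-probability intervals."""
    return [dist.inv_cdf(k / n_bins) for k in range(1, n_bins)]


def assign_items(x, cuts, border=0.0):
    """Return the bin of x, plus a neighbouring bin when x lies within `border` of a cut."""
    item = bisect_right(cuts, x)
    items = [item]
    if item > 0 and x - cuts[item - 1] < border:
        items.append(item - 1)  # close to the lower boundary
    if item < len(cuts) and cuts[item] - x < border:
        items.append(item + 1)  # close to the upper boundary
    return sorted(items)


# Illustrative data (an assumption): 200 evenly spaced quantiles of N(10, 2).
values = [NormalDist(10, 2).inv_cdf((i + 0.5) / 200) for i in range(200)]

fitted = NormalDist.from_samples(values)  # the single candidate distribution
fit_quality = ks_statistic(values, fitted)  # small value means a good fit
cuts = cut_points(fitted, n_bins=3)  # two boundaries, three items

print(fit_quality)
print(cuts)
print(assign_items(cuts[0] + 0.01, cuts, border=0.1))  # value near a boundary
```

In this sketch the Kolmogorov-Smirnov statistic only scores how well the candidate fits; a fuller treatment would compare several candidates and pick the best-fitting one before deriving the cut points. The `border` width is given in the units of the variable, which is a design choice of the sketch rather than something stated in the abstract.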

