Research on Modeling Methods for Multi-Label Medical Diagnosis Data

Project title: Research on Modeling Methods for Multi-Label Medical Diagnosis Data

Project number: No.61273305

Project type: General Program

Year of approval: 2013

Discipline: Automation technology, computer technology

Project author: Li Guozheng (李国正)

Affiliation: China Academy of Chinese Medical Sciences (中国中医科学院)

Funding: 820,000 yuan

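Abstract (translated from Chinese): Tackling the challenging task of large-scale data analysis in scientific domains will drive the development of machine learning. Multi-syndrome traditional Chinese medicine (TCM) diagnosis cases are typical multi-label data. Existing multi-label modeling methods fail to account for the characteristics of TCM diagnosis data: the features consist of symptoms from four sources (inspection, listening and smelling, inquiry, and palpation); the labels occur with severely imbalanced frequencies across cases; and the rich body of medical theory is not used effectively in modeling. Starting from the typical application of modeling multi-syndrome TCM diagnosis data, this project plans to study new multi-label modeling methods: first, an ensemble-learning-based method that fuses the symptoms from the four diagnostic sources; second, a method that overcomes label imbalance by embedding specific base classifiers; third, a method that exploits prior knowledge by distilling TCM diagnostic theory into rules and constraints. The new methods will be validated on multi-syndrome TCM diagnosis data for conditions such as hypertension and insomnia, and on public data from other scientific fields. The aim is to improve modeling performance on tasks in this specific medical domain and to provide tools and a reference for data analysis in other scientific fields.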

Chinese keywords (translated): Big data; machine learning; multi-label learning; traditional Chinese medicine; bioinformatics

English abstract: Solving the challenging task of massive scientific data processing will promote the development of machine learning techniques. Medical records with multiple syndromes in traditional Chinese medicine (TCM) are typical multi-label data. Existing multi-label learning methods do not consider the characteristics of TCM diagnosis data: the features come from four kinds of symptom collection (watching, listening, inquiring, and pulse taking); the labels are severely imbalanced; and the rich theories of diagnosis are not utilized in modeling. This project plans to develop novel multi-label learning techniques, starting from the typical application of modeling multi-syndrome medical diagnosis data: the first is to develop multi-label information fusion methods for the four kinds of symptom collection; the second is to devise imbalanced multi-label learning methods that embed specific base learners; the third is to study multi-label learning methods that integrate prior knowledge from medical diagnosis theory. The novel algorithms will be applied to hypertension and insomnia data sets and to other public scientific data sets. This study aims to improve modeling accuracy and to provide tools and a reference for data analysis in other scientific fields.

English keywords: Big Data; Machine Learning; Multi-label Learning; Traditional Chinese Medicine; Bioinformatics
