试验2Vec:利用自译自审的零热临床试验文件相似性搜索 (Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision) - 专知论文

会员服务 ·

0

试验 · 相似度 · 知识 (knowledge) · Learning · INFORMS ·

2022 年 6 月 29 日

Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision

翻译：试验2Vec:利用自译自审的零热临床试验文件相似性搜索

Zifeng Wang,Jimeng Sun

Clinical trials are essential for drug development but are extremely expensive and time-consuming to conduct. It is beneficial to study similar historical trials when designing a clinical trial. However, lengthy trial documents and lack of labeled data make trial similarity search difficult. We propose a zero-shot clinical trial retrieval method, Trial2Vec, which learns through self-supervision without annotating similar clinical trials. Specifically, the meta-structure of trial documents (e.g., title, eligibility criteria, target disease) along with clinical knowledge (e.g., UMLS knowledge base https://www.nlm.nih.gov/research/umls/index.html) are leveraged to automatically generate contrastive samples. Besides, Trial2Vec encodes trial documents considering meta-structure thus producing compact embeddings aggregating multi-aspect information from the whole document. We show that our method yields medically interpretable embeddings by visualization and it gets a 15% average improvement over the best baselines on precision/recall for trial retrieval, which is evaluated on our labeled 1600 trial pairs. In addition, we prove the pre-trained embeddings benefit the downstream trial outcome prediction task over 240k trials.

翻译：临床试验对药物发展至关重要,但极其昂贵和耗时,在设计临床试验时研究类似的历史试验是有益的。然而,冗长的试验文件和缺乏贴标签的数据使得试验难以进行类似的研究。我们建议采用零光临床试验检索方法,即Treato2Vec,该方法通过自我监督的观察学习,而不作类似的临床试验。具体地说,试验文件的元结构(例如标题、资格标准、目标疾病)与临床知识(例如UMLS知识库https://www.nlm.nih.gov/research/umls/index.html)一起,都被用来自动生成对比样本。此外,我们用Treato2Vec编码试验文件,考虑元结构,从而产生集集集整个文件的多层信息的紧凑。我们证明,我们的方法通过直观化产生医学上可解释的嵌入,比我们标记的1600个试验配对试验的精准/召回试验的最佳基线平均改进15%。此外,我们证明前的试判结果预测是下游试验。

0

相关内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

树突膜特异性表达的syndecan-2通过调节actin细胞骨架聚集参与神经病理性痛的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

超高交联聚苯胺的合成、结构及其对重金属离子和溶解性有机物的共吸附机理

国家自然科学基金

0+阅读 · 2015年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

TIPE2诱导自噬的信号通路及其对巨噬细胞功能的调控

国家自然科学基金

0+阅读 · 2014年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

复杂工况下柔性加工车间的碳足迹规划与调度

国家自然科学基金

0+阅读 · 2013年12月31日

分子水平研究放射性Cs(I)、Sr(II)、Am(III)在高岭石/水界面的吸附形态

国家自然科学基金

0+阅读 · 2012年12月31日

Plk2与ArgBP2相互作用在骨肉瘤细胞迁移中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

Toll样受体4信号通路在角膜新生血管形成机制中的作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

Arxiv

0+阅读 · 2022年8月22日

ARID: A New Dataset for Recognizing Action in the Dark

Arxiv

0+阅读 · 2022年8月19日

A Snowballing Literature Study on Test Amplification

Arxiv

0+阅读 · 2022年8月17日

Class-Aware Visual Prompt Tuning for Vision-Language Pre-Trained Model

Class-Aware Visual Prompt Tuning for Vision-Language Pre-Trained Model

Arxiv

0+阅读 · 2022年8月17日

A Monotonicity Constrained Attention Module for Emotion Classification with Limited EEG Data

Arxiv

0+阅读 · 2022年8月17日

Field-wise Embedding Size Search via Structural Hard Auxiliary Mask Pruning for Click-Through Rate Prediction

Arxiv

0+阅读 · 2022年8月17日

DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc

Arxiv

0+阅读 · 2022年8月17日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

36+阅读 · 2019年6月6日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

Deep Research（深度研究）：系统性综述

《革新战术战场空间能力：反无人机系统》报告

【普林斯顿博士论文】用于语音的生成式通用模型

螺旋式开发作为战略资产：美军启示

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

Pre-training Tasks for User Intent Detection and Embedding Retrieval in E-commerce Search

Arxiv

0+阅读 · 2022年8月22日

ARID: A New Dataset for Recognizing Action in the Dark

Arxiv

0+阅读 · 2022年8月19日

A Snowballing Literature Study on Test Amplification

Arxiv

0+阅读 · 2022年8月17日

Class-Aware Visual Prompt Tuning for Vision-Language Pre-Trained Model

Class-Aware Visual Prompt Tuning for Vision-Language Pre-Trained Model

Arxiv

0+阅读 · 2022年8月17日

A Monotonicity Constrained Attention Module for Emotion Classification with Limited EEG Data

Arxiv

0+阅读 · 2022年8月17日

Field-wise Embedding Size Search via Structural Hard Auxiliary Mask Pruning for Click-Through Rate Prediction

Arxiv

0+阅读 · 2022年8月17日

DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc

Arxiv

0+阅读 · 2022年8月17日

PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval

Arxiv

11+阅读 · 2020年10月20日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

36+阅读 · 2019年6月6日

Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition

Arxiv

10+阅读 · 2018年4月28日

相关基金

树突膜特异性表达的syndecan-2通过调节actin细胞骨架聚集参与神经病理性痛的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

超高交联聚苯胺的合成、结构及其对重金属离子和溶解性有机物的共吸附机理

国家自然科学基金

0+阅读 · 2015年12月31日

套子代数的Hochschild上同调及套的分类

国家自然科学基金

3+阅读 · 2014年12月31日

TIPE2诱导自噬的信号通路及其对巨噬细胞功能的调控

国家自然科学基金

0+阅读 · 2014年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

复杂工况下柔性加工车间的碳足迹规划与调度

国家自然科学基金

0+阅读 · 2013年12月31日

分子水平研究放射性Cs(I)、Sr(II)、Am(III)在高岭石/水界面的吸附形态

国家自然科学基金

0+阅读 · 2012年12月31日

Plk2与ArgBP2相互作用在骨肉瘤细胞迁移中的作用及其机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

实时安全关键系统的建模、仿真与验证

国家自然科学基金

1+阅读 · 2012年12月31日

Toll样受体4信号通路在角膜新生血管形成机制中的作用研究

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员