BOXHED 2.0:在生存分析中可缩放的功能数据增强 (BoXHED 2.0: Scalable boosting of functional data in survival analysis) - 专知论文

会员服务 ·

0

泛函 · Boosting（一种模型训练加速方式） · Integration · 数据预处理 · xgboost ·

2021 年 3 月 23 日

BoXHED 2.0: Scalable boosting of functional data in survival analysis

翻译：BOXHED 2.0:在生存分析中可缩放的功能数据增强

Arash Pakbin,Xiaochen Wang,Bobak J. Mortazavi,Donald K. K. Lee

from arxiv, 9 pages, 2 tables, 2 figures

Modern applications of survival analysis increasingly involve time-dependent covariates, which constitute a form of functional data. Learning from functional data generally involves repeated evaluations of time integrals which is numerically expensive. In this work we propose a lightweight data preprocessing step that transforms functional data into nonfunctional data. Boosting implementations for nonfunctional data can then be used, whereby the required numerical integration comes for free as part of the training phase. We use this to develop BoXHED 2.0, a quantum leap over the tree-boosted hazard package BoXHED 1.0. BoXHED 2.0 extends BoXHED 1.0 to Aalen's multiplicative intensity model, which covers censoring schemes far beyond right-censoring and also supports recurrent events data. It is also massively scalable because of preprocessing and also because it borrows from the core components of XGBoost. BoXHED 2.0 supports the use of GPUs and multicore CPUs, and is available from GitHub: www.github.com/BoXHED.

翻译：现代生存分析应用越来越多地涉及基于时间的共变式,它们构成了一种功能性数据的形式。从功能数据学习通常涉及对时间组成部分的反复评价,而时间组成部分的数值昂贵。在这项工作中,我们提议了将功能数据转换为不功能数据的轻量数据预处理步骤。然后可以使用不功能数据的强化实施,从而将所需的数字整合免费作为培训阶段的一部分。我们用它来开发BoXHED 2.0,这是从树起作用的危险包BoxHED 1.0.BoxHED 2.0向Aalen的多倍增强度模型的飞跃。BoxHED 1.0 延伸至Aalen的多倍增强度模型,该模型涵盖审查计划,远远超出右检查范围,并且也支持经常性事件数据。由于预处理以及从XGBoust的核心组成部分借用,它也可以大规模扩展。BoxHED 2.0支持使用GPUs和多核心CPU,并且可从GitHub:www.github.com/BoXHED获得。

0

相关内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

【实用书】掌握Python数据分析，282页pdf，Mastering Python Data Analysis

【实用书】掌握Python数据分析，282页pdf，Mastering Python Data Analysis

专知会员服务

103+阅读 · 2020年4月22日

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

专知会员服务

50+阅读 · 2019年12月10日

【干货】大数据入门指南：Hadoop、Hive、Spark、 Storm等

【干货】大数据入门指南：Hadoop、Hive、Spark、 Storm等

专知会员服务

97+阅读 · 2019年12月4日

【ECML-PKDD 2019】基于bagged-trees学习的可解释生存梯度提升模型（Interpretable survival gradient boosting models with bagged trees base learners）

【ECML-PKDD 2019】基于bagged-trees学习的可解释生存梯度提升模型（Interpretable survival gradient boosting models with bagged trees base learners）

专知会员服务

6+阅读 · 2019年12月1日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

lightgbm algorithm case of kaggle（上）

lightgbm algorithm case of kaggle（上）

R语言中文社区

8+阅读 · 2018年3月20日

RF(随机森林)、GBDT、XGBoost面试级整理

RF(随机森林)、GBDT、XGBoost面试级整理

数据挖掘入门与实战

7+阅读 · 2018年2月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Conformal prediction interval for dynamic time-series

Arxiv

0+阅读 · 2021年5月15日

Privug: Quantifying Leakage using Probabilistic Programming for Privacy Risk Analysis

Arxiv

0+阅读 · 2021年5月15日

SMIM: a unified framework of Survival sensitivity analysis using Multiple Imputation and Martingale

Arxiv

0+阅读 · 2021年5月14日

Memory Reduction using a Ring Abstraction over GPU RDMA for Distributed Quantum Monte Carlo Solver

Arxiv

1+阅读 · 2021年5月13日

Scalable End-to-End RF Classification: A Case Study on Undersized Dataset Regularization by Convolutional-MST

Arxiv

0+阅读 · 2021年5月13日

Fourier Growth of Parity Decision Trees

Arxiv

0+阅读 · 2021年5月13日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Learning to Importance Sample in Primary Sample Space

Learning to Importance Sample in Primary Sample Space

Arxiv

5+阅读 · 2018年8月23日

MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning

Arxiv

4+阅读 · 2018年1月11日

A Big Data Analysis Framework Using Apache Spark and Deep Learning

Arxiv

3+阅读 · 2017年11月25日

VIP会员

文章信息

相关主题

Boosting（一种模型训练加速方式）

数据预处理

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

神经网络序列数据建模，229页ppt，Modeling Sequential Data with Neural Nets

专知会员服务

67+阅读 · 2020年7月25日

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

【实用书】掌握Python数据分析，282页pdf，Mastering Python Data Analysis

【实用书】掌握Python数据分析，282页pdf，Mastering Python Data Analysis

专知会员服务

103+阅读 · 2020年4月22日

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

机器学习与物理科学（Machine learning and the physical sciences），附44页pdf

专知会员服务

50+阅读 · 2019年12月10日

【干货】大数据入门指南：Hadoop、Hive、Spark、 Storm等

【干货】大数据入门指南：Hadoop、Hive、Spark、 Storm等

专知会员服务

97+阅读 · 2019年12月4日

【ECML-PKDD 2019】基于bagged-trees学习的可解释生存梯度提升模型（Interpretable survival gradient boosting models with bagged trees base learners）

【ECML-PKDD 2019】基于bagged-trees学习的可解释生存梯度提升模型（Interpretable survival gradient boosting models with bagged trees base learners）

专知会员服务

6+阅读 · 2019年12月1日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

TensorFlow 2.0 学习资源汇总

TensorFlow 2.0 学习资源汇总

专知会员服务

67+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

人工智能治理的未来

模态感知的特征匹配：单一模态与跨模态技术的全面综述

无监督行人重识别研究综述

【牛津博士论文】面向神经影像应用的可扩展且可解释的空间模型

相关资讯

TorchSeg：基于pytorch的语义分割算法开源了

TorchSeg：基于pytorch的语义分割算法开源了

极市平台

20+阅读 · 2019年1月28日

Ray RLlib: Scalable 降龙十八掌

Ray RLlib: Scalable 降龙十八掌

CreateAMind

9+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

lightgbm algorithm case of kaggle（上）

lightgbm algorithm case of kaggle（上）

R语言中文社区

8+阅读 · 2018年3月20日

RF(随机森林)、GBDT、XGBoost面试级整理

RF(随机森林)、GBDT、XGBoost面试级整理

数据挖掘入门与实战

7+阅读 · 2018年2月6日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】(Python)SVM数据分类

【学习】(Python)SVM数据分类

机器学习研究会

6+阅读 · 2017年10月15日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

相关论文

Conformal prediction interval for dynamic time-series

Arxiv

0+阅读 · 2021年5月15日

Privug: Quantifying Leakage using Probabilistic Programming for Privacy Risk Analysis

Arxiv

0+阅读 · 2021年5月15日

SMIM: a unified framework of Survival sensitivity analysis using Multiple Imputation and Martingale

Arxiv

0+阅读 · 2021年5月14日

Memory Reduction using a Ring Abstraction over GPU RDMA for Distributed Quantum Monte Carlo Solver

Arxiv

1+阅读 · 2021年5月13日

Scalable End-to-End RF Classification: A Case Study on Undersized Dataset Regularization by Convolutional-MST

Arxiv

0+阅读 · 2021年5月13日

Fourier Growth of Parity Decision Trees

Arxiv

0+阅读 · 2021年5月13日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Learning to Importance Sample in Primary Sample Space

Learning to Importance Sample in Primary Sample Space

Arxiv

5+阅读 · 2018年8月23日

MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning

Arxiv

4+阅读 · 2018年1月11日

A Big Data Analysis Framework Using Apache Spark and Deep Learning

Arxiv

3+阅读 · 2017年11月25日

微信扫码咨询专知VIP会员