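XGBoost stands for eXtreme Gradient Boosting. It is a C++ implementation of the Gradient Boosting Machine that automatically uses multiple CPU threads for parallelism and improves accuracy through algorithmic refinements.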
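To make the term "Gradient Boosting Machine" concrete, here is a minimal sketch of the underlying idea in plain Python: each round fits a one-split tree (a stump) to the current residuals and adds a damped copy of it to the ensemble. This is not XGBoost's implementation (no regularisation, no second-order terms, no multithreading). The function names and the toy data are assumptions made for illustration only.

```python
from typing import List, Tuple

# A stump is (feature index, threshold, value if <= threshold, value if > threshold).
Stump = Tuple[int, float, float, float]
# A model is (base prediction, learning rate, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Find the single split that minimises squared error on the residuals."""
    best = None
    best_err = float("inf")
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if err < best_err:
                best_err = err
                best = (j, threshold, lv, rv)
    if best is None:
        # No valid split exists (all rows identical): predict the mean residual.
        mean = sum(residuals) / len(residuals)
        best = (0, float("inf"), mean, mean)
    return best


def predict_stump(stump: Stump, row: List[float]) -> float:
    j, threshold, lv, rv = stump
    return lv if row[j] <= threshold else rv


def fit_gbm(X: List[List[float]], y: List[float],
            n_rounds: int = 50, learning_rate: float = 0.1) -> Model:
    """Gradient boosting for squared loss with stumps as weak learners."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        # For the loss 1/2 * (y - p)^2 the negative gradient is the residual y - p.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbm(model: Model, row: List[float]) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Made-up toy data: a step function of a single feature.
X_toy = [[float(i)] for i in range(10)]
y_toy = [0.0 if i < 5 else 10.0 for i in range(10)]
gbm_model = fit_gbm(X_toy, y_toy, n_rounds=50, learning_rate=0.1)
```

Calling `predict_gbm(gbm_model, [2.0])` returns a value close to 0 and `predict_gbm(gbm_model, [7.0])` a value close to 10. The small learning rate means each round removes only part of the remaining error, which is why many rounds are used.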

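VIP content

This book gives you access to real-world documentation and examples for the Spark platform so that you can build large-scale, enterprise-grade machine learning applications.

Over the past decade, machine learning has made a series of striking advances. These breakthroughs are affecting our daily lives and every industry. Next-Generation Machine Learning with Spark introduces Spark and Spark MLlib, then moves beyond the standard Spark MLlib library to more powerful third-party machine learning algorithms and libraries. By the end of the book, you will be able to apply your knowledge to real-world use cases through numerous practical examples and insightful explanations.

  • Get an introduction to machine learning, Spark, and Spark MLlib 2.4.x
  • Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries
  • Detect anomalies with the Isolation Forest algorithm for Spark
  • Use the Spark NLP and Stanford CoreNLP libraries, which support multiple languages
  • Optimize ML workloads with the Alluxio in-memory data accelerator for Spark
  • Perform graph analysis with GraphX and GraphFrames
  • Perform image recognition with convolutional neural networks
  • Use the Keras framework and distributed deep learning libraries with Spark

Who this book is for

Data scientists and machine learning engineers who want to take their knowledge to the next level by using Spark and more powerful next-generation algorithms and libraries than those available in the standard Spark MLlib library. It is also a primer for aspiring data scientists and engineers who need an introduction to machine learning, Spark, and Spark MLlib.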

Latest content

The new coronavirus disease (known as COVID-19) was first identified in Wuhan and quickly spread worldwide, wreaking havoc on the economy and people's everyday lives. Fever, cough, sore throat, headache, exhaustion, muscular aches, and difficulty breathing are all typical symptoms of COVID-19. A reliable detection technique is needed to identify affected individuals, care for them in the early stages of COVID-19, and reduce the virus's transmission. The most accessible method for COVID-19 identification is RT-PCR; however, because it is time-consuming and prone to false-negative results, alternative options must be sought. Indeed, compared to RT-PCR, chest CT scans and chest X-ray images provide superior results. Because of the scarcity and high cost of CT scan equipment, X-ray images are preferable for screening. In this paper, a pre-trained network, DenseNet169, was employed to extract features from X-ray images. Features were then chosen by a feature selection method (ANOVA) to reduce computation and time complexity while overcoming the curse of dimensionality to improve predictive accuracy. Finally, the selected features were classified by XGBoost. The ChestX-ray8 dataset was employed to train and evaluate the proposed method. This method reached 98.72% accuracy for two-class classification (COVID-19, healthy) and 92% accuracy for three-class classification (COVID-19, healthy, pneumonia).
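The ANOVA feature selection step mentioned in the abstract can be sketched as follows. This is not the authors' code: it is a plain-Python illustration that scores each feature with the one-way ANOVA F-statistic and keeps the highest-scoring ones. The function names and the tiny feature matrix are made up for illustration. In the paper's pipeline the inputs would be DenseNet169 features and the selected columns would be passed to XGBoost; in practice a library routine such as scikit-learn's `SelectKBest` with `f_classif` would normally be used instead.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def anova_f_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """One-way ANOVA F-statistic of a single feature across the class labels."""
    groups: Dict[int, List[float]] = defaultdict(list)
    for v, lab in zip(values, labels):
        groups[lab].append(v)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(values) / n
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0.0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_k(X: List[List[float]], y: Sequence[int], k: int) -> List[int]:
    """Return the column indices of the k features with the highest F-score."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Made-up toy features: column 0 separates the two classes, columns 1 and 2 do not.
X_feat = [
    [1.0, 5.0, 0.3],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.2],
    [3.0, 4.0, 0.2],
    [3.1, 5.0, 0.3],
    [2.9, 3.0, 0.1],
]
y_feat = [0, 0, 0, 1, 1, 1]
selected = select_top_k(X_feat, y_feat, k=1)
# Keep only the selected columns; these would be the inputs to the classifier.
X_selected = [[row[j] for j in selected] for row in X_feat]
```

A feature whose class means differ strongly relative to its within-class spread gets a high F-score, so dropping low-scoring features shrinks the input to the classifier while keeping the columns most associated with the label.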
