A Bayesian Two-part Hurdle Quantile Regression Model for Citation Analysis

Quantile regression is a technique for analysing the effects of a set of independent variables on the entire distribution of a continuous response variable. It presents a complete picture of the effects on the location, scale, and shape of the dependent variable at all points, not just at the mean. This research focuses on two challenges for the analysis of citation counts by quantile regression: discontinuity and substantial mass points at lower counts, such as zero, one, two, and three. King and Song (2019) proposed a Bayesian two-part hurdle quantile regression model as a suitable candidate for modeling count data with a substantial mass point at zero. Their model allows the zeros and non-zeros to be modeled independently but simultaneously: it uses quantile regression for the nonzero data and logistic regression for the probability of zeros versus nonzeros. Nevertheless, the current paper shows that, for citation counts, the substantial mass points at one, two, and three will almost certainly also affect the estimation of parameters in the quantile regression part of the model, in a similar manner to the mass point at zero. We update the King and Song model by shifting the hurdle point from zero to three, past the main mass points. The new model delivers more accurate quantile regression for moderately to highly cited articles, and enables estimates of the extent to which factors influence the chances that an article will receive few citations. To illustrate the advantage and potential of this method, it is applied separately to simulated citation counts and to seven Scopus fields, with collaboration, title length, and journal internationality as independent variables.
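The two-part structure described in the abstract can be sketched in a few lines of Python. This is not the authors' Bayesian implementation: it is a simplified, non-Bayesian illustration that assumes a single binary covariate (a hypothetical `group` flag, e.g. 1 for a collaborative article), made-up citation counts, and a hurdle rule in which counts of 0 to 3 fall in the low-cited part and counts above 3 go to the quantile part. With one 0/1 covariate the design is saturated, so the logistic fit reduces to per-group log-odds and the quantile regression fit reduces to per-group sample quantiles, which avoids the need for an optimiser or sampler.

```python
import math

HURDLE = 3  # hurdle point from the abstract; "above 3" vs "0 to 3" is an assumption


def check_loss(u, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sample_quantile(values, tau):
    """Return an observed value that minimises the summed check loss."""
    return min(sorted(values),
               key=lambda q: sum(check_loss(v - q, tau) for v in values))


def fit_two_part(counts, group, tau, hurdle=HURDLE):
    """Fit both parts of the hurdle model for one binary covariate.

    Part 1 (logistic): log-odds that a count exceeds the hurdle.
    Part 2 (quantile): tau-quantile of the counts above the hurdle.
    Each group must contain counts on both sides of the hurdle.
    """
    per_group = {}
    for g in (0, 1):
        ys = [c for c, gi in zip(counts, group) if gi == g]
        above = [c for c in ys if c > hurdle]
        p_above = len(above) / len(ys)
        per_group[g] = (math.log(p_above / (1.0 - p_above)),
                        sample_quantile(above, tau))
    # Report as regression coefficients: intercept = group 0, slope = group 1 - group 0.
    return {
        "logit_intercept": per_group[0][0],
        "logit_slope": per_group[1][0] - per_group[0][0],
        "quantile_intercept": per_group[0][1],
        "quantile_slope": per_group[1][1] - per_group[0][1],
    }


# Made-up citation counts for illustration only.
counts = [0, 1, 1, 2, 3, 5, 7, 10, 0, 4, 6, 9, 12, 20]
group = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
fit = fit_two_part(counts, group, tau=0.5)
print(fit)
```

In this sketch a positive `logit_slope` means the covariate raises the odds of clearing the hurdle, while `quantile_slope` measures its effect on the chosen quantile among articles that did clear it. The Bayesian model in the paper estimates both parts jointly and supports several covariates; a fuller version would replace the per-group shortcuts with a proper logistic and quantile regression fit.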

