A common task in many disciplines is assigning each item in a set to one of several categories or classes with known labels. This is often done by one or more expert raters, or sometimes by an automated process. If these assignments, or 'ratings', are difficult to make, a common tactic is to have them repeated by different raters, or even by the same rater on several occasions. We present an R package, rater, available on CRAN, that implements Bayesian versions of several statistical models for analysing repeated categorical rating data. The models allow inference about the true underlying (latent) class of each item, as well as the accuracy of each rater. The models are based on, and include, the Dawid-Skene model. We use the Stan probabilistic programming language as the main computational engine. We illustrate the usage of rater through a few examples. We also discuss in detail the techniques of marginalisation and conditioning, which are necessary for these models but also apply more generally to other models implemented in Stan.
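As a rough illustration of the marginalisation and conditioning mentioned above, the following Python sketch works through a Dawid-Skene-style model by hand. It is not the rater package's code or its Stan implementation. The function names, the data layout and the toy parameter values are all assumptions made for this example. Marginalisation sums the latent class of each item out of the likelihood, which is needed because Stan's samplers do not handle discrete parameters. Conditioning then recovers the probability of each latent class given the ratings and the parameters.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def item_log_joint(ratings, pi, theta):
    """Return log p(z = k, ratings | pi, theta) for every class k.

    ratings: list of (rater_index, observed_category) pairs for one item
    pi:      pi[k] is the prevalence of true class k
    theta:   theta[j][k][c] is P(rater j reports c | true class is k)
    (Hypothetical layout chosen for this sketch.)
    """
    return [
        math.log(pi[k]) + sum(math.log(theta[j][k][c]) for j, c in ratings)
        for k in range(len(pi))
    ]


def marginal_log_likelihood(items, pi, theta):
    """Marginalisation: sum the latent class out of each item's likelihood."""
    return sum(log_sum_exp(item_log_joint(r, pi, theta)) for r in items)


def class_posterior(ratings, pi, theta):
    """Conditioning: P(z = k | ratings, pi, theta) for one item."""
    log_joint = item_log_joint(ratings, pi, theta)
    log_norm = log_sum_exp(log_joint)
    return [math.exp(v - log_norm) for v in log_joint]


# Toy example (made-up values): two classes, two raters with the same
# error matrix, and one item that both raters assigned to category 0.
pi = [0.5, 0.5]
rater_matrix = [[0.9, 0.1], [0.2, 0.8]]
theta = [rater_matrix, rater_matrix]
items = [[(0, 0), (1, 0)]]

log_lik = marginal_log_likelihood(items, pi, theta)
posterior = class_posterior(items[0], pi, theta)
print(log_lik, posterior)
```

In this toy case the joint probabilities are 0.5 × 0.9 × 0.9 = 0.405 for class 0 and 0.5 × 0.2 × 0.2 = 0.02 for class 1, so the marginal likelihood of the item is 0.425 and the conditional probability of class 0 is 0.405 / 0.425. In a full Bayesian analysis these quantities would be evaluated at each posterior draw of the parameters rather than at fixed values.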