In vivo toxicological studies are characterized by multiple primary endpoints measured on quite different scales. Whereas guidelines and publications provide various statistical tests for normally distributed endpoints (such as organ weights) and proportions (such as tumor rates), few approaches are available for graded histopathological findings, such as 0, +, ++, +++. This gap in statistical practice is at odds with the fact that graded findings sometimes have high predictive value for potential toxic effects. Here we compare different methods, particularly with regard to (i) designs with very small sample sizes and (ii) interpretability by toxicologists. We recommend a new approach in which a simultaneous test is performed over all class combinations of score levels, such as (0, +) vs. (++, +++). Corresponding R code is provided by way of a data example.
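To make the idea of testing over class combinations concrete, the following is a minimal sketch, not the authors' implementation (their code is in R). It assumes that "class combinations" means every cut point of the ordered scale, so that the four grades yield three low-vs-high dichotomies. As a simple stand-in for a proper simultaneous test, it applies a one-sided Fisher exact test to each collapsed 2x2 table and a conservative Bonferroni adjustment. The function names and the counts in the usage note are hypothetical and chosen only for illustration.

```python
from math import comb

LEVELS = ["0", "+", "++", "+++"]  # ordered severity grades


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a, b = treated animals in the high / low class
    c, d = control animals in the high / low class
    Returns P(X >= a) under the hypergeometric null (increase in treated).
    """
    n_treated = a + b
    n_high = a + c
    n_total = a + b + c + d
    upper = min(n_treated, n_high)
    tail = sum(
        comb(n_high, k) * comb(n_total - n_high, n_treated - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_treated)


def collapsed_tests(control, treated, levels=LEVELS):
    """Test every dichotomy of the ordered grades (assumed reading of
    'all class combinations'), with Bonferroni-adjusted p-values."""
    n_cuts = len(levels) - 1
    results = []
    for cut in range(1, len(levels)):
        a, b = sum(treated[cut:]), sum(treated[:cut])
        c, d = sum(control[cut:]), sum(control[:cut])
        p_raw = fisher_one_sided(a, b, c, d)
        results.append({
            "low": levels[:cut],
            "high": levels[cut:],
            "p_raw": p_raw,
            "p_adj": min(1.0, p_raw * n_cuts),  # Bonferroni, conservative
        })
    return results
```

For example, with hypothetical counts per grade `control = [5, 0, 0, 0]` and `treated = [0, 1, 2, 2]`, calling `collapsed_tests(control, treated)` returns one entry per dichotomy, so a toxicologist can see which grouping of grades, such as (0, +) vs. (++, +++), drives the effect. Bonferroni ignores the strong correlation between the three collapsed tests; a multiple contrast procedure that accounts for it would be less conservative.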