Violence risk assessment in psychiatric institutions enables interventions that prevent violent incidents. Clinical notes written by practitioners and stored in electronic health records (EHRs) are a valuable resource that is seldom used to its full potential. Previous studies have assessed violence risk in psychiatric patients using such notes, with acceptable performance. However, they do not explain why the classification works or how it can be improved. We explore two methods for better understanding the quality of a classifier in the context of clinical note analysis: random forests using topic models, and the choice of evaluation metric. These methods give us a deeper understanding of both our data and our methodology, laying the groundwork for improved models that build on this understanding. This is particularly important for the generalizability of evaluated classifiers to new data, a trustworthiness problem of great interest given the increasing availability of data in electronic format.