This paper presents an approach to automatically coding physician-patient communication transcripts in order to improve patient-centered communication (PCC). PCC is a central part of high-quality health care. To improve PCC, dialogues between physicians and patients have been recorded and tagged with predefined codes, a task performed manually by trained human coders. Because manual coding incurs high labor costs and is prone to human error, automatic coding methods should be considered for both efficiency and effectiveness. We adopted three machine learning algorithms (Naïve Bayes, Random Forest, and Support Vector Machine) to categorize the lines of each transcript into their corresponding codes. The results showed evidence that the codes can be distinguished, which is considered sufficient for the training of human annotators.
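To make the line-level classification task concrete, the following is a minimal sketch of one of the three named algorithms, multinomial Naïve Bayes with add-one smoothing, written in plain Python. It is not the authors' implementation: the transcript lines, the code labels (`open_question`, `instruction`), the whitespace tokenizer, and the function names are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Made-up transcript lines with hypothetical code labels (not from the paper).
TRAIN = [
    ("how are you feeling today", "open_question"),
    ("what brings you in today", "open_question"),
    ("can you tell me more about the pain", "open_question"),
    ("take this medication twice a day", "instruction"),
    ("you should rest and drink fluids", "instruction"),
    ("take the tablets after meals", "instruction"),
]


def tokenize(line):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return line.lower().split()


def train_naive_bayes(examples):
    """Count lines per code and word occurrences per code."""
    code_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for line, code in examples:
        code_counts[code] += 1
        for word in tokenize(line):
            word_counts[code][word] += 1
            vocab.add(word)
    return code_counts, word_counts, vocab


def predict(model, line):
    """Return the code with the highest log posterior for one transcript line."""
    code_counts, word_counts, vocab = model
    total_lines = sum(code_counts.values())
    best_code, best_score = None, -math.inf
    for code, n_lines in code_counts.items():
        # Log prior: share of training lines carrying this code.
        score = math.log(n_lines / total_lines)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[code].values()) + len(vocab)
        for word in tokenize(line):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[code][word] + 1) / denom)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


model = train_naive_bayes(TRAIN)
print(predict(model, "what brings you here"))
print(predict(model, "take this medication after meals"))
```

On this toy data the first line is assigned `open_question` and the second `instruction`. A real coding scheme would have many more codes and far more training lines, and the Random Forest and Support Vector Machine classifiers would need a numeric feature representation of each line, which this sketch does not cover.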