Pharmaceutical researchers are continually searching for techniques to improve both drug development processes and patient outcomes. An area of recent interest is the potential of machine learning applications within pharmacology. One such application that has not yet been studied closely is the unsupervised clustering of plasma concentration-time curves, hereafter pharmacokinetic (PK) curves. This can be done by treating a PK curve as a time series object and drawing on the extensive body of research on clustering time series data. In this paper, we introduce hierarchical clustering in the context of PK curves and find it to be effective at identifying similarly shaped PK curves, and its dendrogram visualization to be informative for understanding patterns among PK curves. We also examine many dissimilarity measures between time series objects and identify Euclidean distance as generally the most appropriate for clustering PK curves. We further show that dynamic time warping, Fréchet distance, and structure-based measures of dissimilarity such as correlation may produce unexpected results. Finally, we apply these methods to a dataset of 250 PK curves as an illustrative case study to demonstrate how the clustering of PK curves can be used as a descriptive tool for summarizing and visualizing complex PK data, which may enhance the study of pharmacogenomics in the context of precision medicine.
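As a rough illustration of the approach summarized above, the following minimal Python sketch clusters a handful of synthetic PK curves with hierarchical clustering and Euclidean distance. Everything specific in it is an assumption made for illustration and is not taken from the paper: the one-compartment oral-absorption model, the parameter values, the shared sampling times, the noise level, the choice of average linkage, and the helper name `one_compartment_curve`. It uses SciPy's `linkage`, `fcluster`, and `dendrogram` functions.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage


def one_compartment_curve(times, ka, ke, dose_over_v=10.0):
    """Concentration after a single oral dose (one-compartment model, ka != ke).

    Hypothetical helper used only to generate synthetic curves.
    """
    scale = dose_over_v * ka / (ka - ke)
    return scale * (np.exp(-ke * times) - np.exp(-ka * times))


rng = np.random.default_rng(0)
times = np.linspace(0.5, 24.0, 12)  # shared sampling times in hours (assumed)

# Two hypothetical groups: slow eliminators (ke = 0.1) and fast eliminators (ke = 0.5).
true_ke = [0.1] * 5 + [0.5] * 5

# Each row is one PK curve observed at the shared times, with 5% multiplicative noise.
curves = np.array([
    one_compartment_curve(times, ka=1.5, ke=ke)
    * np.exp(rng.normal(0.0, 0.05, times.size))
    for ke in true_ke
])

# Agglomerative clustering on pairwise Euclidean distances between curves.
# Average linkage is an assumption; other linkage methods could be swapped in.
Z = linkage(curves, method="average", metric="euclidean")

# Cut the tree into two clusters and read off the dendrogram leaf order.
labels = fcluster(Z, t=2, criterion="maxclust")
leaf_order = dendrogram(Z, no_plot=True)["leaves"]

print("cluster labels:", labels)
print("dendrogram leaf order:", leaf_order)
```

Calling `dendrogram(Z)` without `no_plot=True` draws the tree when matplotlib is installed. Euclidean distance compares curves point by point, so this sketch assumes all curves are sampled at the same time points.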