Kernel smoothers are essential tools for data analysis due to their ability to convey complex statistical information with concise graphical visualisations. Their inclusion in the base distribution and in the many user-contributed add-on packages of the R statistical analysis environment caters well to many practitioners. However, some important gaps remain for specialised data types, most notably for tibbles (tidy data) within the tidyverse, and for simple features (geospatial data) within geospatial analysis. The proposed eks package fills these gaps. In addition to kernel density estimation, this package also caters for more complex data analysis situations, such as density derivative estimation, density-based classification (supervised learning) and mean shift clustering (unsupervised learning). We illustrate with experimental data how to obtain and interpret the statistical visualisations for these kernel smoothing methods.