Kernel smoothers are essential tools for data analysis because they convey complex statistical information through concise graphical visualisations. Their inclusion in the base distribution and in many user-contributed add-on packages of the R statistical analysis environment caters well to many practitioners. However, important gaps remain for specialised data types, most notably for tibbles (tidy data) within the tidyverse and for simple features (geospatial data) within geospatial analysis. The proposed eks package fills these gaps. In addition to kernel density estimation, which is the most widely implemented kernel smoother, this package also caters for more complex data analysis situations, such as density-based classification (supervised learning), mean shift clustering (unsupervised learning), density derivative estimation, density ridge estimation, and significance testing for density differences and for modal regions. We illustrate with experimental data how to obtain and interpret the statistical graphical analyses for these kernel smoothing methods.
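For orientation, the kernel density estimator named above can be written in its standard textbook form. The notation is an assumption, since the abstract does not fix any: X_1, ..., X_n is a d-variate random sample, K is a kernel function, and H is a symmetric positive definite bandwidth matrix.

```latex
\hat{f}_{\mathbf{H}}(\boldsymbol{x})
  = \frac{1}{n} \sum_{i=1}^{n} K_{\mathbf{H}}(\boldsymbol{x} - \boldsymbol{X}_i),
\qquad
K_{\mathbf{H}}(\boldsymbol{x})
  = |\mathbf{H}|^{-1/2} \, K\!\left(\mathbf{H}^{-1/2} \boldsymbol{x}\right)
```

The other smoothers listed (density derivative estimation, mean shift clustering, density ridge estimation) are commonly built from this estimator and its derivatives. The form shown is a generic sketch and not necessarily the exact parametrisation used by the package.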