sktime is an open-source, Python-based, sklearn-compatible toolkit for time series analysis developed by researchers at the University of East Anglia (UEA), University College London and the Alan Turing Institute. A key initial goal for sktime was to provide time series classification functionality equivalent to that available in tsml, a related Java package also developed at UEA. We describe the implementation of six such classifiers in sktime and compare them to their tsml equivalents. We demonstrate correctness through equivalence of accuracy on a range of standard test problems and compare the build times of the different implementations. We find a significant difference in accuracy for only one of the six algorithms we examine (Proximity Forest), and this difference is proving difficult to debug. We find a much wider range of differences in efficiency. This was not unexpected, but it does highlight ways in which both toolkits could be improved.
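As a rough illustration of what "equivalence of accuracy" across test problems can mean in practice, the sketch below runs an exact two-sided sign test on paired per-dataset accuracies from two implementations. The abstract does not say which statistical test was used, so the sign test is an assumption made for illustration; the function name `sign_test_p_value` is hypothetical, and the accuracy values are placeholders, not results from the paper.

```python
from math import comb


def sign_test_p_value(acc_a, acc_b):
    """Exact two-sided sign test on paired per-dataset accuracies.

    Hypothetical helper: the source does not specify the test it used.
    """
    if len(acc_a) != len(acc_b):
        raise ValueError("paired samples must have equal length")
    wins_a = sum(a > b for a, b in zip(acc_a, acc_b))
    wins_b = sum(b > a for a, b in zip(acc_a, acc_b))
    n = wins_a + wins_b  # tied datasets carry no sign and are dropped
    if n == 0:
        return 1.0
    k = min(wins_a, wins_b)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Placeholder accuracies for illustration only (NOT results from the paper).
sktime_acc = [0.81, 0.92, 0.77, 0.88, 0.69]
tsml_acc = [0.80, 0.93, 0.77, 0.87, 0.70]

p_value = sign_test_p_value(sktime_acc, tsml_acc)
print(f"sign test p-value: {p_value:.3f}")
```

A large p-value here means the test finds no evidence that one implementation is systematically more accurate than the other. The sign test uses only the direction of each difference, so it is a conservative choice; a rank-based paired test would also take the magnitudes into account.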