This study presents the m-arcsinh Kernel ('m-ar-K') Fast Independent Component Analysis ('FastICA') method ('m-ar-K-FastICA') for feature extraction. The kernel trick has enabled dimensionality reduction techniques to capture a greater extent of non-linearity in the data; however, reproducible, open-source kernels to aid feature extraction are still limited and may not be reliable when projecting features from entropic data. The m-ar-K function, freely available in Python and compatible with its open-source library 'scikit-learn', is hereby coupled with FastICA to achieve more reliable feature extraction in the presence of a high degree of randomness in the data, reducing the need for pre-whitening. Different classification tasks were considered, involving five (N = 5) open-access datasets of varying degrees of information entropy, available from scikit-learn and the University of California, Irvine (UCI) Machine Learning Repository. Experimental results demonstrate improvements in classification performance brought by the proposed feature extraction. The novel m-ar-K-FastICA dimensionality reduction approach is compared against the 'FastICA' gold-standard method, supporting its higher reliability and computational efficiency, regardless of the underlying uncertainty in the data.
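A minimal sketch of how such a coupling could look in scikit-learn, whose `FastICA` accepts a user-supplied contrast function returning the function value and its derivative averaged along the last axis. The `m_arcsinh` form below (a scaled product of arcsinh and a square-root term) and its use as the ICA contrast are assumptions for illustration, not the authors' exact m-ar-K-FastICA implementation:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import FastICA

def m_arcsinh(x, **kwargs):
    """Assumed m-arcsinh-style contrast: (1/12) * arcsinh(x) * sqrt(|x|).

    Returns (G(x), E[g(x)]) as required by scikit-learn's FastICA
    when `fun` is a callable.
    """
    g = (1.0 / 12.0) * np.arcsinh(x) * np.sqrt(np.abs(x))
    # Derivative by the product rule; the small epsilon guards the
    # removable singularity at x = 0 (arcsinh(x) ~ x there).
    g_prime = (1.0 / 12.0) * (
        np.sqrt(np.abs(x)) / np.sqrt(x ** 2 + 1.0)
        + np.arcsinh(x) * np.sign(x) / (2.0 * np.sqrt(np.abs(x)) + 1e-12)
    )
    return g, g_prime.mean(axis=-1)

# Example: extract 10 components from the digits dataset before classification.
X, y = load_digits(return_X_y=True)
ica = FastICA(n_components=10, fun=m_arcsinh,
              whiten="unit-variance", max_iter=500, random_state=0)
X_new = ica.fit_transform(X)  # shape: (n_samples, 10)
```

The extracted `X_new` would then feed a downstream classifier, mirroring the evaluation protocol described in the abstract.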