Biclustering is widely used in fields such as gene information analysis, text mining, and recommendation systems because it effectively discovers local correlations between samples and features. However, many biclustering algorithms break down on heavy-tailed data. In this paper, we propose a robust version of the convex biclustering algorithm based on the Huber loss. The newly introduced robustification parameter, however, adds to the burden of parameter selection. We therefore propose a tuning-free method that selects the optimal robustification parameter automatically and efficiently. A simulation study demonstrates that the proposed method outperforms traditional biclustering methods under heavy-tailed noise. A real-life biomedical application is also presented. The R package RcvxBiclustr is available at https://github.com/YifanChen3/RcvxBiclustr.
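
As a rough illustration of why the Huber loss is less sensitive to heavy-tailed noise than the squared loss, the following minimal Python sketch evaluates the standard elementwise Huber function. The names `huber_loss` and `tau` are illustrative assumptions; they are not taken from the RcvxBiclustr package, and the sketch does not implement the biclustering algorithm or the tuning-free parameter selection.

```python
def huber_loss(residual: float, tau: float) -> float:
    """Standard Huber loss for one residual.

    `tau` plays the role of the robustification parameter (assumed name):
    residuals with |r| <= tau are penalized quadratically, larger ones linearly.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    r = abs(residual)
    if r <= tau:
        return 0.5 * r * r              # quadratic region, same as squared loss
    return tau * r - 0.5 * tau * tau    # linear region, grows slowly for outliers


def squared_loss(residual: float) -> float:
    """Ordinary squared loss, shown only for comparison."""
    return 0.5 * residual * residual


# Compare both losses on a small residual and on an outlier.
for r in (0.5, 8.0):
    print(r, squared_loss(r), huber_loss(r, tau=1.0))
```

For small residuals the two losses coincide, while for an outlier the Huber loss grows linearly rather than quadratically, so a single extreme entry contributes far less to the objective. A smaller `tau` gives more robustness at the cost of treating more observations as outliers, which is why the choice of this parameter matters.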