Rule-based models are widely used in data analysis because they combine interpretability with predictive power. We present RuleKit, a versatile tool for rule learning. Based on a sequential covering induction algorithm, it is suitable for classification, regression, and survival problems. User-guided induction makes it possible to verify hypotheses about data dependencies that are expected or of interest. A powerful and flexible experimental environment allows straightforward investigation of different induction schemes. Analyses can be run in batch mode, through a RapidMiner plug-in, or through an R package. A documented Java API is also provided. The software is publicly available on GitHub under the GNU AGPL-3.0 license.
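To make the term "sequential covering" concrete, the following is a minimal, generic sketch of the idea for a classification task: grow one rule, remove the examples it covers, and repeat until no positive examples remain. It is not RuleKit's implementation and does not use its Java API. The function names (`precision`, `grow_rule`, `sequential_covering`), the precision-based rule quality, and the toy data set are all assumptions made for illustration only.

```python
# Illustrative sketch of sequential covering; NOT RuleKit's code or API.
# An example is a pair (attributes: dict, label: str).
# A rule is a dict of attribute == value conditions that must all hold.


def precision(examples, target):
    """Fraction of the given examples labelled `target` (0.0 if empty)."""
    if not examples:
        return 0.0
    return sum(1 for _, label in examples if label == target) / len(examples)


def matches(rule, attrs):
    """True if every condition of the rule holds for the attribute dict."""
    return all(attrs.get(a) == v for a, v in rule.items())


def grow_rule(examples, target):
    """Greedily add the condition that most improves precision for `target`."""
    rule = {}
    covered = list(examples)
    while precision(covered, target) < 1.0:
        best = None  # (precision, coverage, attribute, value)
        for attrs, _ in covered:
            for attr, value in attrs.items():
                if attr in rule:
                    continue
                subset = [e for e in covered if e[0].get(attr) == value]
                cand = (precision(subset, target), len(subset), attr, value)
                if best is None or cand[:2] > best[:2]:
                    best = cand
        # Stop when no candidate condition strictly improves precision.
        if best is None or best[0] <= precision(covered, target):
            break
        rule[best[2]] = best[3]
        covered = [e for e in covered if e[0].get(best[2]) == best[3]]
    return rule


def sequential_covering(examples, target):
    """Learn rules for `target` one at a time, removing covered examples."""
    remaining = list(examples)
    rules = []
    while any(label == target for _, label in remaining):
        rule = grow_rule(remaining, target)
        if not rule:  # no useful rule can be grown; give up
            break
        rules.append(rule)
        remaining = [e for e in remaining if not matches(rule, e[0])]
    return rules


# Hypothetical toy data, invented for this sketch.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "overcast", "windy": "yes"}, "stay"),
]
rules = sequential_covering(data, "play")
```

Swapping the precision measure for another rule quality measure, or replacing the class label with a numeric or survival target, changes only how candidate conditions are scored; the outer covering loop stays the same in this sketch.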