Decision trees are an important method in both induction research and data mining, used mainly for building classification and prediction models. ID3 is the most widely used decision tree algorithm to date. In this paper, we discuss ID3's shortcoming of favoring attributes with many values, and then propose a new decision tree algorithm that is an improved version of ID3. In the proposed algorithm, attributes are divided into groups, and the selection measure 5 is then applied to these groups. If the information gain is not good, the attribute values are again divided into groups. These steps are repeated until a good classification/misclassification ratio is obtained. The proposed algorithm classifies data sets more accurately and efficiently.
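The bias toward many-valued attributes can be illustrated with a minimal Python sketch of ID3's standard information gain measure. The toy data, the attribute names `outlook` and `row_id`, and the two-bucket grouping are all assumptions made for illustration; in particular, the grouping shown here is an arbitrary split and not the grouping rule of the proposed algorithm, which the abstract does not specify.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of splitting `labels` by the attribute `values`."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy remaining after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 8 examples, 4 "yes" and 4 "no".
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
outlook = ["sunny", "sunny", "sunny", "rain", "rain", "rain", "sunny", "rain"]
row_id = list(range(8))  # unique per example, so it carries no predictive meaning

gain_outlook = information_gain(outlook, labels)
gain_row_id = information_gain(row_id, labels)

# Assumed illustration only: merge the 8 distinct values into 2 arbitrary groups.
row_id_grouped = ["low" if i < 4 else "high" for i in row_id]
gain_row_id_grouped = information_gain(row_id_grouped, labels)

print(f"outlook:          {gain_outlook:.4f}")
print(f"row_id:           {gain_row_id:.4f}")
print(f"row_id (grouped): {gain_row_id_grouped:.4f}")
```

In this sketch the unique-valued `row_id` receives the maximum possible gain even though it cannot generalize, so plain ID3 would pick it over `outlook`. Once its values are merged into groups, the spurious gain disappears, which is the intuition behind evaluating the selection measure on groups of values rather than on individual values.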