An Applied Study of a Classification Algorithm Based on an Improved SLIQ Decision Tree
First published: 2007-12-31
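Abstract: Decision trees are regarded as a highly effective method for classification problems in data mining. This paper proposes an improved SLIQ (Mehta et al., 1996) decision tree classification algorithm that overcomes a drawback of the original SLIQ algorithm: the large number of gini indices that must be computed at each node of the decision tree. To find the best splitting attribute at each node, the original SLIQ algorithm has to compute the gini index for every value of every attribute. The improved algorithm effectively reduces this computational complexity: rather than computing the gini index for every value of every attribute, it computes the index over different ranges of attribute values and achieves the same effect. Using real-life examples, the paper compares the algorithm with the original SLIQ algorithm and with a classification algorithm based on artificial neural networks; the experimental results show that its classification accuracy is far higher than that of both.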
Decision Tree Algorithm Application for Classification in Data Mining Based on Extended SLIQ Algorithm
Abstract: Decision trees have been found very effective for classification, especially in data mining. This paper aims at improving the performance of the SLIQ decision tree algorithm (Mehta et al., 1996) for classification in data mining. The drawback of this algorithm is that a large number of gini indices have to be computed at each node of the decision tree. In order to decide which attribute is to be split at each node, the gini indices have to be computed for all the attributes and for each successive pair of values, over all patterns that have not yet been classified. An improvement over the SLIQ algorithm is proposed to reduce the computational complexity. In this algorithm, the gini index is computed not for every successive pair of values of an attribute but over different ranges of attribute values. The classification accuracy of this technique was compared with that of the existing SLIQ algorithm and a neural network technique on real-life datasets. The decision tree constructed using the proposed algorithm gave far better classification accuracy than either the SLIQ algorithm or the neural network classification technique.
Keywords: Decision Tree; Data Mining; SLIQ Algorithm
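The contrast drawn in the abstract, between evaluating the gini index at every successive pair of attribute values and evaluating it only over ranges of values, can be illustrated with a small sketch. The abstract does not say how the ranges are chosen, so the equal-width scheme in `range_thresholds` is purely an assumption made for illustration, and all function names here are hypothetical rather than taken from the paper. A coarse set of range boundaries can also miss the best threshold on less tidy data; how the proposed algorithm handles that is not described in the abstract.

```python
from collections import Counter


def gini(labels):
    """Gini index of a collection of class labels: 1 - sum(p_c ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(values, labels, threshold):
    """Weighted gini index of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def successive_pair_thresholds(values):
    """Candidate thresholds at the midpoint of each successive pair of
    distinct sorted values (the exhaustive evaluation the abstract
    attributes to the original SLIQ algorithm)."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def range_thresholds(values, num_ranges):
    """Candidate thresholds at the interior boundaries of equal-width
    ranges. ASSUMPTION: equal-width ranges are an illustrative choice,
    not the scheme used in the paper."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_ranges
    return [lo + i * width for i in range(1, num_ranges)]


def best_split(values, labels, thresholds):
    """Return (best threshold, its weighted gini, number of evaluations).
    Expects a non-empty list of thresholds."""
    scored = [(split_gini(values, labels, t), t) for t in thresholds]
    best_gini, best_threshold = min(scored)
    return best_threshold, best_gini, len(scored)


if __name__ == "__main__":
    # Toy attribute with two cleanly separated classes.
    values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    labels = ["a"] * 6 + ["b"] * 6

    print(best_split(values, labels, successive_pair_thresholds(values)))
    print(best_split(values, labels, range_thresholds(values, 4)))
```

On this toy attribute both strategies find the same split, but the range-based candidate set needs 3 gini evaluations instead of 11. That is the kind of saving the abstract describes, shown here only under the stated assumptions.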