
15 results found for this scholar

Upload date: August 2, 2005

[Journal Article] Ensembling neural networks: Many could be better than all

Zhi-Hua Zhou*, Jianxin Wu, Wei Tang

Artificial Intelligence 137 (2002) 239-263

Abstract

Neural network ensemble is a learning paradigm where many neural networks are jointly used to solve a problem. In this paper, the relationship between the ensemble and its component neural networks is analyzed in the context of both regression and classification, which reveals that it may be better to ensemble many instead of all of the neural networks at hand. This result is interesting because at present, most approaches ensemble all the available neural networks for prediction. Then, in order to show that the appropriate neural networks for composing an ensemble can be effectively selected from a set of available neural networks, an approach named GASEN is presented. GASEN trains a number of neural networks at first. Then it assigns random weights to those networks and employs a genetic algorithm to evolve the weights so that they can characterize, to some extent, the fitness of the neural networks for constituting an ensemble. Finally, it selects some neural networks based on the evolved weights to make up the ensemble. A large empirical study shows that, compared with popular ensemble approaches such as Bagging and Boosting, GASEN can generate neural network ensembles with far smaller sizes but stronger generalization ability. Furthermore, in order to understand the working mechanism of GASEN, the bias-variance decomposition of the error is provided in this paper, which shows that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance.

Neural networks, Neural network ensemble, Machine learning, Selective ensemble, Boosting, Bagging, Genetic algorithm, Bias-variance decomposition
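The selection procedure the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the genetic algorithm here is a simple truncation-selection scheme with Gaussian mutation, fitness is the validation MSE of the weighted ensemble, and the threshold 1/N mirrors the paper's idea of keeping networks whose evolved weight exceeds the average; the function name and hyperparameters are invented for this sketch.

```python
import numpy as np

def gasen_select(preds, y, pop_size=50, gens=100, threshold=None, rng=None):
    """GASEN-style selective ensemble (sketch).

    preds : (n_nets, n_samples) array, each row a network's validation predictions
    y     : (n_samples,) validation targets
    Evolves a normalized weight vector with a toy GA, then keeps the networks
    whose evolved weight exceeds `threshold` (default: the average weight 1/N).
    """
    rng = np.random.default_rng(rng)
    n_nets = preds.shape[0]
    if threshold is None:
        threshold = 1.0 / n_nets
    # population of weight vectors, each normalized to sum to 1
    pop = rng.random((pop_size, n_nets))
    pop /= pop.sum(axis=1, keepdims=True)

    def fitness(w):
        # negative MSE of the weighted-average ensemble prediction
        return -((w @ preds - y) ** 2).mean()

    for _ in range(gens):
        scores = np.array([fitness(w) for w in pop])
        # truncation selection: keep the better half, refill with mutated copies
        keep = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = np.abs(keep + rng.normal(scale=0.05, size=keep.shape))
        children /= children.sum(axis=1, keepdims=True)
        pop = np.vstack([keep, children])

    best = max(pop, key=fitness)
    selected = np.flatnonzero(best > threshold)
    return selected, best
```

The networks surviving the threshold would then be combined by simple averaging to form the final, smaller ensemble.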

Upload date: August 2, 2005

[Journal Article] Hybrid decision tree

Zhi-Hua Zhou*, Zhao-Qian Chen

Knowledge-Based Systems 15 (2002) 515-528

Abstract

In this paper, a hybrid learning approach named hybrid decision tree (HDT) is proposed. HDT simulates human reasoning by using symbolic learning to do qualitative analysis and neural learning to do the subsequent quantitative analysis. It generates the trunk of a binary HDT according to the binary information gain ratio criterion in an instance space defined by only the original unordered attributes. If unordered attributes cannot further distinguish training examples falling into a leaf node whose diversity is beyond the diversity-threshold, then the node is marked as a dummy node. After all the dummy nodes are marked, a specific feedforward neural network named FANNC, which is trained in an instance space defined by only the original ordered attributes, is exploited to accomplish the learning task. Moreover, this paper distinguishes three kinds of incremental learning tasks. Two incremental learning procedures designed for example-incremental learning with different storage requirements are provided, which enable HDT to deal gracefully with data sets where new data are frequently appended. Also, a hypothesis-driven constructive induction mechanism is provided, which enables HDT to generate compact concept descriptions.

Machine learning, Knowledge acquisition, Decision tree, Neural networks, Hybrid learning, Incremental learning, Constructive induction
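The dummy-node mechanism the abstract describes can be sketched as below. This is one plausible reading, not the paper's exact procedure: the diversity measure used here (fraction of examples outside the majority class) is an assumption, and `neural_fallback` stands in for the trained FANNC model on the ordered attributes.

```python
from collections import Counter

def leaf_diversity(labels):
    """Diversity of a symbolic leaf: fraction of examples not in the
    majority class (an assumed measure; HDT's exact definition may differ)."""
    counts = Counter(labels)
    return 1.0 - max(counts.values()) / len(labels)

def classify_leaf(labels, diversity_threshold, neural_fallback, x):
    """If the leaf is pure enough, answer with its majority class.
    Otherwise the leaf is a 'dummy node': the unordered attributes could not
    separate the examples, so the query is delegated to a neural learner
    trained on the ordered attributes, as HDT does with FANNC."""
    if leaf_diversity(labels) <= diversity_threshold:
        return Counter(labels).most_common(1)[0][0]
    return neural_fallback(x)
```

In a full tree, every leaf whose diversity exceeds the threshold would route its queries through the same trained neural model.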

Upload date: August 2, 2005

[Journal Article] NeC4.5: Neural Ensemble Based C4.5

Zhi-Hua Zhou, Member, IEEE, and Yuan Jiang

IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 6, June 2004

Abstract

Decision trees offer good comprehensibility, while neural network ensembles offer strong generalization ability. In this paper, these merits are integrated into a novel decision tree algorithm, NeC4.5. This algorithm trains a neural network ensemble at first. Then, the trained ensemble is employed to generate a new training set by replacing the desired class labels of the original training examples with those output by the trained ensemble. Some extra training examples are also generated from the trained ensemble and added to the new training set. Finally, a C4.5 decision tree is grown from the new training set. Since its learning results are decision trees, the comprehensibility of NeC4.5 is better than that of a neural network ensemble. Moreover, experiments show that the generalization ability of NeC4.5 decision trees can be better than that of C4.5 decision trees.

Machine learning, decision tree, neural networks, ensemble learning, neural network ensemble, generalization, comprehensibility
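The training-set regeneration step, which is the core of NeC4.5, can be sketched as follows. This is a minimal illustration under assumptions: `sample_like` is a caller-supplied sampler over the input space standing in for the paper's synthetic-example generation, and a C4.5-style tree learner (not shown) would then be grown on the returned set.

```python
import numpy as np

def nec45_training_set(X, ensemble_predict, n_extra, sample_like, rng=None):
    """Build the NeC4.5-style training set (sketch).

    1. Generate n_extra synthetic inputs via `sample_like(n, rng)`.
    2. Relabel the original examples AND the synthetic ones with the
       trained ensemble's outputs, discarding the original labels.
    The returned (X_new, y_new) is what a C4.5-style tree would learn from.
    """
    rng = np.random.default_rng(rng)
    X_extra = sample_like(n_extra, rng)
    X_new = np.vstack([X, X_extra])
    y_new = ensemble_predict(X_new)  # ensemble outputs replace true labels
    return X_new, y_new
```

Because every label in the new set comes from the ensemble, the tree that is grown afterwards approximates the ensemble's decision function in a comprehensible form.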

Upload date: August 2, 2005

[Journal Article] Extracting symbolic rules from trained neural network ensembles

Zhi-Hua Zhou*, Yuan Jiang and Shi-Fu Chen

AI Communications 16 (2003) 3-15

Abstract

Neural network ensemble can significantly improve the generalization ability of neural network based systems. However, its comprehensibility is even worse than that of a single neural network because it comprises a collection of individual neural networks. In this paper, an approach named REFNE is proposed to improve the comprehensibility of trained neural network ensembles that perform classification tasks. REFNE utilizes the trained ensembles to generate instances and then extracts symbolic rules from those instances. It gracefully breaks the ties made by individual neural networks in prediction. It also employs a specific discretization scheme, rule form, and fidelity evaluation mechanism. Experiments show that, with different configurations, REFNE can extract rules with good fidelity that well explain the function of trained neural network ensembles, or rules with strong generalization ability that are even better than the trained neural network ensembles in prediction.

Neural networks, neural network ensembles, rule extraction, machine learning, comprehensibility
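The extract-rules-from-generated-instances idea can be sketched in miniature as below. This is a REFNE-flavoured toy, not the paper's procedure: it assumes the attributes have already been discretized, takes the ensemble's tie-broken labels as given, and turns a conjunction of attribute tests into a rule only when every generated instance it covers received the same ensemble label; the support threshold and the limit on conjuncts are invented parameters.

```python
import itertools
from collections import defaultdict

def extract_rules(instances, labels, max_conjuncts=2, min_support=5):
    """Tiny rule extractor over ensemble-labelled instances (sketch).

    instances : list of dicts mapping discretized attribute -> value
    labels    : the ensemble's prediction for each instance
    Returns (condition_dict, label) pairs: conjunctions of up to
    `max_conjuncts` attribute tests that cover at least `min_support`
    instances, all of which got the same ensemble label.
    """
    rules = []
    attrs = sorted(instances[0])
    for k in range(1, max_conjuncts + 1):
        for combo in itertools.combinations(attrs, k):
            groups = defaultdict(list)
            for inst, lab in zip(instances, labels):
                groups[tuple(inst[a] for a in combo)].append(lab)
            for key, labs in groups.items():
                # a pure, well-supported group becomes a symbolic rule
                if len(labs) >= min_support and len(set(labs)) == 1:
                    rules.append((dict(zip(combo, key)), labs[0]))
    return rules
```

Fidelity in this setting would be measured by how often the extracted rules reproduce the ensemble's predictions on fresh generated instances.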

Upload date: August 2, 2005

[Journal Article] Three perspectives of data mining

Zhi-Hua Zhou

Artificial Intelligence 143 (2003) 139-146

Abstract

This paper reviews three recent books on data mining written from three different perspectives, i.e., databases, machine learning, and statistics. Although the exploration in this paper is suggestive instead of conclusive, it reveals that besides some common properties, different perspectives lay strong emphases on different aspects of data mining. The emphasis of the database perspective is on efficiency, because this perspective strongly concerns the whole discovery process and huge data volumes. The emphasis of the machine learning perspective is on effectiveness, because this perspective is heavily attracted by substantive heuristics working well in data analysis, although they may not always be useful. As for the statistics perspective, its emphasis is on validity, because this perspective cares much for the mathematical soundness behind mining methods.

Data mining, Databases, Machine learning, Statistics

Collaborating scholar

  • Zhi-Hua Zhou, Nanjing University, Jiangsu