Improved support vector machine algorithm for heterogeneous data
Pattern Recognition, 2015, 48(6): 2072-2083 | June 1, 2015 | doi.org/10.1016/j.patcog.2014.12.015
A support vector machine (SVM) is a popular algorithm for classification learning. The classical SVM effectively handles classification tasks defined by numerical attributes. However, practical tasks often involve both numerical and nominal attributes, and the classical SVM does not fully account for the difference between them. Nominal attributes are usually treated as numerical after coding, which may degrade the performance of learning algorithms. In this study, we propose a novel SVM algorithm for learning with heterogeneous data, called the heterogeneous SVM (HSVM). Instead of coding nominal attributes directly, the proposed algorithm learns a mapping that embeds them into a real space by minimizing an estimated generalization error. Extensive experiments show that HSVM improves classification performance for both nominal and heterogeneous data.
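
To illustrate why direct coding of nominal attributes can be problematic, the following minimal Python sketch contrasts arbitrary integer coding with a data-driven embedding. It is not the HSVM algorithm: the abstract does not specify the optimization, so the label-mean embedding below is a simple stand-in assumed for illustration only, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def integer_code(values):
    """Direct coding: give each distinct nominal value an arbitrary integer."""
    categories = sorted(set(values))
    return {v: i for i, v in enumerate(categories)}


def label_mean_embedding(values, labels):
    """Stand-in for a learned mapping (NOT the HSVM objective):
    embed each nominal value as the mean of the +1/-1 labels seen with it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, y in zip(values, labels):
        sums[v] += y
        counts[v] += 1
    return {v: sums[v] / counts[v] for v in sums}


# Hypothetical toy data: one nominal attribute and binary class labels.
colors = ["red", "green", "blue", "red", "green", "blue"]
labels = [+1, -1, +1, +1, -1, +1]

codes = integer_code(colors)
embed = label_mean_embedding(colors, labels)

# Integer coding imposes an order and distances that the data do not justify.
dist_coded = abs(codes["red"] - codes["blue"])
# The data-driven embedding places values with the same label behavior together.
dist_embedded = abs(embed["red"] - embed["blue"])
```

In this toy case, integer coding puts "green" between "red" and "blue" even though the values have no inherent order, while the data-driven embedding maps "red" and "blue" to the same point and separates "green". HSVM, as described above, chooses its mapping by minimizing an estimated generalization error rather than by label averaging.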