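A New Multimodal Density Function and Cluster Analysis
First published: 2006-04-11
Abstract: This paper proposes a new multimodal density function from which the GCM clustering model can be obtained by likelihood estimation. As a result, we identify the probability distributions corresponding to algorithms such as C-means and FCM, which helps in studying the properties of many clustering algorithms.
Keywords: finite mixture model, likelihood estimation, fuzzy clustering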
A Novel Multimodal Probability Density Function for Cluster Analysis
Abstract: Apart from the finite mixture distribution, which corresponds to expectation-maximization (EM) type clustering algorithms, the distributions corresponding to most partitional clustering algorithms, such as C-means and fuzzy c-means (FCM), have not been identified so far. In this paper, we present a novel multimodal probability density function (PDF) and prove that, under a mild condition, it induces general c-means (GCM) through the maximum likelihood method. Because GCM is a framework for partitional clustering, the densities associated with C-means, FCM, the mode-seeking method, and related algorithms are obtained as well. This result helps explain the properties of many clustering algorithms; for example, it shows why C-means and FCM tend to output clusters of equal size. Furthermore, based on the proposed PDF, we obtain a theoretical condition under which GCM might perform well. Numerical experiments show that our conclusion is reasonable and useful. Moreover, the proposed multimodal PDF offers a theoretical basis for selecting an appropriate clustering algorithm.
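The equal-size tendency mentioned in the abstract can be observed with a small experiment. The sketch below is only an illustration under assumptions that are not taken from the paper: it runs plain Lloyd iterations (C-means with c = 2) on synthetic one-dimensional data drawn from two Gaussians with 900 and 100 points. The function name `two_means_1d`, the data parameters, and the seed are hypothetical choices, and the code does not implement the proposed PDF or GCM.

```python
import random


def two_means_1d(xs, iters=100):
    """Plain Lloyd iterations (C-means with c = 2) on 1-D data.

    Returns the two centers and the sizes of the two clusters
    from the most recent assignment step.
    """
    lo, hi = min(xs), max(xs)  # deterministic initial centers (an assumption)
    for _ in range(iters):
        # Assignment step: in 1-D the decision boundary is the midpoint.
        boundary = (lo + hi) / 2.0
        left = [x for x in xs if x <= boundary]
        right = [x for x in xs if x > boundary]
        # Update step: each center becomes the mean of its cluster.
        new_lo = sum(left) / len(left)
        new_hi = sum(right) / len(right)
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return lo, hi, len(left), len(right)


# Synthetic, deliberately unbalanced data (illustrative values only).
random.seed(0)
big = [random.gauss(0.0, 1.0) for _ in range(900)]
small = [random.gauss(3.0, 1.0) for _ in range(100)]

center_lo, center_hi, n_left, n_right = two_means_1d(big + small)
print(f"true sizes: 900 / 100, C-means sizes: {n_left} / {n_right}")
```

In this setup the boundary between the two centers drifts toward the larger group, so the cluster around the smaller group absorbs points from the larger one and the reported sizes are more balanced than the true 900 / 100 split.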
No.6181534261144729****