K-means clustering analysis of the different similarity measures

HE Mingsheng; GAO Zhanchun; JIANG Yanjun

First published: 2012-11-22

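HE Mingsheng ¹
HE Mingsheng (b. 1987), male, master's student; main research interest: computer networks and applications.
GAO Zhanchun ²
GAO Zhanchun (b. 1967), male, associate professor; main research interest: computer networks and applications.
JIANG Yanjun ²
JIANG Yanjun (b. 1966), male, associate professor; main research interest: computer networks and applications.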

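1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876
2. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876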

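Abstract: In recent years, the prevalence of massive data has drawn wide attention to data mining. Clustering, an unsupervised learning method, is an important research topic in pattern recognition, machine learning, and data mining. K-Means is a partition-based clustering algorithm, and many classic clustering tasks take it as their object of study. The experiments apply different similarity measures to K-Means clustering of Iris, the well-known UCI dataset, and compare and discuss the results in terms of clustering error rate and running efficiency, offering a useful reference for research on cluster analysis.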

Keywords: clustering analysis; K-Means; similarity; Mahout

K-means clustering analysis of the different similarity measures

HE Mingsheng ¹
GAO Zhanchun ¹
JIANG Yanjun ¹

1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876

Abstract: In recent years, the prevalence of massive data has brought data mining widespread attention. Clustering, an unsupervised learning method, is an important research topic in pattern recognition, machine learning, and data mining. K-Means is a partition-based clustering algorithm that many classical clustering tasks adopt. This paper runs K-Means clustering experiments in Mahout on Iris, the well-known UCI dataset, using different similarity measures, and compares and discusses the results in terms of error rate and running efficiency. The findings provide a useful reference for research on cluster analysis.

Keywords: clustering analysis; K-Means; similarity; Mahout
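
The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not the paper's Mahout setup: it is plain Python, and the three measures (Euclidean, Manhattan, cosine), the six toy points standing in for Iris, the majority-vote error rate, and every function name are assumptions chosen for illustration, since the abstract does not list the measures or the evaluation details.

```python
import math
import time

# Hypothetical distance measures; the abstract does not say which ones were tested.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def nearest(point, centroids, distance):
    """Index of the centroid closest to `point` under `distance`."""
    return min(range(len(centroids)), key=lambda j: distance(point, centroids[j]))

def kmeans(points, init_centroids, distance, max_iter=100):
    """Lloyd-style K-Means with a pluggable distance function."""
    centroids = [list(c) for c in init_centroids]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centroids, distance) for p in points]
        if new_assignment == assignment:  # converged: no point changed cluster
            break
        assignment = new_assignment
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

def error_rate(assignment, labels):
    """Share of points whose label differs from their cluster's majority label."""
    wrong = 0
    for j in set(assignment):
        member_labels = [l for a, l in zip(assignment, labels) if a == j]
        majority = max(set(member_labels), key=member_labels.count)
        wrong += sum(1 for l in member_labels if l != majority)
    return wrong / len(labels)

# Toy stand-in for Iris (assumption): two groups separated by both position and angle.
POINTS = [(1.0, 0.1), (1.2, 0.2), (0.9, 0.15), (0.1, 1.0), (0.2, 1.1), (0.15, 0.9)]
LABELS = ["a", "a", "a", "b", "b", "b"]
MEASURES = {"euclidean": euclidean, "manhattan": manhattan, "cosine": cosine_distance}

def compare():
    """Run K-Means once per measure; return (assignment, error rate, seconds)."""
    results = {}
    for name, distance in MEASURES.items():
        start = time.perf_counter()
        assignment, _ = kmeans(POINTS, [POINTS[0], POINTS[3]], distance)
        elapsed = time.perf_counter() - start
        results[name] = (assignment, error_rate(assignment, LABELS), elapsed)
    return results

if __name__ == "__main__":
    for name, (assignment, err, seconds) in compare().items():
        print(f"{name}: error rate {err:.2f}, {seconds:.6f} s")
```

Fixing the initial centroids keeps the runs comparable across measures, so differences in error rate or running time come from the distance function rather than from random initialization. On real data the measures would generally disagree, which is the effect such an experiment sets out to quantify.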

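HE Mingsheng, GAO Zhanchun, JIANG Yanjun. K-means clustering analysis of the different similarity measures [EB/OL]. Beijing: Sciencepaper Online (中国科技论文在线) [2012-11-22]. https://www.paper.edu.cn/releasepaper/content/201211-396.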

Paper No.: 201211-396