An improved K-means algorithm for microblog clustering
First published: 2012-05-17
Abstract: With the continuing development of information technology, many new information media have emerged, and the microblog is one of them. Because of the distinctive characteristics of microblogs, the analysis and mining of microblog content is becoming increasingly important. This paper analyzes the statistical properties of microblogs and builds a model of microblog text based on that analysis. Using this model, we propose an initialization method for the K-means algorithm and apply the resulting improved K-means algorithm to cluster microblog text. We then analyze the clustering results to demonstrate the effectiveness of the improved algorithm.
Keywords: text clustering, K-means algorithm, microblog, feature selection