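Text Mining Platform for Open Access Resources
First published: 2014-05-20
Abstract: In recent years, as open access resources have gained wide acceptance, a growing share of scientific literature has been made available to readers in this way. However, the rapid growth in the volume of open access text has created a problem of information overload. To address it, this paper first analyzes well-known text mining tools from China and abroad and, on that basis, builds a text mining platform for open access resources. The platform has four modules: data collection, text preprocessing, similarity measurement, and text clustering. It covers the workflow from automatic acquisition of open access resources through to clustering analysis, providing a practical system tool for knowledge discovery in open access resources.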
Text Mining Platform on Open Access Resources
Abstract: In recent years, a growing number of scientific manuscripts have been published in Open Access journals, and this mode of publication has gained global acceptance. However, the rapid increase in Open Access articles means that useful information and knowledge are easily drowned out. To address this problem, we first briefly analyze popular text mining tools and then, based on that analysis, develop an Open Access text mining platform. It consists of four modules: data acquisition, text preprocessing, similarity measurement, and text clustering. The platform covers the main text mining steps for Open Access data, from automatic acquisition to clustering analysis, and could therefore serve as a systematic tool for Open Access knowledge discovery.
Keywords: Machine Learning; Open Access; Text Clustering; Text Mining
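
The four-module flow named in the abstract can be pictured with a minimal sketch. The abstract does not say which algorithms the platform uses, so everything below is an assumption chosen for illustration: a hard-coded list stands in for data acquisition, preprocessing is plain lowercasing and tokenization, similarity is cosine similarity over smoothed TF-IDF weights, and clustering is a simple single-link grouping by similarity threshold. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Text preprocessing (assumed): lowercase and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [preprocess(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed inverse document frequency, always positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Similarity measurement (assumed): cosine of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster(vectors, threshold=0.2):
    """Text clustering (assumed): merge documents linked by similarity >= threshold."""
    labels = list(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                # Relabel the whole group containing j so it joins i's group.
                labels = [new if lab == old else lab for lab in labels]
    return labels


# Data acquisition is stubbed with an in-memory list (hypothetical documents).
docs = [
    "Open access journals publish articles",
    "Open access articles in journals",
    "Protein folding simulation",
    "Simulation of protein folding dynamics",
]
labels = cluster(tfidf_vectors(docs))
```

Documents that end up with the same label belong to the same cluster. The threshold of 0.2 is arbitrary; a real deployment would tune it or use a different clustering method.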