
于戈


Journal Paper

Evaluating Document-to-Document Relevance Based on Document Language Model: Modeling, Implementation and Performance Evaluation

Ge Yu (于戈), Xiaoguang Li, Yubin Bao, and Daling Wang

CICLing 2005, LNCS 3406, pp. 593-603, 2005.

Abstract

Evaluating document-to-document relevance is important to many advanced applications in information retrieval (IR), text mining, and natural language processing. Because users' uncertainty makes document relevance hard to define mathematically, most research fields accept the concept of topical relevance instead. This concept suggests that a document relevance model should explain whether the document representation describes the document's topical content and whether the matching method reveals the topical differences among documents. However, current document-to-document relevance models, such as the vector space model and string distance, do not explicitly emphasize topical relevance. This paper uses a document language model to represent a document's topical content, explains why the model can reveal the document's topics, and then establishes two distributional similarity measures based on the document language model to evaluate document-to-document relevance. An experiment on the TREC test collection compares these measures with the vector space model, and the results show that the Kullback-Leibler divergence measure with Jelinek-Mercer smoothing significantly outperforms the vector space model.
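
To make the best-performing measure named in the abstract concrete, the following is a minimal Python sketch of Kullback-Leibler divergence between two unigram document language models smoothed with Jelinek-Mercer interpolation. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the whitespace tokenizer, the interpolation weight `lam = 0.5`, the toy documents, and the helper names `unigram_counts`, `jm_model`, and `kl_divergence` are all hypothetical.

```python
import math
from collections import Counter


def unigram_counts(text):
    """Count lowercase whitespace-separated tokens (an assumed tokenizer)."""
    return Counter(text.lower().split())


def jm_model(doc_counts, coll_counts, lam=0.5):
    """Jelinek-Mercer smoothed unigram model over the collection vocabulary.

    P(w|d) = (1 - lam) * P_ml(w|d) + lam * P(w|C)
    lam = 0.5 is an arbitrary illustrative weight, not a value from the paper.
    """
    doc_len = sum(doc_counts.values())
    coll_len = sum(coll_counts.values())
    return {
        w: (1 - lam) * doc_counts[w] / doc_len + lam * coll_counts[w] / coll_len
        for w in coll_counts
    }


def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary; smaller means more similar."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)


# Toy collection (hypothetical data, for illustration only).
docs = [
    "language model smoothing for retrieval",
    "smoothing a language model for document retrieval",
    "protein folding in yeast cells",
]

doc_counts = [unigram_counts(d) for d in docs]
coll_counts = sum(doc_counts, Counter())  # collection-wide term counts
models = [jm_model(c, coll_counts) for c in doc_counts]

kl_same_topic = kl_divergence(models[0], models[1])
kl_other_topic = kl_divergence(models[0], models[2])
print(f"KL(d0 || d1) = {kl_same_topic:.4f}")
print(f"KL(d0 || d2) = {kl_other_topic:.4f}")
```

Because every vocabulary term receives some probability mass from the collection model, the divergence stays finite even when the two documents share no terms. In this toy collection the divergence from the first document to the second (same topic) comes out smaller than the divergence to the third (different topic). Note that KL divergence is asymmetric, so a practical system would have to decide which document plays the role of `p`.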

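[Disclaimer] All content below was uploaded by [于戈] on [October 31, 2005, 22:52:52]; copyright belongs to the original author. This article represents only the author's personal views and is unrelated to this website. This website remains neutral toward the statements and opinions in the article and provides no express or implied warranty of the accuracy, reliability, or completeness of its content. Readers should treat it as reference only and assume full responsibility themselves.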
