您当前所在位置: 首页 > 学者
在线提示

恭喜!关注成功

在线提示

确认取消关注该学者?

邀请同行关闭

只需输入对方姓名和电子邮箱,就可以邀请你的同行加入中国科技论文在线。

真实姓名:

电子邮件:

尊敬的

我诚挚的邀请你加入中国科技论文在线,点击

链接,进入网站进行注册。

添加个性化留言

已为您找到该学者21条结果 成果回收站

上传时间

2005年10月31日

【期刊论文】Evaluating Document-to-Document Relevance Based on Document Language Model: Modeling, Implementation and Performance Evaluation∗

于戈, Ge Yu, Xiaoguang Li, Yubin Bao, and Daling Wang

CICLing 2005, LNCS 3406, pp. 593-603, 2005.,-0001,():

-1年11月30日

摘要

To evaluate document-to-document relevance is very important to many advanced applications such as IR, text mining and natural language processing. Since it is very hard to define document relevance in a mathematic way on account of users' uncertainty, the concept of topical relevance is widely accepted by most of research fields. It suggests that a document relevance model should explain whether the document representation describes its topical contents and the matching method reveals the topical differences among the documents. However, the current document-to-document relevance models, such as vector space model, string distance, don't put explicitly emphasis on the perspective of topical relevance. This paper exploits a document language model to represent the document topical content and explains why it can reveal the document topics and then establishes two distributional similarity measure based on the document language model to evaluate document-to-document relevance. The experiment on the TREC testing collection is made to compare it with the vector space model, and the results show that the Kullback-Leibler divergence measure with Jelinek-Mercer smoothing outperforms the vector space model significantly.

合作学者

  • 于戈 邀请

    东北大学,辽宁

    尚未开通主页