A Patent Document Retrieval Method Based on Automatic Query Expansion
First published: 2013-04-17
Abstract: Existing patent retrieval methods understand user intent poorly and make insufficient use of query expansion. To address this, this paper proposes a patent document retrieval method based on automatic query expansion. First, taking the characteristics of patent documents into account, a domain term extraction method based on an improved TF-IDF formula is used to build patent domain vocabularies. At retrieval time, the query input string is analyzed to extract query keywords, which are matched against the domain vocabularies to determine the query's domain and the difficulty of expanding it. Automatic query expansion based on pseudo-relevance feedback (PRF) then analyzes differences in term distribution across the pseudo-relevant documents to generate and rank candidate expansion terms. Finally, the expansion terms are combined with the original query conditions to form a new query, which is used to complete the patent search. Comparative experimental results show that the method achieves relatively high recall and average precision.
Keywords: artificial intelligence; patent retrieval; domain vocabulary; query expansion; pseudo-relevance feedback (PRF)
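
The abstract describes the expansion step only in outline, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes whitespace tokenization, a plain TF-IDF first-pass ranking, and a KL-divergence-style score that compares a term's frequency in the top-ranked (pseudo-relevant) documents with its frequency in the whole collection. The function names and the parameters `k` (number of feedback documents) and `m` (number of expansion terms) are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def rank_documents(query_terms, docs):
    """First-pass retrieval: rank tokenized docs by a simple TF-IDF score."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))

    def score(doc):
        tf = Counter(doc)
        return sum(tf[t] * math.log(1 + n / df[t]) for t in query_terms if df[t])

    # sorted() is stable, so tied documents keep their original order.
    return sorted(range(n), key=lambda i: score(docs[i]), reverse=True)


def expansion_terms(query_terms, docs, ranking, k=2, m=3):
    """Score candidate terms by how much more frequent they are in the
    top-k pseudo-relevant docs than in the collection (assumed scoring)."""
    feedback = [docs[i] for i in ranking[:k]]
    fb_counts = Counter(t for d in feedback for t in d)
    coll_counts = Counter(t for d in docs for t in d)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())

    scores = {}
    for term, count in fb_counts.items():
        if term in query_terms:
            continue  # only propose terms that are not already in the query
        p_fb = count / fb_total
        p_coll = coll_counts[term] / coll_total
        scores[term] = p_fb * math.log(p_fb / p_coll)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:m]


def expand_query(query, raw_docs, k=2, m=3):
    """Combine the original query terms with the ranked expansion terms."""
    query_terms = tokenize(query)
    docs = [tokenize(d) for d in raw_docs]
    ranking = rank_documents(query_terms, docs)
    return query_terms + expansion_terms(query_terms, docs, ranking, k, m)


# Toy collection, invented purely for illustration.
SAMPLE_DOCS = [
    "lithium battery electrode anode material",
    "battery electrode coating lithium cathode",
    "image sensor pixel array circuit",
    "wireless antenna signal circuit",
]

expanded = expand_query("lithium battery", SAMPLE_DOCS)
```

In this toy run the two battery documents are used as pseudo-relevant feedback, and `expanded` keeps the original terms first, followed by the added ones. The domain detection and expansion-difficulty estimate mentioned in the abstract are not modeled here.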