Sciencepaper Online (中国科技论文在线)

Upload date

March 21, 2008

[Journal article] A MIXED APPROACH TO SPOKEN LANGUAGE UNDERSTANDING

刘建毅, Jianyi Liu and Cong Wang

A Natural User Interface (NUI), where a user can type or speak a request, is a good complement to the well-known Graphical User Interface (GUI). Accurately extracting user intent from such typed or spoken queries is a very difficult challenge. Statistical and knowledge-based approaches are the two opposing families of methods, and each has advantages and disadvantages. This paper presents a mixed approach to spoken language understanding that tries to make the best use of both. The method was tested with real user data and achieved a task error rate of 1.94% and a semantic concept error rate of 5.73%.

Spoken language understanding, Statistical Classification, Grammar-based Parsing

Upload date

March 21, 2008

[Journal article] Keyword Extraction Using Language Network

刘建毅, Jianyi Liu, Jinghua Wang

Abstract

This paper introduces the language network and describes three kinds of such networks. Keyword extraction is an important technology in many areas of document processing. In particular, a keyword extraction algorithm based on a language network and PageRank is proposed. First, a semantic network is built for a single document; then PageRank is applied to the network to determine the importance of each word; finally, the top-ranked words are selected as the keywords of the document. The algorithm is tested on the CISTR corpus, and the experimental results show that it is practical and effective.
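The three steps in this abstract (build a word network, run PageRank, keep the top-ranked words) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the semantic network is built, so a simple co-occurrence window stands in for it, and the function names, window size, damping factor, and iteration count are all hypothetical choices rather than the authors' settings.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link two words when they occur within `window` tokens of each other.

    Assumption: a co-occurrence graph stands in for the paper's semantic network.
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected, unweighted graph."""
    nodes = list(graph)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbour shares its rank equally among its own neighbours.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1.0 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def extract_keywords(tokens, top_k=3, window=2):
    """Return the `top_k` words with the highest PageRank score."""
    rank = pagerank(build_cooccurrence_graph(tokens, window))
    return sorted(rank, key=rank.get, reverse=True)[:top_k]


# Toy document in which "hub" is adjacent to every other word.
toy_tokens = ["a", "hub", "b", "hub", "c", "hub", "d"]
top_word = extract_keywords(toy_tokens, top_k=1, window=1)
```

In the toy document, "hub" is the centre of a star-shaped graph, so it receives the highest score. A real system would also filter stop words and would probably weight edges, neither of which is shown here.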

Upload date

March 21, 2008

[Journal article] N-BEST SPEECH HYPOTHESIS REORDERING BASED ON COMPREHENSIVE INFORMATION THEORY

刘建毅, Jianyi Liu, Yixin Zhong

Abstract

This paper proposes a hypothesis reordering technique, based on a newly established theory, Comprehensive Information Theory, to improve the accuracy of speech recognition in a man-machine dialog system. For each hypothesis, we calculate the amount of comprehensive information that the hypothesis provides and then reorder the N-best hypotheses according to that amount. Experimental results show its effectiveness.

Upload date

March 21, 2008

[Journal article] New word identification based on statistical classifier

刘建毅, LIU Jian-yi, WANG Jing-hua, Wang Cong

THE JOURNAL OF CHINA UNIVERSITIES OF POSTS AND TELECOMMUNICATIONS, Volume 13, Issue 3, September 2006

Abstract

New word identification is a difficult problem in Chinese word segmentation. In the automatic segmentation of large Chinese texts, new words can cause segmentation mistakes. The paper defines new word identification as a binary classification problem, namely whether a character sequence in a certain context is a new word or not, and uses two statistical learning approaches based on the support vector machine (SVM) and C4.5. We then investigate various linguistic and statistical features, including the Independent Word Probability of the former character, the Independent Word Probability of the latter character, the front-position In-word Probability of the former character, the back-position In-word Probability of the latter character, Mutual Information, and frequency. In the PK-close test of the first Special Interest Group for Chinese Language Processing (SIGHAN) bakeoff, this approach achieves high precision and recall.

new word identification, support vector machine, decision tree
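Of the features listed in this abstract, Mutual Information is the easiest to illustrate in isolation. The abstract does not give the exact formula used, so the sketch below assumes the standard pointwise mutual information of two adjacent characters, log2(P(xy) / (P(x) P(y))), estimated from raw counts. The function name and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def pointwise_mutual_information(corpus, first, second):
    """PMI of the adjacent character pair (first, second) in `corpus`.

    Assumption: probabilities are plain relative frequencies with no smoothing.
    """
    char_counts = Counter(corpus)
    pair_counts = Counter(zip(corpus, corpus[1:]))
    pair_count = pair_counts[(first, second)]
    if pair_count == 0:
        # The pair never occurs, so the log ratio is undefined; report -inf.
        return float("-inf")
    p_pair = pair_count / (len(corpus) - 1)
    p_first = char_counts[first] / len(corpus)
    p_second = char_counts[second] / len(corpus)
    return math.log2(p_pair / (p_first * p_second))


toy_corpus = "abcabcxy"
pmi_seen = pointwise_mutual_information(toy_corpus, "a", "b")
pmi_unseen = pointwise_mutual_information(toy_corpus, "b", "x")
```

A clearly positive score suggests that the two characters co-occur more often than chance and may belong to the same word. A classifier such as an SVM or C4.5 would combine this value with the other features rather than threshold it alone.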

Upload date

March 21, 2008

[Journal article] Research on Text Network Representation

刘建毅, Jianyi Liu, Jinghua Wang, and Cong Wang

Abstract

Text representation is the basis of text processing. Most current text representation models do not consider the relations between words, which results in the loss of the text's structural information, and that information is important for understanding the text. This paper proposes a novel text representation model that uses a lexical network to represent the text and retains the text's structure. According to the different levels of word relations, co-occurrence, syntactic, and semantic networks are introduced. The text network representation was applied to text classification to measure the representation ability of the model. The experimental results show that the text network representation is superior to the vector space model.
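The claim that a bag-of-words vector loses structure that a word network retains can be seen in a small example. The sketch below is an illustration only: it assumes a directed, weighted co-occurrence network over adjacent words, which may differ from the networks defined in the paper, and the function names are hypothetical.

```python
from collections import Counter


def bag_of_words(tokens):
    """Vector-space style representation: word counts only, no order."""
    return Counter(tokens)


def cooccurrence_network(tokens, window=1):
    """Directed, weighted edges from each word to the words that follow it.

    Assumption: edge direction follows word order; the paper may define it differently.
    """
    edges = Counter()
    for i, word in enumerate(tokens):
        for follower in tokens[i + 1 : i + 1 + window]:
            edges[(word, follower)] += 1
    return edges


text_a = "dog bites man".split()
text_b = "man bites dog".split()

same_bag = bag_of_words(text_a) == bag_of_words(text_b)
same_network = cooccurrence_network(text_a) == cooccurrence_network(text_b)
```

The two sentences have identical word counts, so `same_bag` is true, while their edge sets differ, so `same_network` is false. The syntactic and semantic networks mentioned in the abstract would need a parser or a lexical resource and are not sketched here.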
