21 results found for this scholar

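Uploaded

October 31, 2005

[Journal article] Optimizing Path Expression Queries of XML Data (XML数据的路径表达式查询优化技术)

Ge Yu (于戈), Jianhua Lü (吕建华), Guoren Wang (王国仁)

Journal of Software (软件学报), 2003, 14(9): 1615-1620

Abstract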

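Path expressions are the core of XML query languages. Many methods for evaluating them have been published, but comparatively little work optimizes the path expressions themselves. This paper proposes two optimization strategies for path expressions, the path-shortening strategy and the complementary-path strategy, to improve the efficiency of XML path queries. Path shortening uses the schema information of the XML document to reduce the length of a path expression, simplifying the query and thereby lowering its cost. The complementary-path strategy tries to replace the original query with an equivalent path expression of lower cost. Analysis of the experimental data shows that the two strategies apply to the vast majority of path expression queries and can substantially improve their performance.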

Keywords: XML, path expression, query processing, query cost, query optimization
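
The path-shortening idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the rewrite rules, so the rule used here (replace an absolute child-axis path with `//name` when the schema allows exactly one root-to-element path for `name`) is an assumption, as are the function names and the toy schema. The schema is assumed to be tree-shaped and non-recursive.

```python
from collections import defaultdict


def root_paths(schema, root):
    """Map each element name to every root-to-element label path the schema allows.

    `schema` maps an element name to the list of its child element names.
    Assumes the schema is non-recursive, otherwise the walk would not end.
    """
    paths = defaultdict(list)
    stack = [(root, (root,))]
    while stack:
        name, path = stack.pop()
        paths[name].append(path)
        for child in schema.get(name, []):
            stack.append((child, path + (child,)))
    return paths


def shorten(path, schema, root):
    """Rewrite /a/b/c as //c when the schema permits only that one path to c."""
    steps = tuple(path.strip("/").split("/"))
    candidates = root_paths(schema, root)[steps[-1]]
    if candidates == [steps]:
        return "//" + steps[-1]
    # More than one possible location (or an unknown path): keep the query as is.
    return path


# Hypothetical schema used only for illustration.
schema = {
    "bib": ["book", "article"],
    "book": ["title", "publisher"],
    "article": ["title"],
    "publisher": ["name"],
}
print(shorten("/bib/book/publisher/name", schema, "bib"))
print(shorten("/bib/book/title", schema, "bib"))
```

In this toy schema `name` can occur in only one place, so the long path and `//name` select the same nodes in any conforming document, while `title` occurs under both `book` and `article` and is left unchanged. Whether the shorter form is actually cheaper depends on the query evaluator, for example on whether it has an index on element names.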

Uploaded

October 31, 2005

[Journal article] Evaluating Document-to-Document Relevance Based on Document Language Model: Modeling, Implementation and Performance Evaluation

Ge Yu, Xiaoguang Li, Yubin Bao, and Daling Wang

CICLing 2005, LNCS 3406, pp. 593-603, 2005.

Abstract

Evaluating document-to-document relevance is very important to many advanced applications such as IR, text mining, and natural language processing. Because users' uncertainty makes document relevance very hard to define mathematically, the concept of topical relevance is widely accepted in most research fields. It suggests that a document relevance model should explain whether the document representation describes the document's topical content and whether the matching method reveals the topical differences among documents. However, current document-to-document relevance models, such as the vector space model and string distance, do not explicitly emphasize topical relevance. This paper uses a document language model to represent a document's topical content, explains why the model can reveal the document's topics, and then establishes two distributional similarity measures based on the document language model to evaluate document-to-document relevance. An experiment on the TREC test collection compares the approach with the vector space model, and the results show that the Kullback-Leibler divergence measure with Jelinek-Mercer smoothing significantly outperforms the vector space model.
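
The Kullback-Leibler measure with Jelinek-Mercer smoothing named in the abstract above can be sketched as follows. This uses the textbook definitions, not necessarily the paper's exact formulation: the interpolation weight `lam`, the direction of the divergence, the function names, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def jm_model(tokens, collection_counts, collection_size, lam=0.5):
    """Return p(w | d) under Jelinek-Mercer smoothing.

    p(w | d) = (1 - lam) * tf(w, d) / |d| + lam * cf(w) / |C|
    """
    counts = Counter(tokens)
    length = len(tokens)

    def prob(word):
        p_doc = counts[word] / length if length else 0.0
        p_col = collection_counts[word] / collection_size
        return (1 - lam) * p_doc + lam * p_col

    return prob


def kl_divergence(p, q, vocabulary):
    """KL(p || q) summed over the collection vocabulary.

    With lam > 0 every vocabulary word has non-zero probability under q,
    so the logarithm is always defined.
    """
    return sum(p(w) * math.log(p(w) / q(w)) for w in vocabulary if p(w) > 0)


# Toy collection used only for illustration.
docs = [
    ["xml", "query", "path"],
    ["xml", "query", "cost"],
    ["image", "threshold", "segment"],
]
collection_counts = Counter(w for d in docs for w in d)
collection_size = sum(collection_counts.values())
vocabulary = list(collection_counts)
models = [jm_model(d, collection_counts, collection_size) for d in docs]

near = kl_divergence(models[0], models[1], vocabulary)
far = kl_divergence(models[0], models[2], vocabulary)
print(near < far)
```

A smaller divergence means the two document models are closer, so the first two documents, which share two of three terms, come out as more related than the first and third. Smoothing with the collection model is what keeps the divergence finite when a word of one document is absent from the other.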

Uploaded

October 31, 2005

[Journal article] MMPClust: A Skew Prevention Algorithm for Model-Based Document Clustering

Xiaoguang Li, Ge Yu, and Daling Wang

DASFAA 2005, LNCS 3453, pp. 536-547, 2005.

Abstract

To support very high dimensionality, model-based clustering is an intuitive choice for document clustering. However, current model-based algorithms are prone to generating skewed clusters, which seriously degrade clustering quality. In this paper, the causes of skew are examined and identified as an inappropriate initial model, a poorly fitting cluster model, and the interaction between the decentralization of estimation samples and an over-generalized cluster model. This paper proposes a skew-prevention document clustering algorithm (MMPClust), which has two features: (1) a content-based cluster model is used to model each cluster better; (2) at the re-estimation step, the documents most relevant to the corresponding class are selected automatically for each cluster as the estimation samples, which breaks this interaction. MMPClust has fewer restrictions and wider applicability in document clustering than previous methods.

Uploaded

October 31, 2005

[Journal article] Efficiently Mapping Integrity Constraints from Relational Database to XML Document

Xiaochun Yang, Ge Yu, and Guoren Wang

ADBIS 2001, LNCS 2151, pp. 338-351, 2001.

Abstract

XML is rapidly emerging as the dominant standard for exchanging data on the WWW. Most application data is stored in relational databases because of their popularity and the rich development experience built on them. Therefore, how to provide a proper mapping from relational data to XML documents has become an important topic. Integrity constraints are useful for semantic specification and play an important role in relational schema definition. The existing XML schema language does not define general constraints or a method for maintaining integrity constraints. How to use XML to express and maintain integrity constraints, especially advanced ones such as general constraints on relational data, is therefore a challenging research issue. In this paper, a novel mapping approach is proposed to map relational data to XML documents with active nodes (XMLA) and an extended DTD with constraints (DTDC). The ability to maintain integrity constraints makes our approach more effective than other approaches.

Uploaded

October 31, 2005

[Journal article] An Efficient Iterative Optimization Algorithm for Image Thresholding

Liju Dong and Ge Yu

CIS 2004, LNCS 3314, pp. 1079-1085, 2004.

Abstract

Image thresholding is one of the main techniques for image segmentation. It has many applications in pattern recognition, computer vision, and image and video understanding. This paper formulates thresholding as an optimization problem: finding the best thresholds that minimize a weighted sum-of-squared-error function. A fast iterative optimization algorithm is presented to reach this goal. Our algorithm is compared with a classic and most commonly used thresholding approach. Both theoretical analysis and experiments show that the two approaches are equivalent. However, our formulation of the problem allows us to develop a much more efficient algorithm, which has more applications, especially in real-time video surveillance and tracking systems.
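
As a rough illustration of iterative threshold selection by squared-error minimization, here is a generic two-class, k-means style iteration for a single threshold. It is a sketch under stated assumptions and not the algorithm of the paper: the abstract does not give the weights of its error function or its update rule, so the unweighted within-class squared error, the function name, the stopping tolerance, and the sample pixel values are all assumptions.

```python
def iterative_threshold(pixels, tol=0.5, max_iter=100):
    """Pick one threshold by alternating between splitting and re-centering.

    Start from the global mean, split the pixels at the current threshold,
    then move the threshold to the midpoint of the two class means. After
    the first split, each round cannot increase the within-class sum of
    squared errors, so the loop settles on a local minimum.
    """
    t = sum(pixels) / len(pixels)
    for _ in range(max_iter):
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            # All pixels fell on one side; no meaningful split exists.
            break
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t
    return t


# Hypothetical gray levels: a dark cluster and a bright cluster.
pixels = [10, 12, 11, 13, 200, 205, 198, 202]
threshold = iterative_threshold(pixels)
print(threshold)
```

For real images the same loop is usually run on the gray-level histogram rather than on the raw pixel list, which makes each round cost proportional to the number of gray levels instead of the number of pixels.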

Collaborating scholars

  • Ge Yu (于戈)

    Northeastern University (东北大学), Liaoning