[Journal article] Arnetminer: expertise oriented search using social networks
Juanzi Li, Jie Tang, Jing Zhang, Qiong Luo, Yunhao Liu, Mingcai Hong
Front. Comput. Sci. China
Expertise Oriented Search (EOS) aims at providing comprehensive expertise analysis on data from distributed sources. It is useful in many application domains, for example, finding experts on a given topic, detecting conflicts of interest between researchers, and assigning reviewers to proposals. In this paper, we present the design and implementation of our expertise oriented search system, Arnetminer (http://www.arnetminer.net). Arnetminer has gathered and integrated information about a half-million computer science researchers from the Web, including their profiles and publications. Moreover, Arnetminer constructs a social network among these researchers through their co-authorship, and uses this network information as well as the individual profiles to facilitate expertise oriented search tasks. In particular, the co-authorship information is used both in ranking the expertise of individual researchers for a given topic and in searching for associations between researchers. We have conducted initial experiments on Arnetminer. Our results demonstrate that the proposed relevancy propagation expert finding method outperforms the method that uses only a person's local information, and that the proposed two-stage association search on a large-scale social network is orders of magnitude faster than the baseline method.
social network, expertise search, association search
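The abstract above mentions propagating relevancy over the co-authorship network but does not give the formula. As a rough illustration only, the sketch below assumes a simple scheme in which each researcher's score is their own local score plus a weighted average of their co-authors' scores. The function name `propagate`, the `weight` parameter, and the sample data are all hypothetical and are not taken from the Arnetminer paper.

```python
def propagate(local_scores, coauthors, weight=0.5, rounds=1):
    """Illustrative relevancy propagation (assumed scheme, not Arnetminer's).

    local_scores: dict person -> relevance computed from that person's own data
    coauthors:    dict person -> list of co-authors (network edges)
    weight:       how much of the neighbours' average score is added
    """
    scores = dict(local_scores)
    for _ in range(rounds):
        updated = {}
        for person, own in local_scores.items():
            neighbours = coauthors.get(person, [])
            # Average score currently held by this person's co-authors.
            received = (
                sum(scores[n] for n in neighbours) / len(neighbours)
                if neighbours
                else 0.0
            )
            updated[person] = own + weight * received
        scores = updated
    return scores


# Hypothetical data: B has no local evidence but co-authors with A and C.
local_scores = {"A": 1.0, "B": 0.0, "C": 0.5}
coauthors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
scores = propagate(local_scores, coauthors)
print(scores)
```

In this toy example, B receives a non-zero score purely through co-authorship, which is the kind of effect a method using only local information cannot capture.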
[Journal article] iASA: Learning to Annotate the Semantic Web
Jie Tang, Juanzi Li, Hongjun Lu, Bangyong Liang, Xiaotong Huang, Kehong Wang
With the advent of the Semantic Web, there is a great need to upgrade existing web content to semantic web content. This can be accomplished through semantic annotations. Unfortunately, manual annotation is tedious, time-consuming and error-prone. In this paper, we propose a tool, called iASA, that learns to automatically annotate web documents according to an ontology. iASA is based on the combination of information extraction (specifically, the Similarity-based Rule Learner, SRL) and machine learning techniques. Using linguistic knowledge and optimal dynamic window size, SRL produces annotation rules of better quality than comparable semantic annotation systems. Similarity-based learning efficiently reduces the search space by avoiding pseudo rule generalization. In the annotation phase, iASA exploits ontology knowledge to refine the annotation it proposes. Moreover, our annotation algorithm exploits machine learning methods to correctly select instances and to predict missing instances. Finally, iASA provides an explanation component that explains the nature of the learner and annotator to the user. Explanations can greatly help users understand the rule induction and annotation process, so that they can focus on correcting rules and annotations quickly. Experimental results show that iASA can reach high accuracy quickly.
[Journal article] Tree-structured Conditional Random Fields for Semantic Annotation
Jie Tang, Mingcai Hong, Juanzi Li, and Bangyong Liang
The large volume of web content needs to be annotated by ontologies (called Semantic Annotation), and our empirical study shows that strong dependencies exist across different types of information (that is, identifying one kind of information can help identify another kind). Conditional Random Fields (CRFs) are the state-of-the-art approach for modeling these dependencies to achieve better annotation. However, as information on a Web page is not necessarily laid out linearly, the previous linear-chain CRFs have their limitations in semantic annotation. This paper is concerned with semantic annotation on hierarchically dependent data (hierarchical semantic annotation). We propose a Tree-structured Conditional Random Field (TCRF) model to better incorporate dependencies across the hierarchically laid-out information. Methods for performing the tasks of model-parameter estimation and annotation in TCRFs are proposed. Experimental results indicate that the proposed TCRFs for hierarchical semantic annotation can significantly outperform the existing linear-chain CRF model.
[Journal article] Using Bayesian decision for ontology mapping
Jie Tang, Juanzi Li, Bangyong Liang, Xiaotong Huang, Yi Li, Kehong Wang
Web Semantics: Science, Services and Agents on the World Wide Web 4 (2006) 243–262
Ontology mapping is the key to achieving interoperability across ontologies. In the semantic web environment, ontologies are usually distributed and heterogeneous, so it is necessary to find the mapping between them before processing across them. Many efforts have been made to automate the discovery of ontology mappings. However, some problems remain. In this paper, ontology mapping is formalized as a decision-making problem, so that discovering the optimal mapping is cast as finding the decision with minimal risk. An approach called Risk Minimization based Ontology Mapping (RiMOM) is proposed, which automates the discovery of 1:1, n:1, 1:null and null:1 mappings. Based on normalization and NLP techniques, the problem of instance heterogeneity in ontology mapping is resolved to a certain extent. To deal with name conflicts in the mapping process, we use a thesaurus and statistical techniques. Experimental results indicate that the proposed method can significantly outperform the baseline methods, and also obtains improvement over the existing methods.
Ontology mapping, Semantic web, Bayesian decision, Ontology interoperability
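The abstract above frames mapping discovery as choosing the decision with minimal risk. The sketch below shows the generic Bayesian decision rule under that framing: for each candidate mapping, compute the expected loss over the posterior and pick the smallest. It is not the RiMOM implementation; the helper names, the 0-1 loss, and the posterior values are hypothetical and chosen only for illustration.

```python
def expected_risk(action, posteriors, loss):
    """Expected loss of choosing `action`, averaged over the posterior."""
    return sum(p * loss(action, state) for state, p in posteriors.items())


def min_risk_mapping(posteriors, loss):
    """Return the candidate whose expected loss is smallest."""
    return min(posteriors, key=lambda a: expected_risk(a, posteriors, loss))


def zero_one_loss(action, state):
    """Assumed loss: no cost for the correct mapping, unit cost otherwise."""
    return 0.0 if action == state else 1.0


# Hypothetical posterior over target entities for one source entity;
# "null" stands for "no counterpart in the target ontology" (a 1:null mapping).
posteriors = {"writer": 0.7, "publisher": 0.2, "null": 0.1}
best = min_risk_mapping(posteriors, zero_one_loss)
print(best)
```

With a 0-1 loss, the rule reduces to picking the most probable candidate; a different loss function (for example, one that penalizes wrong mappings more than missed ones) would shift the decision toward "null".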
[Journal article] Chapter I Information Extraction: Methodologies and Applications
Jie Tang, Mingcai Hong, Duo Zhang, Juanzi Li, Bangyong Liang
This chapter is concerned with the methodologies and applications of information extraction. Information is hidden in the large volume of Web pages, and thus it is necessary to extract useful information from Web content, a task called information extraction. In information extraction, given a sequence of instances, we identify and pull out a subsequence of the input that represents the information we are interested in. In recent years, there has been a rapid expansion of activity in the information extraction area. Many methods have been proposed for automating the process of extraction. However, due to the heterogeneity and the lack of structure of Web data, automated discovery of targeted or unexpected knowledge still presents many challenging research problems. In this chapter, we investigate the problems of information extraction and survey existing methodologies for solving these problems. Several real-world applications of information extraction are introduced, and emerging challenges are discussed.