Aoying Zhou (周傲英)
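- Name: Aoying Zhou (周傲英)
- Academic titles: Doctoral supervisor; recipient of the National Science Fund for Distinguished Young Scholars
- Discipline: Computer science and technology
- Research interests: Data management and information systems, including: Web data management, Chinese Web infrastructure, Web search and mining; data streams and data mining, complex event processing and real-time business intelligence, uncertain data management and its applications; data-intensive computing, distributed storage and computing, peer-to-peer computing systems and their data management, Web services computing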
Aoying Zhou, Professor, Executive Vice Dean of the School of Software
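Professional service:
Member, Teaching Steering Committee for Computer Science and Technology, Ministry of Education
Member and Vice Chair, Database Technical Committee, China Computer Federation (CCF)
Vice Chair, Database Technical Committee, Shanghai Computer Society
Vice Chair, ACM SIGMOD China
Honorary Member, Academic Committee, CCF Young Computer Scientists & Engineers Forum (YOCSEF)
Editorial board member, International Journal on Very Large Data Bases
Editorial board member, Journal of Computer Science and Technology
Editorial board member, Frontiers of Computer Science in China
Editorial board member, International Journal of Software and Informatics
Editorial board member, Chinese Journal of Computers (计算机学报)
Editorial board member, Journal of Frontiers of Computer Science and Technology (计算机科学与探索)
Program committee member for international conferences including ICDE'2005 (demo), ICDCS'2005, VLDB'2005, EDBT'2006, SIGMOD'2007/2008, SIGIR'2007/2008, and WWW'2008/2009.
PC Vice-Chair, ICDE'2009
Program Committee Chair, WAIM'2000; General Chair, ER'04.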
[Journal paper] Distributed Data Stream Clustering: A Fast EM-based Approach
Aoying Zhou, Feng Cao, Ying Yan, Chaofeng Sha, Xiaofeng He
Clustering data streams has attracted considerable research effort recently. However, the problem has not received enough attention when the data streams are generated in a distributed fashion, even though such a scenario is very common in real-life applications. Several factors constrain the clustering of data streams in a distributed environment: the data records generated are noisy or incomplete because the distributed system is unreliable; the system needs to process a huge volume of data online; and communication is a potential bottleneck of the system. All these factors pose great challenges for clustering distributed data streams. In this paper, we propose an EM-based (Expectation Maximization) framework to effectively cluster distributed data streams with the above fundamental challenges in mind. In the presence of noisy or incomplete data records, our algorithms learn the distribution of the underlying data streams by maximizing the likelihood of the data clusters. A test-and-cluster strategy is proposed to reduce the average processing cost, which is especially effective for online clustering over large data streams. Our extensive experimental studies show that the proposed algorithms can achieve high accuracy with less communication cost, memory consumption, and CPU time.
[Journal paper] Sonnet: An Efficient Distributed Content-based Dissemination Broker
Aoying Zhou, Weining Qian, Xueqing Gong, and Minqi Zhou
SIGMOD'07, June 12-14, 2007
In this demonstration, we present a prototype content-based dissemination broker, called Sonnet, which is built upon a structured overlay network. It combines approximate filtering of XML packets with routing in the overlay network. Carefully designed optimization techniques are implemented. The running and tracing of the system in a real-life application are to be demonstrated.
Keywords: Distributed publish/subscribe, XML data dissemination, approximate filtering, path digest
[Journal paper] Adaptive Probabilistic Search Over Unstructured Peer-to-Peer Computing Systems
Aoying Zhou, Linhao Xu, Chenyun Dai
World Wide Web (2006) 9: 537-556
A challenging problem that confronts unstructured peer-to-peer (P2P) computing systems is how to provide efficient support for locating desired files. This paper addresses this problem by using quantitative information in the form of probabilistic knowledge. Two types of probabilistic knowledge are considered: overlap between topics shared in the network and coverage of topics at each individual peer. Based on this probabilistic knowledge, the paper proposes an adaptive probabilistic search algorithm that can efficiently support the file locating operation in an unstructured P2P network. Then, an update algorithm is devised to keep the probabilistic knowledge of individual peers fresh by taking advantage of feedback from previous user queries. Finally, extensive experiments are conducted to evaluate the efficiency and effectiveness of the proposed method.
Keywords: P2P computing, probabilistic search, query routing
Jeffrey Xu Yu, Zhihong Chong, Hongjun Lu, Aoying Zhou
The problem of finding frequent items has recently been studied over high speed data streams. However, mining frequent itemsets from transactional data streams has not yet been well addressed in terms of its bounds of memory consumption. The main difficulty is due to the exponential explosion of itemsets. Given a domain of I unique items, the possible number of itemsets can be up to 2^I - 1. When the length of a data stream approaches a very large number N, the possibility of an itemset being frequent becomes larger and difficult to track with limited memory. However, the real killer of effective frequent itemset mining is that most existing algorithms are false-positive oriented. That is, they control memory consumption in the counting processes by an error parameter ε, and allow items with support below the specified minimum support s but above s - ε to be counted as frequent ones. Such false-positive items increase the number of false-positive frequent itemsets exponentially, which may make the problem computationally intractable with bounded memory consumption. In this paper, we developed algorithms that can effectively mine frequent item(set)s from high speed transactional data streams with a bound of memory consumption. While our algorithms are false-negative oriented, that is, certain frequent itemsets may not appear in the results, the number of false-negative itemsets can be controlled by a predefined parameter so that the desired recall rate of frequent itemsets can be guaranteed. We developed algorithms based on the Chernoff bound. Our extensive experimental studies ...
Jeffrey Xu Yu, Zhihong Chong, Hongjun Lu, Zhenjie Zhang, Aoying Zhou
Information Sciences, 176 (2006): 1986-2015
Mining frequent itemsets from transactional data streams is challenging due to the exponential explosion of itemsets and the limited memory space available for mining frequent itemsets. Given a domain of I unique items, the possible number of itemsets can be up to 2^I - 1. When the length of a data stream approaches a very large number N, the possibility of an itemset being frequent becomes larger and difficult to track with limited memory. The existing studies on finding frequent items from high speed data streams are false-positive oriented. That is, they control memory consumption in the counting processes by an error parameter ε, and allow items with support below the specified minimum support s but above s - ε to be counted as frequent ones. However, such false-positive oriented approaches cannot be effectively applied to frequent itemset mining for two reasons. First, the false-positive items found increase the number of false-positive frequent itemsets exponentially. Second, minimizing the number of false-positive items found, by using a small ε, will make memory consumption large. Therefore, such approaches may make the problem computationally intractable with bounded memory consumption. In this paper, we developed algorithms that can effectively mine frequent item(set)s from high speed transactional data streams with a bound of memory consumption. Our algorithms are based on the Chernoff bound, in which we use a running error parameter to prune item(set)s and a reliability parameter to control memory. While our algorithms are false-negative oriented, that is, certain frequent itemsets may not appear in the results, the number of false-negative itemsets can be controlled by a predefined parameter so that the desired recall rate of frequent itemsets can be guaranteed. Our extensive experimental studies show that the proposed algorithms have high accuracy, require less memory, and consume less CPU time. They significantly outperform the existing false-positive algorithms.
Keywords: Data stream, Frequent pattern mining, Memory minimization
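The abstract above mentions a running error parameter derived from the Chernoff bound and a reliability parameter. The following is a minimal sketch of how such a running error could be computed, not the authors' implementation. It assumes one common simplified form of the bound, Pr[|f - p| >= ε] <= 2·exp(-n·ε²/(2p)), with p replaced by the minimum support s; the constant in the exponent varies between statements of the bound. The names `running_error`, `keep_candidate`, and `delta` are hypothetical.

```python
import math


def running_error(min_support: float, delta: float, n: int) -> float:
    """Illustrative Chernoff-style running error after n observations.

    Solves 2 * exp(-n * eps**2 / (2 * s)) = delta for eps, where s is the
    minimum support and delta is the (assumed) reliability parameter.
    """
    if not (0.0 < min_support <= 1.0):
        raise ValueError("min_support must be in (0, 1]")
    if not (0.0 < delta < 1.0):
        raise ValueError("delta must be in (0, 1)")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(2.0 * min_support * math.log(2.0 / delta) / n)


def keep_candidate(count: int, n: int, min_support: float, delta: float) -> bool:
    """Assumed pruning rule: keep an item(set) while its observed frequency
    is not below s - eps_n; otherwise it may be dropped to save memory."""
    eps_n = running_error(min_support, delta, n)
    return count / n >= min_support - eps_n
```

Because the error shrinks as n grows, the pruning threshold tightens toward s over time, which is consistent with the abstract's claim that memory stays bounded while recall is controlled by a predefined parameter.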
[Journal paper] C2: a new overlay network based on CAN and Chord
Wenyuan Cai, Shuigeng Zhou, Weining Qian, Linhao Xu, Kian-Lee Tan, Aoying Zhou
Int. J. High Performance Computing and Networking, Vol. x, No. x, 200x
In this paper, we present C2, a new overlay network based on CAN and Chord. It is primarily designed for a dynamic environment in which peers join and depart the network frequently. In an n-peer C2 system, each peer maintains information about only O(log n) other peers and achieves routing within O(log n) hops. For each peer's joining or departure, C2 can, with high probability, update the routing tables with no more than O(log n) messages. What distinguishes C2 from many other peer-to-peer data sharing systems is its low computation cost and its high routing efficiency in a dynamic network. Even when a considerable number of peers fail simultaneously, i.e., several other peers' routing tables are out of date, the average number of hops for successful routing remains acceptable.
Keywords: Distributed Computing, Peer-to-Peer Computing, Overlay Network, Chord, CAN
[Journal paper] QoS-Aware Composite Services Retrieval
Xiao-Ling Wang, Sheng Huang, and Ao-Ying Zhou
J. Comput. Sci. & Technol., July 2006, Vol. 21, No. 4, pp. 547-558
In current service-oriented applications, an individual web service usually cannot meet the requirements arising from real-world applications, so it is necessary to combine the functionalities of different web services to obtain a composite service in response to users' service requests. To address the problem of web service composition, this paper proposes an efficient approach to composing basic services when no individual service can fully satisfy users' requests. Whereas most previously proposed approaches produce only the best composition solution, the QoS-aware service composition approach presented here provides the top-k solutions, which allows more candidates that are likely to meet the requirements of the users. The approach is based on a succinct binary tree data structure, and a system named ATC (Approach to Top-k Composite services retrieval) is implemented. In ATC, QoS is taken into account for composite services, and a heuristic-based search method is proposed to retrieve the top-k composite services. Extensive experiments are designed and two web service benchmarks are used for the performance study. The experimental results show that the proposed approach can assure high precision and efficiency for composite service search.
Keywords: web service, service composition, top-k retrieval
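Cheqing Jin (金澈清), Weining Qian (钱卫宁), Aoying Zhou (周傲英)
Journal of Software (软件学报), 2004, 15(8): 1172-1181
Research on the analysis and management of streaming data is currently a hot topic in the international database research community. Over the past 30-plus years, traditional database technology has developed rapidly and been widely applied, yet it cannot handle a new kind of data, namely streaming data, generated in applications such as network routing, sensor networks, and stock analysis. Streaming data arrives continuously, at high speed, and in massive volume. The core of research on it is the design of efficient single-pass scanning algorithms that, within a memory space far smaller than the data size, continuously update a structure representing the data set (a synopsis data structure), so that approximate query results can be obtained quickly from this structure at any time. This paper surveys international research on the construction and maintenance of synopsis data structures for streaming data, and compares the characteristics, strengths, and weaknesses of various algorithms by enumerating solutions to two important problems on streaming data.
Keywords: streaming data, synopsis data structure, landmark model, sliding window model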
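Aoying Zhou (周傲英), Zhengchuan Xu (胥正川), Zhimao Guo (郭志懋), Shuigeng Zhou (周水庚)
Chinese Journal of Computers (计算机学报), 2004, 27(4): 433-441
The query processing efficiency of an XML management system depends largely on the storage schema of the XML data in the system. When user queries are known or predictable, designing the storage schema according to those queries can improve the system's query processing efficiency. This paper introduces the adaptive adjustment mechanism for the storage schema of the VXMLR system. Based on historical query information, VXMLR adaptively adjusts its storage schema to improve query processing efficiency. The basic idea is as follows: first, appropriate mapping rules are derived from historical queries to obtain the storage schema of XML documents in a relational database; then, under a given space constraint, a knapsack-problem algorithm is used, based on the historical queries, to select relational tables for vertical partitioning or redundant storage of related data, so that queries access as little irrelevant data as possible. VXMLR provides four storage schema adjustment strategies, two of which support adaptive storage schema adjustment. Experimental results show that the proposed method is effective.
Keywords: XML data management, storage schema, adaptive schema adjustment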
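The abstract above describes choosing, under a space constraint, which relational tables to partition vertically or store redundantly by solving a knapsack problem. The sketch below illustrates that selection step as a textbook 0/1 knapsack solved by dynamic programming. It is an assumption-laden illustration, not the VXMLR implementation: the function name `select_tables`, the candidate names, the integer space costs, and the benefit scores (imagined as query-cost savings estimated from historical queries) are all hypothetical.

```python
from typing import List, Tuple

# (candidate name, extra space cost in integer units, estimated benefit)
Candidate = Tuple[str, int, float]


def select_tables(candidates: List[Candidate], budget: int) -> Tuple[float, List[str]]:
    """Pick candidates maximizing total benefit with total cost <= budget.

    Standard 0/1 knapsack: best[b] holds the best (benefit, chosen names)
    achievable using at most b units of space.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    best: List[Tuple[float, List[str]]] = [(0.0, [])] * (budget + 1)
    for name, cost, benefit in candidates:
        if cost <= 0:
            raise ValueError("space cost must be a positive integer")
        # Iterate budgets downward so each candidate is used at most once.
        for b in range(budget, cost - 1, -1):
            with_item = best[b - cost][0] + benefit
            if with_item > best[b][0]:
                best[b] = (with_item, best[b - cost][1] + [name])
    return best[budget]


if __name__ == "__main__":
    # Hypothetical candidates; the names and numbers are made up for illustration.
    demo = [("order_items", 4, 10.0), ("customer_addr", 3, 7.0), ("product_desc", 5, 12.0)]
    print(select_tables(demo, budget=8))
```

The dynamic program runs in O(number of candidates × budget) time, which is practical when the space budget is expressed in coarse units such as pages or megabytes.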
[Journal paper] Data Management in Peer-to-Peer Environment: A Perspective of BestPeer
ZHOU AoYing, QIAN WeiNing, ZHOU ShuiGeng, LING Bo, XU LinHao, Ng Wee Siong, Ooi Beng Chin and Tan Kian-Lee
J. Comput. Sci. & Technol., July 2003, Vol. 18, No. 4, pp. 452-461
Peer-to-Peer (P2P) systems have attracted much attention in the academic community and industry circles due to their promising applications in various domains. This paper presents the authors' research efforts on introducing complex query capabilities in a P2P environment consisting of numerous peers with large volumes of data. An underlying hybrid P2P computing platform, named BestPeer, is described first. The connection among peers within BestPeer is self-configurable through maintaining the nearest neighbors of peers, and the agent techniques employed in the system ensure its capability of providing sophisticated services. The designs of three P2P data management systems, which are all based on BestPeer, are described in detail. They provide support for information retrieval, query processing, and Web services respectively. Advantages and limitations are discussed, and ongoing work is presented. Current systems can provide basic functions for keyword-based search, SQL-like query processing, and Web services querying and discovery. Some further topics on providing fully-fledged data management functionalities for P2P distributed computing systems with security guarantees are also discussed.
Keywords: peer-to-peer computing, BestPeer, data management, information retrieval, Web service