An Optimization Model for Outlier Detection in Categorical Data
First published: 2005-03-31
Abstract: The task of outlier detection is to find small groups of data objects that are exceptional compared with the large remainder of the data. Detecting such outliers is important for applications such as fraud detection and customer migration. Most existing methods are designed for numeric data and run into problems in real-life applications that contain categorical data. In this paper, we formally define outlier detection in categorical data as an optimization problem from a global viewpoint. We also present a local-search heuristic algorithm for efficiently finding feasible solutions. Experimental results on real datasets and large synthetic datasets demonstrate the superiority of our model and algorithm.
Keywords: Outlier, Optimization, Local Search, Entropy, Data Mining
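
To make the idea of treating outlier detection as an optimization problem concrete, the following is a minimal sketch, not the authors' algorithm. It assumes, based only on the "Entropy" keyword, that the objective is the sum of per-attribute Shannon entropies of the data left after removing k objects, and that the local search works by swapping one candidate outlier at a time. The names `dataset_entropy`, `local_search_outliers`, and `max_passes` are hypothetical.

```python
import math
from collections import Counter


def dataset_entropy(rows):
    """Sum of per-attribute Shannon entropies (base 2) over the given rows."""
    if not rows:
        return 0.0
    n = len(rows)
    total = 0.0
    for column in zip(*rows):  # one tuple of values per attribute
        for count in Counter(column).values():
            p = count / n
            total -= p * math.log2(p)
    return total


def local_search_outliers(rows, k, max_passes=10):
    """Heuristically pick k row indices whose removal lowers the entropy
    of the remaining rows (assumed objective, see the text above)."""
    if not 0 <= k <= len(rows):
        raise ValueError("k must be between 0 and the number of rows")

    def cost(excluded):
        return dataset_entropy(
            [row for i, row in enumerate(rows) if i not in excluded]
        )

    outliers = set(range(k))  # arbitrary initial labelling
    best = cost(outliers)
    for _ in range(max_passes):
        improved = False
        for i in range(len(rows)):
            if i in outliers:
                continue
            # Try exchanging row i with each current outlier and keep
            # the exchange that lowers the objective the most.
            best_swap = None
            for j in outliers:
                candidate = (outliers - {j}) | {i}
                c = cost(candidate)
                if c < best - 1e-12:
                    best, best_swap = c, candidate
            if best_swap is not None:
                outliers = best_swap
                improved = True
        if not improved:  # local optimum reached
            break
    return sorted(outliers), best
```

On ten rows where eight are `("a", "x")` and the rows at indices 3 and 7 are `("b", "y")` and `("c", "z")`, calling `local_search_outliers(rows, 2)` selects indices 3 and 7 and leaves a remainder with zero entropy. Being a local search, the sketch can stop at a local optimum on harder inputs, and it recomputes the objective from scratch for every swap, which a practical implementation would avoid.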