Outlier and Anomaly Detection

Curated by: Varun Chandola and Vipin Kumar

Anomalies are the unusual, unexpected, surprising patterns in the observed world. Identifying, understanding, and predicting anomalies from data form one of the key pillars of modern data mining. Effective detection of anomalies yields critical information that can be used in a variety of applications, such as stopping malicious intruders, detecting and repairing faults in complex systems, and better understanding the behavior of natural, social, and engineered systems.

Anomaly detection refers to the problem of finding anomalies in data. While anomaly is a generally accepted term, other synonyms, such as outliers, discordant observations, exceptions, aberrations, surprises, peculiarities, or contaminants, are often used in different application domains. In particular, the terms anomaly and outlier are often used interchangeably. Anomaly detection finds extensive use in a wide variety of applications, such as fraud detection for credit cards, insurance, or health care; intrusion detection for cyber-security; fault detection in safety-critical systems; and military surveillance for enemy activities.

The importance of anomaly detection stems from the fact that, in a variety of application domains, anomalies in data often translate to significant (and often critical) actionable insights. For example, an anomalous traffic pattern in a computer network could mean that a hacked computer is sending out sensitive data to an unauthorized destination. An anomalous remotely sensed weather variable such as temperature could imply a heat wave or cold snap, or even faulty remote sensing equipment. An anomalous MRI image may indicate early signs of Alzheimer’s or the presence of malignant tumors. Anomalies in credit card transaction data could indicate credit card or identity theft, and anomalous readings from a spacecraft sensor could signify a fault in some component of the spacecraft.
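
To make the temperature example concrete, here is a minimal sketch of one simple way to flag a point anomaly in a series of sensor readings. It uses a modified z-score based on the median and the median absolute deviation (MAD). This is only one of many possible techniques and is not drawn from the references below. The function name `mad_outliers`, the sample readings, and the threshold of 3.5 (a commonly used convention) are assumptions chosen for illustration.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration. The modified z-score is
    0.6745 * |x - median| / MAD, where MAD is the median absolute deviation.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0:
        # Assumption: with no measurable spread, flag nothing rather than
        # divide by zero.
        return []
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up temperature readings (degrees Celsius) with one suspicious spike.
readings = [21.0, 21.5, 20.8, 21.2, 35.0, 21.1, 20.9]
flagged = mad_outliers(readings)
print(flagged)  # index of the 35.0 reading
```

Whether the flagged reading reflects a real heat wave or a faulty sensor cannot be decided by the score alone. That interpretation depends on domain knowledge, which is why the same statistical anomaly can carry very different meanings across applications.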

Important links:

1. Anomaly Detection: A Survey, Varun Chandola, Arindam Banerjee and Vipin Kumar, ACM Computing Surveys (http://dl.acm.org/citation.cfm?id=1541882)

2. Outlier Analysis, Charu Aggarwal, Springer (http://www.amazon.com/Outlier-Analysis-Charu-C-Aggarwal/dp/1461463955)

3. Anomaly Detection: A Tutorial, Sanjay Chawla and Varun Chandola, ICDM 2011


4. Data Mining for Anomaly Detection, Tutorial at ECML PKDD 2008


Related KDD2016 Papers

Title & Authors
Semi-Markov Switching Vector Autoregressive Model-based Anomaly Detection in Aviation Systems
Author(s): Igor Melnyk*, University of Minnesota; Arindam Banerjee, University of Minnesota; Bryan Matthews, NASA Ames Research Center; Nikunj Oza, NASA Ames Research Center
Catch Me If You Can: Detecting Pickpocket Suspects from Large-Scale Transit Records
Author(s): Bowen Du*, Beihang University; Chuanren Liu, Drexel University; Wenjun Zhou, University of Tennessee; Hui Xiong, Rutgers
Assessing Human Error Against a Benchmark of Perfection
Author(s): Ashton Anderson*, Stanford University; Jon Kleinberg, Cornell University; Sendhil Mullainathan, Harvard
Modeling Precursors for Event Forecasting via Nested Multi-Instance Learning
Author(s): Yue Ning*, Virginia Tech; Sathappan Muthiah, Virginia Tech; Huzefa Rangwala, George Mason University; Naren Ramakrishnan, Virginia Tech