
Frequent Pattern Mining

Curated by: Xifeng Yan


Frequent patterns are itemsets, subsequences, or substructures that appear in a data set with a frequency no less than a user-specified threshold. For example, a set of items, such as milk and bread, that appears frequently together in a transaction data set is a frequent itemset. A subsequence, such as buying first a PC, then a digital camera, and then a memory card, is a (frequent) sequential pattern if it occurs frequently in a shopping history database. A substructure can refer to different structural forms, such as subgraphs, subtrees, or sublattices, which may be combined with itemsets or subsequences. If a substructure occurs frequently in a graph database, it is called a (frequent) structural pattern.

Finding frequent patterns plays an essential role in mining associations, correlations, and many other interesting relationships among data. It also helps in data indexing, classification, clustering, and other data mining tasks. Frequent pattern mining is therefore an important data mining task and a focused theme in data mining research. An abundant literature has been dedicated to it, and tremendous progress has been made, ranging from efficient and scalable algorithms for frequent itemset mining in transaction databases to numerous research frontiers, such as sequential pattern mining, structured pattern mining, correlation mining, associative classification, and frequent pattern-based clustering, as well as their broad applications [1]. A few textbooks are available on this topic, e.g., [2].
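To make the definitions of frequent itemsets and sequential patterns concrete, the following is a minimal Python sketch. It is only an illustration: it counts support by brute-force enumeration, not with Apriori, FP-growth, or any other scalable algorithm from the literature. The function names (`frequent_itemsets`, `is_subsequence`, `sequence_support`), the `max_size` cap, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: count} for itemsets contained in at least
    `min_support` transactions (brute force; hypothetical helper)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # ignore duplicate items
        for size in range(1, min(max_size, len(items)) + 1):
            for itemset in combinations(items, size):
                counts[frozenset(itemset)] += 1
    return {s: n for s, n in counts.items() if n >= min_support}


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same
    order, not necessarily next to each other."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, which enforces the order.
    return all(item in remaining for item in pattern)


def sequence_support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences)


# Toy transaction data set (assumed for illustration).
transactions = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "bread", "butter"],
]
frequent = frequent_itemsets(transactions, min_support=3)

# Toy shopping histories (assumed for illustration).
histories = [
    ["PC", "digital camera", "memory card"],
    ["PC", "printer", "digital camera", "memory card"],
    ["digital camera", "PC"],
]
pattern = ["PC", "digital camera", "memory card"]
support = sequence_support(pattern, histories)
```

With a threshold of 3, the itemset {milk, bread} is frequent in this toy data because it appears in three of the four transactions, while {bread, butter} is not. The sequential pattern is supported by two of the three histories, since the third contains the items in the wrong order. Enumerating every subset grows exponentially with transaction length, which is why practical miners prune the search space instead.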

[1] Frequent Pattern Mining: Current Status and Future Directions, by J. Han, H. Cheng, D. Xin, and X. Yan, Data Mining and Knowledge Discovery, Vol. 15, Issue 1, pp. 55–86, 2007.

[2] Frequent Pattern Mining, Eds. Charu Aggarwal and Jiawei Han, Springer, 2014.


Related KDD2016 Papers

Title & Authors
Online Feature Selection: A Limited-Memory Substitution Algorithm and its Asynchronous Parallel Vari
Author(s): Haichuan Yang*, University of Rochester; Ryohei Fujimaki, NEC Laboratories America; Yukitaka Kusumura, NEC lab; Ji Liu, University of Rochester
DeepIntent: Learning Attentions for Online Advertising with Recurrent Neural Networks
Author(s): Shuangfei Zhai*, Binghamton University; Keng-hao Chang, Microsoft; Ruofei Zhang, Microsoft; Zhongfei Zhang,
Annealed Sparsity via Adaptive and Dynamic Shrinking
Author(s): Kai Zhang*, NEC labs America; Shandian Shan, Purdue University; Zhengzhang Chen, NEC Lab America; Chaoran Cheng, New Jersey Institute of Technology; Zhi Wei, New Jersey Institute of Technology; Guofei Jiang, NEC labs America; Jieping Ye,
Multi-Task Feature Interaction Learning
Author(s): Kaixiang Lin*, Michigan State University; Jianpeng Xu, Michigan State University; Shuiwang Ji, Washington State University; Jiayu Zhou, Michigan State University
Analyzing Volleyball Match Data from the 2014 World Championships Using Machine Learning Techniques
Author(s): Jan Van Haaren*, KU Leuven; Horesh Ben Shitrit, PlayfulVision; Jesse Davis, KU Leuven; Pascal Fua, EPFL
Lexis: An Optimization Framework for Discovering the Hierarchical Structure of Sequential Data
Author(s): Payam Siyari*, Georgia Institute of Technology; Bistra Dilkina, Georgia Tech; Constantine Dovrolis, Georgia Institute of Technology
Just One More: Modeling Binge Watching Behavior
Author(s): William Trouleau, EPFL; Azin Ashkan*, Technicolor; Weicong Ding, Technicolor Research; Brian Eriksson, Technicolor
Inferring Network Effects from Observational Data
Author(s): David Arbour*, University of Massachusetts Am; Dan Garant, University of Massachusetts Amherst; David Jensen, UMass Amherst
Generalized Hierarchical Sparse Model for Arbitrary-Order Interactive Antigenic Sites Identification
Author(s): Lei Han*, Rutgers University; Yu Zhang, Hong Kong University of Science and Technology; Xiu-Feng Wan, Mississippi State University; Tong Zhang, Rutgers University
Predict Risk of Relapse for Patients with Multiple Stages of Treatment of Depression
Author(s): Zhi Nie*, Arizona State University; Pinghua Gong; Jieping Ye, University of Michigan at Ann Arbor
A Closed-Loop Approach in Data-Driven Resource Allocation to Improve Network User Experience
Author(s): Yanan Bao*, University of California, Davi; Huasen Wu, UC Davis; Xin Liu, UC Davis
Towards Robust and Versatile Causal Discovery for Business Applications
Author(s): Giorgos Borboudakis*, University of Crete; Ioannis Tsamardinos,
Interpretable Decision Sets: A Joint Framework for Description and Prediction
Author(s): Himabindu Lakkaraju*, Stanford University; Stephen Bach, Stanford University; Jure Leskovec, Stanford University
Causal Clustering for 1-Factor Measurement Models
Author(s): Erich Kummerfeld*, University of Pittsburgh; Joseph Ramsey, Carnegie Mellon University
Efficient Frequent Directions Algorithm for Sparse Matrices
Author(s): Mina Ghashami*, University of Utah; Edo Liberty, Yahoo; Jeff Phillips, School of Computing, University of Utah
Subjectively Interesting Component Analysis: Data Projections that Contrast with Prior Expectations
Author(s): Bo Kang*, Ghent University; Jefrey Lijffijt, Ghent University; Raul Santos-Rodriguez, University of Bristol; Tijl De Bie, Ghent University
Robust and Effective Metric Learning Using Capped Trace Norm
Author(s): Zhouyuan Huo, University of Texas, Arlington; Feiping Nie, University of Texas at Arlington; Heng Huang*, Univ. of Texas at Arlington
