Curated by: Aidong Zhang

Classification, the task of assigning labels to objects, is one of the cornerstone tasks in data mining. Many day-to-day activities, some so involuntary that we do not even realize we are doing them, are classification tasks: “identifying your car in the parking lot” or “recognizing a family member in a crowd”. These seemingly simple tasks for humans, however, are extremely difficult for computers and form the core of AI problems.

The application domains of classification have expanded from the early days of handwritten digit recognition and face recognition in the 1990s to identifying and classifying data in high-throughput environments such as bioinformatics and social media.

With “big” data and the growth of deep learning algorithms, a new paradigm of techniques and use cases has arisen that significantly reduces human intervention in training the system or algorithm. Some very interesting demos and applications are listed in [1], including an online handwritten character recognition example available at http://deep.host22.com/.

A general introduction and survey of classification algorithms can be found in [2].

[1] http://deeplearning.net/demos/

[2] http://wen.ijs.si/ojs-2.4.3/index.php/informatica/article/download/148/140

Related KDD2016 Papers

Title & Authors
The Profile of an Online Purchaser: A Case Study of Pinterest
Author(s): Caroline Lo*, Stanford University; Dan Frankowski; Jure Leskovec, Stanford University
Dynamic and Robust Wildfire Risk Prediction System: An Unsupervised Approach
Author(s): Mahsa Salehi*, IBM Australia; Laura Rusu, IBM Research; Timothy Lynar, IBM Research; Anna Phan, IBM Research
Designing Policy Recommendations to Reduce Home Abandonment in Mexico
Author(s): Klaus Ackermann, Monash University; Eduardo Blancas Reyes*, The University of Chicago; Sue He, University of Virginia; Thomas Anderson Keller, UC San Diego; Paul van der Boor, Data Science for Social Good; Romana Khan, Data Science for Social Good; Rayid Ghani, University of Chicago
Gemello: Creating a Detailed Energy Breakdown from just the Monthly Electricity Bill
Author(s): Nipun Batra*, IIIT Delhi; Amarjeet Singh; Kamin Whitehouse
Days on Market: Measuring Liquidity in Real Estate Markets
Author(s): Hengshu Zhu*, Baidu Inc.; Hui Xiong, Rutgers; Fangshuang Tang, University of Science and Technology of China; Yong Ge; Qi Liu, University of Science and Technology of China; Enhong Chen; Yanjie Fu, Rutgers University
A Non-parametric Approach to Detect Epileptogenic Lesions using Restricted Boltzmann Machines
Author(s): Yijun Zhao*, Tufts University; Bilal Ahmed, Tufts; Carla Brodley, Northeastern University; Jennifer Dy, NEU
Domain adaptation in the absence of source domain data
Author(s): Boris Chidlovskii*, XRCE; Stephane Clinchant, Xerox Research Centre Europe; Gabriela Csurka, Xerox Research Centre Europe
EMBERS AutoGSR: Automated Coding of Civil Unrest Events
Author(s): Parang Saraf*, Virginia Tech; Naren Ramakrishnan, Virginia Tech
An Engagement-Based Customer Lifetime Value System for E-commerce
Author(s): Ali Vanderveld*, Groupon; Angela Han, Groupon; Addhyan Pandey, Groupon; Rajesh Parekh
Predictors without Borders: Behavioral Modeling of Product Adoption in Three Developing Countries
Author(s): Muhammad Khan, University of Washington; Joshua Blumenstock*, University of Washington
Bid-aware Gradient Descent for Unbiased Learning with Censored Data in Display Advertising
Author(s): Weinan Zhang*, University College London; Tianxiong Zhou, TukMob; Jun Wang, University College London; Jian Xu, TouchPal Inc
Repeat Buyer Prediction for E-Commerce
Author(s): Guimei Liu*; Tam T. Nguyen, Institute for Infocomm Research; Gang Zhao, Development Bank of Singapore; Wei Zha, Institute for Infocomm Research; Jianbo Yang, General Electric; Jianneng Cao, Institute for Infocomm Research; Min Wu, Institute for Infocomm Research; Peilin Zhao, Institute for Infocomm Research, A*STAR; Wei Chen, Development Bank of Singapore
Developing a Data-Driven Player Ranking in Soccer using Predictive Model Weights
Author(s): Joel Brooks*, Massachusetts Institute of Tec; Matthew Kerr, Massachusetts Institute of Technology; John Guttag, MIT
Identifying Earmarks in Congressional Bills
Author(s): Vrushank Vora*, Data Science for Social Good; Joe Walsh, Data Science for Social Good; Madian Khabasa, Microsoft; Ellery Wulczyn, Wikimedia Foundation; Matthew Heston, Northwestern University; Rayid Ghani, University of Chicago; Chris Berry, University of Chicago
Text Mining in Clinical Domain: Dealing with Noise.
Author(s): Hoang Nguyen*, National ICT Australia; Jon Patrick, University of Sydney
Ranking Relevance in Yahoo Search
Author(s): Dawei Yin, Yahoo Labs; Yuening Hu; Jiliang Tang, Yahoo Labs; Tim Daly, Yahoo; Mianwei Zhou, Yahoo Inc; Hua Ouyang; Jianhui Chen, Yahoo!; Changsung Kang, Yahoo Labs; Hongbo Deng, Yahoo!; Chikashi Nobata; Jean-Marc Langlois; Yi Chang*, Yahoo! Labs
Question Independent Grading using Machine Learning: The Case of Computer Program Grading
Author(s): Gursimran Singh*, Aspiring Minds; Shashank Srikant; Varun Aggarwal
How to Get Them a Dream Job?
Author(s): Jia Li, University of Illinois at Chicago; Dhruv Arya, LinkedIn; Viet Ha-Thuc*, LinkedIn; Shakti Sinha, LinkedIn
DopeLearning: A Computational Approach to Rap Lyrics Generation
Author(s): Eric Malmi*, Aalto University; Pyry Takala, Aalto University; Hannu Toivonen, University of Helsinki; Tapani Raiko, Aalto University; Aristides Gionis, Aalto University
The Legislative Influence Detector: Finding Text Reuse in State Legislation
Author(s): Matthew Burgess, University of Michigan; Eugenia Giraudy, YouGov; Julian Katz-Samuels, University of Michigan; Joe Walsh*, University of Chicago; Derek Willis, ProPublica; Lauren Haynes, University of Chicago; Rayid Ghani, University of Chicago
Detecting Devastating Diseases in Search Logs
Author(s): John Paparrizos, Columbia University; Ryen White*, Microsoft; Eric Horvitz, Microsoft Research
Audience Expansion for Online Social Network Advertising
Author(s): Haishan Liu*, LinkedIn Corporation; David Pardoe, LinkedIn Corporation; Kun Liu, LinkedIn Corporation
Identifying Police Officers at Risk of Adverse Events
Author(s): Samuel Carton, University of Michigan; Jennifer Helsby*, University of Chicago; Kenneth Joseph, Carnegie Mellon University; Ayesha Mahmud, Princeton University; Youngsoo Park, University of Arizona; Joe Walsh, University of Chicago; Crystal Cody, Charlotte-Mecklenburg Police Department; Estella Patterson, Charlotte-Mecklenburg Police Department; Lauren Haynes, University of Chicago; Rayid Ghani, University of Chicago
Crystal: Employer Name Normalization in the Online Recruitment Industry
Author(s): Qiaoling Liu, CareerBuilder; Faizan Javed*, CareerBuilder; Matt McNair, CareerBuilder
Firebird: Predicting Fire Risk and Prioritizing Fire Inspections in Atlanta
Author(s): Michael Madaio, Carnegie Mellon University; Shang-Tse Chen*, Georgia Institute of Technology; Oliver Haimson, University of California, Irvine; Wenwen Zhang, Georgia Institute of Technology; Xiang Cheng, Emory University; Matthew Hinds-Aldrich, Atlanta Fire Rescue Department; Duen Horng Chau, Georgia Tech; Bistra Dilkina, Georgia Tech
Predicting Disk Replacement towards Reliable Data Centers
Author(s): Mirela Botezatu*, IBM Research; Ioana Giurgiu, IBM Research; Jasmina Bogojeska, IBM Research; Dorothea Wiesmann, IBM Research