KDD-2011 Program
Session Times (SAT, SUN, MON, TUE, WED)
A1, A2, A3, A4 | 10:30 AM - 12:15 PM
B1, B2 | 3:00 PM - 4:15 PM
C1, C2 | 4:30 PM - 5:45 PM
Saturday 2-day Workshop | 9:00 AM - 5:00 PM
Sunday Full-day and 2-day Workshops | 9:00 AM - 5:00 PM
Sunday Half-day Workshops | 8:00 AM - 12:00 PM and 1:00 PM - 5:00 PM
Sunday Invited Tutorials | 8:00 AM - 12:00 PM and 1:00 PM - 5:00 PM
Sunday Plenary Opening Sessions (Awards Presentation and Innovation Talk) | 6:00 PM - 8:00 PM
Keynote Session Times (MON, TUE, WED)
K1, K3, K5 | 9:00 AM - 10:15 AM
K2, K4, K6 | 1:30 PM - 2:45 PM
Poster Session/Reception Times (MON, TUE)
6:15 - 9:00 PM
Breaks (MON, TUE, WED)
Coffee | 10:15 - 10:30 AM, 2:45 - 3:00 PM, 4:15 - 4:30 PM
Lunch | 12:15 - 1:30 PM

KEYNOTE TALK
MON K1 | Convex Optimization: from Embedded Real-Time to Large-Scale Distributed | Stephen Boyd, Stanford
CLASSIFICATION
MON A1 | CHIRP: A new classifier based on Composite Hypercubes on Iterated Random Projections | Leland Wilkinson, Systat; Anushka Anand*, UIC; Tuan Dang, UIC
MON A1 | Supervised Learning for Provenance-Similarity of Binaries | Sagar Chaki*, Carnegie Mellon University; Cory Cohen; Arie Gurfinkel
MON A1 | Trading Representability for Scalability: Adaptive Multi-Hyperplane Machine for Nonlinear Classification | Zhuang Wang*, Siemens; Nemanja Djuric; Koby Crammer; Slobodan Vucetic
MON A1 | An Improved GLMNET for L1-regularized Logistic Regression | Guo-Xun Yuan, National Taiwan University; Chia-Hua Ho, National Taiwan University; Chih-Jen Lin*, National Taiwan University
WEB USER MODELING
MON A2 | Scalable Inference of Dynamic User Interests for Behavioural Targeting | Amr Ahmed*, Carnegie Mellon University; Yucheng Low, Carnegie Mellon University; Mohamed Aly, Yahoo Research; Vanja Josifovski, Yahoo! Research; Alex Smola, Yahoo! Research
MON A2 | Multiple Domain User Personalization | Yucheng Low*, Carnegie Mellon University; Alex Smola, Yahoo and ANU; Deepak Agarwal
MON A2 | Click Shaping to Optimize Multiple Objectives | Xuanhui Wang, Yahoo! Labs; Deepak Agarwal*; Bee-Chung Chen, Yahoo! Research; Pradheep Elango
MON A2 | Response prediction using collaborative filtering with hierarchies and side-information | Aditya Menon*, UC San Diego; Krishna-Prasad Chitrapura, Yahoo! Labs Bangalore; Sachin Garg, Yahoo! Labs Bangalore; Deepak Agarwal, Yahoo! Research; Nagaraj Kota, Yahoo! Labs Bangalore
KEYNOTE TALK
MON K2 | Internet Scale Data Analysis | Peter Norvig, Google
TEXT MINING
MON B1 | Beyond Keyword Search: Discovering Relevant Scientific Literature | Khalid El-arini*, Carnegie Mellon University; Carlos Guestrin, CMU
MON B1 | Collaborative Topic Models for Recommending Scientific Articles | Chong Wang*, Princeton University; David Blei, Princeton Univ
MON B1 | Partially Labeled Topic Models for Interpretable Text Mining | Daniel Ramage*, Stanford University; Christopher Manning, Stanford University; Susan Dumais, Microsoft Research
SOCIAL NETWORKS
MON B2 | On the Semantic Annotation of Places in Location-based Social Networks | Mao Ye*, PSU; Dong Shou; Wang-Chien Lee; Peifeng Yin; Krzysztof Janowicz
MON B2 | Sparsification of Influence Networks | Michael Mathioudakis*, University of Toronto; Francesco Bonchi, Yahoo! Research; Carlos Castillo, Yahoo!; Aristides Gionis, Yahoo! Research Barcelona; Antti Ukkonen
MON B2 | Leveraging Collaborative Tagging for Web Item Design | Mahashweta Das*, UTA; Gautam Das, UT Arlington; Vagelis Hristidis, Florida International University
TEXT MINING
MON C1 | Latent Topic Feedback for Information Retrieval | David Andrzejewski*, Lawrence Livermore National La; David Buttler, Lawrence Livermore National Laboratory
MON C1 | Locality-Sensitive Factor Models for Multi-Context Recommendation | Deepak Agarwal*; Bee-Chung Chen, Yahoo! Research; Bo Long
MON C1 | Latent Aspect Rating Analysis without Aspect Keyword Supervision | Hongning Wang*, UIUC; Yue Lu, University of Illinois; ChengXiang Zhai, UIUC
SCALABILITY
MON C2 | Fast Clustering using MapReduce | Alina Ene, University of Illinois at Urbana-Champaign; Sungjin Im*, University of Illinois; Benjamin Moseley, University of Illinois at Urbana-Champaign
MON C2 | Clustering Very Large Multi-dimensional Datasets with MapReduce | Robson Leonardo Ferreira Cordeiro*, ICMC-USP-Brazil; Caetano Traina Jr., ICMC-USP; Agma Juci Machado Traina, ICMC-USP; Julio López, SCS-CMU; U Kang, Carnegie Mellon University; Christos Faloutsos, CMU
MON C2 | Selective Block Minimization for Faster Convergence of Limited Memory Large-scale Linear Models | Kai-Wei Chang*, UIUC; Dan Roth, University of Illinois at Urbana-Champaign
KEYNOTE TALK
TUE K3 | Cancer Genomics | David Haussler, UC Santa Cruz
MATRIX FACTORIZATION
TUE A1 | Integrating Low-Rank and Group-Sparse Structures for Robust Multi-Task Learning | Jianhui Chen*, Arizona State University; Jiayu Zhou, Arizona State University; Jieping Ye, Arizona State University
TUE A1 | Model Order Selection for Boolean Matrix Factorization | Pauli Miettinen*, MPI Informatics; Jilles Vreeken, University of Antwerp, Belgium
TUE A1 | Rank Aggregation via Nuclear Norm Minimization | David Gleich*, Sandia National Laboratories; Lek-Heng Lim, University of Chicago
TUE A1 | Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent | Rainer Gemulla*, Max-Planck Institut; Peter Haas, IBM Almaden; Erik Nijkamp, IBM Almaden; Yannis Sismanis, IBM Almaden
USER MODELING
TUE A2 | From Bias to Opinion: A Transfer-Learning Approach to Sentiment Analysis | Pedro Henrique Guerra*, UFMG; Adriano Veloso, UFMG; Wagner Meira Junior, UFMG; Virgilio Almeida, UFMG
TUE A2 | User Reputation in a Comment Rating Environment | Bee-Chung Chen*, Yahoo! Research; Belle Tseng, Yahoo! Labs; Jie Yang, Yahoo! Labs; Jian Guo, University of Michigan
TUE A2 | Selecting a Comprehensive Set of Reviews | Panayiotis Tsaparas*, Microsoft Research; Alexandros Ntoulas, Microsoft Research; Evimaria Terzi, Boston University
KEYNOTE TALK
TUE K4 | The Mathematics of Causal Inference | Judea Pearl, UCLA
TEXT MINING
TUE B1 | Refining causality: who copied from whom? | Tristan Snowsill*, University of Bristol; Nick Fyson, University of Bristol; Tijl De Bie, University of Bristol; Nello Cristianini, University of Bristol
TUE B1 | Conditional Topical Coding: an Efficient Topic Model Conditioned on Rich Features | Jun Zhu*, Carnegie Mellon University; Ni Lao, Carnegie Mellon University; Ning Chen, Tsinghua University; Eric Xing, CMU
TUE B1 | Tracking Trends: Incorporating Term Volume into Temporal Topic Models | Liangjie Hong*, Lehigh University; Dawei Yin, Lehigh University; Jian Guo, University of Michigan; Brian Davison, Lehigh University
THEORY
TUE B2 | Stackelberg Games for Adversarial Prediction Problems | Michael Brückner*, University of Potsdam; Tobias Scheffer, University of Potsdam
TUE B2 | Leakage in Data Mining: Formulation, Detection, and Avoidance | Claudia Perlich*, Media6Degrees; Shachar Kaufman, Tel-Aviv University; Saharon Rosset, Tel Aviv University
TUE B2 | An information theoretic framework for data mining | Tijl De Bie*, University of Bristol
UNSUPERVISED LEARNING
TUE C1 | Density Estimation Trees | Parikshit Ram*, Georgia Institute of Technology; Alexander Gray, Georgia Tech
TUE C1 | Unsupervised Clustering of Multidimensional Distributions using Earth Mover Distance | David Applegate, AT&T Labs - Research; Tamraparni Dasu*, AT&T Labs; Shankar Krishnan, AT&T Labs - Research; Simon Urbanek, AT&T Labs - Research
TUE C1 | Online heterogeneous mixture modeling with marginal and copula selection | Ryohei Fujimaki*, NEC Laboratories America; Yasuhiro Sogawa; Satosi Morinaga
PREDICTIVE MODELING
TUE C2 | Bounded Coordinate-Descent for Biological Sequence Classification in High Dimensional Predictor Space | Georgiana Ifrim*, Bioinformatics Research Centre; Carsten Wiuf, Bioinformatics Research Centre
TUE C2 | Multi-Source Domain Adaptation and Its Application to Early Detection of Fatigue | Rita Chattopadhyay, Arizona State University; Jieping Ye*, Arizona State University; Sethuraman Panchanathan, Arizona State University; Wei Fan, Columbia; Ian Davidson, UC Davis
TUE C2 | Two-locus association mapping in subquadratic runtime | Panagiotis Achlioptas; Bernhard Schölkopf, Max Planck Institute; Karsten Borgwardt*, Max Planck Institutes Tübingen
GRAPH ANALYSIS
WED A1 | Diversity in ranking via resistive graph centers | Kumar Dubey*, IBM Research; Soumen Chakrabarti, Indian Institute of Technology, Bombay; Chiru Bhattacharya, IISc
WED A1 | Collective Graph Identification | Galileo Namata*, University of Maryland; Stanley Kok, University of Maryland; Lise Getoor, University of Maryland, College Park
WED A1 | Semi-Supervised Ranking on Very Large Graph with Rich Metadata | Bin Gao*, Microsoft Research Asia; Tie-Yan Liu, Microsoft Research Asia; Wei Wei; Taifeng Wang, Microsoft Research; Hang Li, Microsoft
WED A1 | Benefits of Bias: Towards Better Characterization of Network Sampling | Arun Maiya*, UIC; Tanya Berger-Wolf, University of Illinois at Chicago
ONLINE DATA AND STREAMS
WED A2 | Enabling Fast Prediction for Ensemble Models on Data Streams | Byron Gao*, Texas State University; Peng Zhang, Chinese Academy of Sciences; Xingquan Zhu, University of Technology, Sydney
WED A2 | Online Active Inference and Learning | Joshua Attenberg*, NYU Polytechnic Institute; Foster Provost, NYU
WED A2 | Unbiased Online Active Learning in Data Streams | Wei Chu*, Yahoo! Labs; Martin Zinkevich, Yahoo Research; Lihong Li, Yahoo! Research; Achint Thomas, Yahoo! Labs; Belle Tseng, Yahoo! Labs
WED A2 | Learning to Trade Off Between Exploration and Exploitation in Multiclass Bandit Prediction | Hamed Valizadegan*, University of Pittsburgh; Rong Jin, Michigan State University; Shijun Wang, National Institute of Health
PANEL
WED K6 | Lessons learned from contests in data mining | Moderator: Charles Elkan, UCSD; Speakers: Jeremy Howard (Kaggle), Yehuda Koren (Yahoo!), Tie-Yan Liu (Microsoft Research), Claudia Perlich (Media6Degrees)
PRIVACY
WED B1 | Differentially Private Data Release for Data Mining | Noman Mohammed*, Concordia University; Rui Chen, Concordia University; Benjamin Fung, Concordia University; Mourad Debbabi, Concordia University; Philip Yu, University of Illinois at Chicago
WED B1 | k-NN as an Implementation of Situation Testing for Discrimination Discovery and Prevention | Binh Thanh Luong, Institute for Advanced Studies; Salvatore Ruggieri*, University of Pisa; Franco Turini, University of Pisa
WED B1 | Exploiting Vulnerability to Secure User Privacy on Social Networking Site | Pritam Gundecha*, Arizona State University; Geoffrey Barbier, ASU; Huan Liu
FREQUENT SETS
WED B2 | Tell me what I need to know: succinctly summarizing data with itemsets | Michael Mampaey*, Universiteit Antwerpen; Jilles Vreeken, University of Antwerp, Belgium; Nikolaj Tatti, University of Antwerp
WED B2 | Direct Local Pattern Sampling by Efficient Two-Step Random Procedures | Mario Boley*, Fraunhofer IAIS; Claudio Lucchese, ISTI - CNR, Italy; Daniel Paurat, University Bonn; Thomas Gärtner, University Bonn
WED B2 | Mining Frequent Closed Graphs on Evolving Data Streams | Albert Bifet*, University of Waikato; Geoff Holmes, University of Waikato; Bernhard Pfahringer, University of Waikato; Ricard Gavaldà, UPC-Barcelona Tech
GRAPH MINING
WED C1 | Dual Active Feature and Sample Selection for Graph Classification | Xiangnan Kong, Univ of Illinois at Chicago; Wei Fan, Columbia; Philip Yu*, University of Illinois at Chicago
WED C1 | It's Who You Know: Graph Mining Using Recursive Structural Features | Keith Henderson, Lawrence Livermore National Laboratory; Brian Gallagher, Lawrence Livermore National Laboratory; Lei Li, Carnegie Mellon University; Leman Akoglu, Carnegie Mellon University; Tina Eliassi-Rad*, LLNL; Hanghang Tong, IBM Research; Christos Faloutsos, CMU
WED C1 | Triangle Listing in Massive Networks and Its Applications | Shumo Chu*, NTU, Singapore; James Cheng, NTU, Singapore
Industry/Government Track
Session | Paper ID | Evaluated as | Subject Primary Subject Area | Paper Title | Authors
Mon A3 | 828 | Deployed | Text Mining | Linear Scale Semantic Mining Algorithms in Microsoft SQL Server’s Semantic Platform | Kunal Mukerjee; Todd Porter; Sorin Gherman
Mon A3 | 904 | Deployed | Security | Combining File Content and File Relation for Cloud Based Malware Detection | Yanfang Ye; Tao Li; Shenghuo Zhu; Weiwei Zhang; Melih Abdulhayoglu
Mon A3 | 971 | Deployed | Text Mining | High-Precision Phrase-Based Document Classification on a Modern Scale | Ron Bekkerman; Matan Gavish
Mon A3 | 706 | Deployed | Temporal Mining | Activity Analysis Based on Low Sample Rate Smart Meters | Feng Chen; Jing Dai; Bingsheng Wang; Sambit Sahu; Milind Naphade; Chang-Tien Lu
Tue A3 | 55 | Deployed | Internet | Estimating the Number of Users behind IP Addresses for Combating Abusive Traffic | Ahmed Metwally; Matt Paduano
Tue A3 | 812 | Deployed | Internet | Data-driven Multi-touch Attribution Models | Xuhui Shao; Lexin Li
Tue A3 | 788 | Deployed | Internet | Bid Landscape Forecasting in Online Ad Exchange Marketplace | Ying Cui; Ruofei Zhang; Wei Li; Jianchang Mao
Tue A3 | 978 | Deployed | Internet | Detecting Adversarial Advertisements in the Wild | D. Sculley; Matthew Otey; Michael Pohl; Bridget Spitznagel; John Hainsworth; Yunkai Zhou
Wed A3 | 722 | Deployed | None | Disaster Management on Mobile Devices | Li Zheng; Chao Shen; Liang Tang; Tao Li; Steve Luis; Shu-Ching Chen
Wed A3 | 94 | Discovery | Finance | Enhancing Investment Decision in P2P Lending: An Investor Composition Perspective | Chunyu Luo; Hui Xiong; Wenjun Zhou; Yanhong Guo; Guishi Deng
Wed A3 | 127 | Discovery | Science | From Market Baskets to Mole Rats: Using Data Mining Techniques To Analyze RFID Data Describing Mole Rat Behavior | Susan Imberman; Michael Kress; Dan McCloskey; Igor Kushnir; Susan Briffa-Mirabella
Wed A3 | 1101 | Discovery | Retail | A Pattern Discovery Approach to Retail Fraud Detection | Prasad Gabbur; Sharat Pankanti; Quanfu Fan; Hoang Trinh
Mon A4 | 93 | Emerging | Geo Mining | Driving with Knowledge from the Physical World | Jing Yuan; Yu Zheng; Xing Xie; Guang-Zhong Sun
Mon A4 | 813 | Emerging | None | Interactive Learning for Efficiently Detecting Errors In Insurance Claims | Rayid Ghani; Mohit Kumar
Mon A4 | 380 | Emerging | Large Scale Mining | NIMBLE: An Infrastructure for the Rapid Implementation of Parallel Data Mining and Machine Learning Algorithms on MapReduce | Amol Ghoting; Prabhanjan Kambadur; Edwin Pednault; Ramakrishnan Kannan
Mon A4 | 886 | Emerging | Internet | Classification of Proxy Labeled Examples for Marketing Segment Generation | Dean Cerrato; Rosie Jones; Avi Gupta
Mon A4 | 580 | Emerging | Internet | Ameliorating Buyer's Remorse | Samuel Ieong; Rakesh Agrawal; Raja Velu
Tue A4 | 147 | Emerging | Medical | Experiences with Mining Temporal Event Sequences from Electronic Medical Records: Initial Successes and Some Challenges | Debprakash Patnaik; Patrick Butler; Naren Ramakrishnan; Laxmi Parida; Benjamin Keller; David Hanauer
Tue A4 | 661 | Emerging | Medical | Understanding Atrophy Trajectories in Alzheimer's Disease using Association Rules on MRI images | Gyorgy Simon; Peter Li; Clifford Jack; Prashanthi Vemuri
Tue A4 | 670 | Emerging | None | A Case Study in a Recommender System Based on Purchase Data | Bruno Pradel; Nicolas Usunier; Françoise Soulie Fogelman; Savaneary Sean; Julien Delporte; Celine Rouveirol; Sebastien Guérif; Frédéric Dufau-Joel
Tue A4 | 727 | Emerging | Security | Detecting bots via incremental SVM learning with Dynamic Feature Adaptation | Feilong Chen; Supranamaya Ranjan; Pang-Ning Tan
Tue A4 | 381 | Emerging | Medical | Towards Personalized Care Management of High-Risk Patients – the Diabetes Case Study | Hani Neuvirth; Michal Ozery-Flato; Jianying Hu; Jonathan Laserson; Martin Kohn; Shahram Ebadollahi; Michal Rosen-Zvi
Wed A4 | 949 | Emerging | Internet | Matching Unstructured Product Offers to Structured Product Specifications | Anitha Kannan; Inmar Givoni; Rakesh Agrawal; Ariel Fuxman
Wed A4 | 1024 | Emerging | Internet | Predictive Client-side Profiles for Personalized Advertising | Mikhail Bilenko; Matthew Richardson
Wed A4 | 1128 | Emerging | Social Mining | Smoothing Techniques for Adaptive Online Language Models: Topic Tracking in Tweet Streams | Jimmy Lin; Rion Snow; William Morgan
Wed A4 | 1164 | Emerging | Social Mining | Democrats, Republicans and Starbucks aficionados: User classification in Twitter | Marco Pennacchiotti; Ana-Maria Popescu
Industry Practice Expo
MON B1 | Introduction | Usama Fayyad (ChoozOn)
MON B1 | The Power of Analysis & Data | David Norton (Caesars Entertainment)
MON C1 | Operational Security Analytics - Doing More with Less | Colleen McCue (GeoEye Analytics)
MON C1 | Applications of Data Mining & Machine Learning in Customer Care | Ravi Vijayaraghavan & P V Kanan (24/7 Customer)
TUE B1 | Knowledge Discovery & Data Mining in Pharmaceutical Cancer Research | Paul Rejto (Pfizer)
TUE B1 | Real-Time Risk Control for Card Not Present | Tai Hsu (Alibaba Group)
TUE C1 | Accelerating Large Scale Data Mining Using In-Database Analytics | Mario Inchiosa & Michele Chambers (IBM)
TUE C1 | Broad Scale Predictive Modeling and Marketing Optimization in Retail Sales | Dan Steinberg & Felipe Fernandez Martinez (Salford Systems and Interefe)
TUE C1 | Thriving as a Data Miner in the Real-World | John Elder (Elder Research)
WED B1 | The Practitioner's Viewpoint to Data Mining - Key Lessons Learned in the Trenches and Case Studies | Richard Boire (Boire-Fuller Group)
WED B1 | Which Half is Wasted? Controlled Experiments to Measure Online Advertising Effectiveness | David Reiley (Yahoo! Research)
WED C1 | Analytics for Political Campaigns | Rayid Ghani (Chief Scientist, Obama for America)
WED C1 | Feedback and Closing (joint session with the Industry and Government Track) | Audience Participation