
Papers
The opportunity to submit papers to the full conference has expired, but many workshops (to be held on Sunday, 6/28, the first day of KDD-2009) will have their own calls for papers.

Important Dates
Please note the earlier submission date and the earlier conference date.

Accepted Research Papers
A Generalized Co-HITS Algorithm and Its Application to Bipartite Graphs
A LRT Framework for Fast Spatial Anomaly Detection
A Multi-Relational Approach to Spatial Classification
A Principled and Flexible Framework for Finding Alternative Clusterings
A Viewpoint-based Approach for Interaction Graph Analysis
Adapting the Right Measures for K-means Clustering
An Association Analysis Approach to Biclustering
Analyzing Patterns of User Content Generation in Online Social Networks
Anomalous Window Discovery through Scan Statistics for Linear Intersecting Paths (SSLIP)
Audience Selection for On-line Brand Advertising: Privacy-friendly Social Network Targeting
Augmenting the Generalized Hough Transform to Enable the Mining of Petroglyphs
BBM: Bayesian Browsing Model from Petabyte-scale Data
Cross Domain Distribution Adaptation via Kernel Mapping
Cartesian Contour: A Concise Representation for a Collection of Frequent Sets
Category Detection Using Hierarchical Mean Shift
Causality Quantification and Its Applications: Structuring and Modeling of Multivariate Time Series
Characteristic Relational Patterns
Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach
Co-Clustering on Manifolds
CoCo: Coding Cost for Parameter-free Outlier Detection
  Christian Bohm, University of Munich; Katrin Haegler, University of Munich; Nikola Muller, Max Planck Institute of Biochemistry, Martinsried, Germany; Claudia Plant*, Technische Universitat Munchen
Co-evolution of Social and Affiliation Networks
Collaborative Filtering with Temporal Dynamics
Collective Annotation of Wikipedia Entities in Web Text
Collusion-Resistant Anonymous Data Collection Method
Combining Link and Content for Community Detection: A Discriminative Approach
Connections between the Lines: Augmenting Social Networks with Text
Consensus Group Based Stable Feature Selection
Constant-Factor Approximation Algorithms for Identifying Dynamic Communities
Constrained Optimization for Validation-Guided Conditional Random Field Learning
Correlated Itemset Mining in ROC Space: A Constraint Programming Approach
CP-Summary: A Concise Representation for Browsing Frequent Itemsets
Detection of Unique Temporal Segments by Information Theoretic Meta-clustering
Differentially-Private Recommender Systems
DOULION: Counting Triangles in Massive Graphs with a Coin
Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-term Interactions
DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values
Effective Multi-Label Active Learning for Text Classification
Efficient Anomaly Monitoring Over Moving Object Trajectory Streams
Efficient Influence Maximization in Social Networks
Efficient Methods for Topic Model Inference on Streaming Document Collections
Efficiently Learning the Accuracy of Labeling Sources for Selective Sampling
Exploiting Wikipedia as External Knowledge for Document Clustering
Exploring Social Tagging Graph for Web Object Classification
Extracting Discriminative Concepts for Domain Adaptation in Text Mining
Fast Approximate Spectral Clustering
Feature Shaping for Linear SVM Classifiers
Finding a Team of Experts in Social Networks
Frequent Pattern Mining with Uncertain Data
Genre-based Decomposition of Email Class Noise
Grouped Graphical Granger Modeling Methods for Temporal Causal Modeling
Heterogeneous Source Consensus Learning via Decision Propagation and Negotiation
Improving Clustering Stability with Combinatorial MRFs
Improving Data Mining Utility with Projective Sampling
Information Theoretic Regularization for Semi-Supervised Boosting
Issues in Evaluation of Stream Learning Algorithms
Large Human Communication Networks: Patterns and a Utility-Driven Generator
Large-Scale Behavioral Targeting
Large-Scale Graph Mining Using Backbone Refinement Classes
Large-Scale Sparse Logistic Regression
Learning Optimal Ranking with Tensor Factorization for Tag Recommendation
Learning Patterns in the Dynamics of Biological Networks
Learning with a Nonexhaustive Training Dataset
Learning Indexing and Diagnosing Network Faults
Measuring the Effects of Preprocessing Decisions and Network Forces in Dynamic Network Analysis
Meme-tracking and the Dynamics of the News Cycle
MetaFac: Community Discovery via Relational Hypergraph Factorization
Mind the Gaps: Weighting the Unknown in Large-Scale One-Class Collaborative Filtering
Mining Broad Latent Query Aspects from Search Sessions
Mining Discrete Patterns via Binary Matrix Factorization
Mining for the Most Certain Predictions from Dyadic Data
Mining Rich Session Context to Improve Web Search
Mining Social Networks for Personalized Email Prioritization
Characterizing Individual Communication Patterns
Multi-focal Learning and Its Application to Customer Service Support
New ensemble methods for evolving data streams
On Burstiness-aware Search for Document Sequences
On Compressing Social Networks
On the Tradeoff Between Privacy and Utility in Data Publishing
Parallel Community Detection on Large Networks with Propinquity Dynamics
Primal Sparse Max-Margin Markov Networks
Probabilistic Frequent Itemset Mining in Uncertain Databases
Quantification and Semi-supervised Classification Methods for Handling Changes in Class Distribution
Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema
Regression based Latent Factor Models
Regret-based Online Ranking for a Growing Digital Library
Relational Learning via Latent Social Dimensions
Scalable Graph Clustering Using Flows: Applications to Community Discovery
Scalable Pseudo-Likelihood Estimation in Hybrid Random Fields
Social Influence Analysis in Large-scale Networks
Spatial-temporal causal modeling for climate change attribution
Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature
TANGENT: A Novel, "Surprise-Me", Recommendation Algorithm
Tell Me Something I Don't Know: Randomization Strategies for Iterative Data Mining
Temporal Mining for Interactive Workflow Data Analysis
The Offset Tree for Learning with Partial Labels
Time Series Shapelets: A New Primitive for Data Mining
Toward Autonomic Grids: Analyzing the Job Flow with Affinity Streaming
Towards Efficient Mining of Proportional Fault-Tolerant Frequent Itemsets
TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation
Turning Down the Noise in the Blogosphere
User Grouping Behavior in Online Forums
Using Graph-based Metrics with Empirical Risk Minimization to Speed Up Active Learning on Networked Data
WhereNext: a Location Predictor on Trajectory Pattern Mining

Accepted Industrial Papers
A Case Study of Behavior-driven Conjoint Analysis on Yahoo! Front Page Today Module
  Wei Chu*, Yahoo! Labs; Seung-Taek Park, Yahoo! Inc.; Todd Beaupre, Yahoo! Inc.; Nitin Motgi, Yahoo! Inc.; Amit Phadke, Yahoo! Inc.; Seinjuti Chakraborty, Yahoo! Inc.; Joe Zachariah, Yahoo! Inc.
Address Standardization with Latent Semantic Association
  Honglei Guo*, IBM China Research Lab; Huijia Zhu, IBM China Research Lab; Zhili Guo, IBM China Research Lab; Xiaoxun Zhang, IBM China Research Lab; Zhong Su, IBM China Research Lab
Anonymizing Healthcare Data: A Case Study on the Blood Transfusion Service
  Noman Mohammed, Concordia University; Benjamin C. M. Fung*, Concordia University; Patrick C. K. Hung, University of Ontario Institute of Technology; Cheuk-kwong Lee, Hong Kong Red Cross Blood Transfusion Service
Applying Syntactic Similarity Algorithms for Enterprise Information Management
  Lucy Cherkasova*, HPLabs; Kave Eshghi, HPLabs; Brad Morrey, HPLabs; Joseph Tucek, HPLabs; Alistair Veitch, HPLabs
Beyond Blacklists: Learning to Detect Malicious Web Sites from Suspicious URLs
  Justin Ma*, UC San Diego; Lawrence Saul, UCSD; Stefan Savage, UC San Diego; Geoffrey Voelker, UC San Diego
BGP-lens: Patterns and Anomalies in Internet Routing Updates
  B. Aditya Prakash*, Carnegie Mellon University; Nicholas Valler, UCR; David Andersen, CMU; Michalis Faloutsos, UCR; Christos Faloutsos, CMU
Can We Learn a Template-Independent Wrapper for News Article Extraction from a Single Training Site?
  Junfeng Wang*, Zhejiang University; Xiaofei He; Can Wang; Jian Pei, Simon Fraser University; Jiajun Bu; Chun Chen; Ziyu Guan; Wei Vivian Zhang, Microsoft
Catching the Drift: Learning Broad Matches from Clickthrough Data
  Sonal Gupta*, University of Texas at Austin; Mikhail Bilenko, Microsoft Research; Matthew Richardson, Microsoft Research
Clustering of Event Logs Using Iterative Partitioning
  Adetokunbo Makanju*, Dalhousie University; Nur Zincir-Heywood, Dalhousie University; Evangelos Milios, Dalhousie University
COA: Finding Novel Patents through Text Analysis
  Mohammad Al Hasan*, RPI; W. Scott Spangler, IBM Corporation; Thomas Griffin, IBM Corporation; Alfredo Alba, IBM Corporation
Enabling Analysts in Managed Services for CRM Analytics
  Indrajit Bhattacharya, IBM Research; Shantanu Godbole*, IBM Research; Ajay Gupta, IBM Research; Ashish Verma, IBM Research; Jeff Achtermann, IBM MBPS; Kevin English, IBM
Entity Discovery and Assignment for Opinion Mining Applications
  Xiaowen Ding*, Univ of Illinois at Chicago; Bing Liu, UIC; Lei Zhang, UIC
Grocery Shopping Recommendations Based on Basket-Sensitive Random Walk
  Ming Li*, Unilever UK; Malcolm Dias, Unilever UK; Ian Jarman, Liverpool John Moores University; Wael El-Deredy, University of Manchester; Paulo Lisboa, Liverpool John Moores University
Incorporating Site-Level Knowledge for Incremental Crawling of Web Forums: A List-wise Strategy
  Jiang-Ming Yang*, Microsoft Research Asia; Rui Cai, Microsoft Research; Chunsong Wang, University of Wisconsin-Madison; Hua Huang, Beijing University of Posts and Telecommunications; Lei Zhang, Microsoft Research Asia; Wei-Ying Ma, Microsoft Research Asia
Intelligent File Scoring System for Malware Detection from the Gray List
  Tao Li*, Florida International University
Learning Dynamic Temporal Graphs for Oil-drilling Equipment Monitoring System
  Yan Liu*, IBM Research; Jayant Kalagnanam
Migration Motif: A Spatial-Temporal Pattern Mining Approach for Financial Markets
  Xiaoxi Du, KSU; Ruoming Jin*, Kent State University; Liang Ding, Kent State University; Victor Lee, Kent State University; John Thornton, Kent State University
Mining Brain Region Connectivity for Alzheimer's Disease Study via Sparse Inverse Covariance Estimation
  Liang Sun*, Arizona State University; Rinkal Patel, Arizona State University; Jun Liu, Arizona State University; Kewei Chen, Neuroimaging Banner Alzheimer's Institute; Teresa Wu, Arizona State University; Jing Li, Arizona State University; Eric Reiman, Banner Alzheimer's Institute and Banner PET Center; Jieping Ye, Arizona State University
Modeling and Predicting User Behavior in Sponsored Search
  Joshua Attenberg*, NYU Polytechnic Institute; Torsten Suel, Yahoo Research; Sandeep Pandey, Yahoo Research
Named Entity Mining from Click-Through Log Using Weakly Supervised Latent Dirichlet Allocation
  Gu Xu*, Microsoft Research Asia; Shuang-Hong Yang, Georgia Tech; Hang Li, Microsoft Research Asia
Network Anomaly Detection based on Eigen Equation Compression
  Shunsuke Hirose*, NEC Corporation; Kenji Yamanishi; Takayuki Nakata; Ryohei Fujimaki
OLAP on Search Logs: An Infrastructure Supporting Data-Driven Applications in Search Engines
  Bin Zhou, Simon Fraser University; Daxin Jiang*, MSRA; Jian Pei, Simon Fraser University; Hang Li, Microsoft Research Asia
OpinionMiner: A Machine Learning System for Web Opinion Mining and Extraction
  Wei Jin*, North Dakota State University; Hung Hay Ho
Pervasive Parallelism in Data Mining: Dataflow solution to Co-clustering Large and Sparse Netflix Data
  Srivatsava Daruru, University of Texas at Austin; Nena Marin*, Pervasive Software; Matthew Walker, Pervasive Software; Joydeep Ghosh, The University of Texas at Austin
Predicting Bounce Rates in Sponsored Search Advertisements
  D. Sculley*, Google, Inc.; Robert Malkin, Google, Inc.; Sugato Basu, Google, Inc.; Roberto Bayardo, Google
PSkip: Estimating relevance ranking quality from web search clickthrough data
  Kuansan Wang*, Microsoft Research; Toby Walker; Zijian Zheng
Query Result Clustering for Object-level Search
  Jongwuk Lee; Seung-won Hwang*, Postech; Zaiqing Nie; Ji-Rong Wen, Microsoft Research Asia
Improving Classification Accuracy Using Automatically Extracted Training Data
  Ariel Fuxman*, Microsoft, USA; Anitha Kanna, Microsoft, USA; Andrew Goldberg, University of Wisconsin; Rakesh Agrawal, Microsoft; Panayiotis Tsaparas, Microsoft
Sentiment Analysis of Blogs by Combining Lexical Knowledge with Text Classification
  Prem Melville*, IBM; Wojciech Gryc; Richard Lawrence, IBM, USA
Seven Pitfalls to Avoid when Running Controlled Experiments on the Web
  Thomas Crook, Microsoft; Brian Frasca, Microsoft; Ron Kohavi*, Microsoft; Roger Longbotham, Microsoft
SNARE: A Link Analytic System for Graph Labeling and Risk Detection
  Mary McGlohon*, Carnegie Mellon University; Stephen Bay, PricewaterhouseCoopers; Markus Anderle, PricewaterhouseCoopers; David Steier, PricewaterhouseCoopers; Christos Faloutsos, CMU
Sustainable Operation and Management of Data Center Chillers using Temporal Data Mining
  Debprakash Patnaik, Virginia Tech; Manish Marwah, HP Labs; Ratnesh Sharma, HP Labs; Naren Ramakrishnan*, Virginia Tech
Towards a Universal Marketplace over the Web: Statistical Multi-label Classification of Service Provider Forms with Simulated Annealing
  Kivanc Ozonat*, HP Labs
Towards Combining Web Classification and Web Information Extraction: A Case Study
  Ping Luo*, HP Labs China

Paper Submission (Expired)
Research Track Papers (Expired)

Call for Papers
We invite submissions on all aspects of knowledge discovery and data mining. We especially encourage papers relevant to KDD that cut across disciplines such as machine learning, pattern recognition, statistics, databases, theory, mathematical optimization, data compression, cryptography, and high performance computing. Papers are expected to describe innovative ideas and solutions that are rigorously evaluated and well-presented. Submissions that describe minor variations of existing methods or only make small or questionable improvements to existing algorithms are discouraged. Areas of interest include, but are not limited to:
All submitted papers will be judged based on their technical merit, rigor, significance, originality, repeatability, relevance, and clarity. Papers submitted to KDD'09 should be original work, not previously published in a peer-reviewed conference or journal. Substantially similar versions of the paper submitted to KDD'09 should not be under review in another peer-reviewed conference or journal during the KDD-09 reviewing period.

Repeatability guideline: Repeatability is a cornerstone of any scientific endeavor. To ensure the long term viability of the research output of the SIGKDD community, we encourage open-source/public availability of the code and the datasets. In those cases where this is not possible due to proprietary considerations, every effort should be made to provide the binary executable and to apply the approach to similar publicly available datasets. When the latter is also not possible, please include a justification to that effect. Furthermore, the description of experimental results in submitted papers should be accompanied by all relevant implementation details and exact parameter specifications.

Peter Flach and Mohammed Zaki, KDD'09 Program Co-Chairs

Research Track Paper Preparation and Submission Guidelines (Expired)
All papers should adhere to the ACM proceedings template, available from: http://www.acm.org/sigs/publications/proceedings-templates . Papers are allowed at most *nine* (9) full pages, including *all* figures, tables, references and appendix (if any). See writing guidelines below for additional details. The papers will *not* be reviewed double-blind, thus the authors do not have to obfuscate references to their prior work.

In writing your paper, we suggest you try to address the following questions, credited to George Heilmeier:
In light of the above principles, we suggest the following guidelines for the paper content. Note that the headings and the structure below are meant to be general categories; please exercise your discretion and creativity to make the paper as comprehensible as possible to the readers and reviewers.

Abstract
Try to include the following:
Motivation & Significance
What is the problem and why is it important or significant?

Problem Statement
Formal definition of the problem with any preliminary concepts.

Prior Work & Limitations
What are the existing approaches, and their limitations?

Theory/Algorithm
Experiments or other Evidence of Success
Discussion and Future Work
Describe insights you gained, the limitations and applicability of your work, and directions for future research. Every solution has limitations, which should be explicitly mentioned.

References
Include the most relevant works, making sure all citations are complete (including editors, publishers, page numbers, etc.).

Appendix
You should use the appendix for supporting details. For example, you may use it to convey detailed technical/practical aspects of your implementation. You may use the appendix for theorem proofs, or for additional experimental results. Include pointers in the main paper to relevant sections in the appendix. The appendix is an integral part of the paper, since it will provide details that are important for a proper appreciation of your work (e.g., for replicating or extending it, or for comparison). However, it should be possible on a first read-through to get a good understanding of the paper's contribution from the main part alone. Structuring the paper in this way provides a service to the reader, by separating main ideas from technical details.

Submission (Expired - No more submissions accepted.)
Please submit your paper electronically at the link below. First, sign up. Then, choose the appropriate track for your paper.
https://cmt.research.microsoft.com/KDD2009/

Industrial/Government Applications Track (Expired)

Call for Papers
Due Feb 6, 2009
June 28 - July 1, 2009. Paris, France.

The Industrial/Government Applications Track of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009) will highlight challenges, lessons, and research issues arising out of deploying applications of KDD technology. The focus is on promoting the exchange of ideas between researchers and practitioners of data mining. The KDD-2009 Industrial/Government Applications (I/G) Track seeks to:
The I/G Applications Track solicits papers describing implementations of KDD solutions relevant to commercial or government settings. The primary emphasis is on papers that advance our understanding of practical, applied, or pragmatic issues and highlight new research challenges in real KDD applications. Applications can be in any field including, but not limited to: e-commerce, medical and pharmaceutical, defense, public policy, engineering, manufacturing, telecommunications, and government. As KDD is being held in Europe for the first time, we enthusiastically seek contributions from European authors and on European projects. The I/G Applications Track will consist of competitively selected contributed papers, presented in oral and/or poster form, as well as invited talks. We envision submissions along four sub-areas:
Emerging application and technology papers discuss prototype applications, tools for focused domains or tasks, useful techniques or methods, useful system architectures, scalability enablers, tool evaluations, or integration of KDD and other technologies.
Case studies describe deployed projects with measurable benefits that include KDD technology. Such papers need to demonstrate the importance and general impact of the work clearly.
Comparative studies compare and contrast KDD technologies using specific examples (without being a product advertisement).
Pragmatic issues and considerations include important practical and research considerations, approaches, and architectures that enable successful applications.

Submitters are encouraged (but not required) to select one (or more) of these sub-areas for their papers. In their submission, authors are required to explain why the application is important, the specific need for KDD technology to solve the problem (including why other methods, perhaps not based on data mining, may fall short), and any innovations or lessons learned in the solution.

Submission (Expired - No more submissions accepted.)
Please submit your paper electronically at the link below. First, sign up. Then, choose the appropriate track for your paper.
https://cmt.research.microsoft.com/KDD2009/
KDD 2009 will also feature keynote presentations, a research track, workshops, tutorials, and the KDD Cup competition.

I/G Applications Track Co-Chairs