Accepted Papers


A Concept-based Model for Enhancing Text Categorization - Shady Shehata, Fakhri Karray, and Mohamed Kamel

A Fast Algorithm for Finding Frequent Episodes in Event Streams - Srivatsan Laxman, Sastry P. S., and Unnikrishnan K. P.

A Framework For Community Identification in Dynamic Social Networks - Chayant Tantipathananandh, Tanya Y. Berger-Wolf, and David Kempe

A Framework for Simultaneous Co-clustering and Learning from Complex Data - Meghana Deodhar and Joydeep Ghosh

A Learning Framework using Green's Function and Kernel Regularization with Application to Recommender System - Chris Ding, Rong Jin, Tao Li, and Horst Simon

A Probabilistic Framework for Relational Clustering - Bo Long, Zhongfei Zhang, and Philip S. Yu

A Scalable Modular Convex Solver for Regularized Risk Minimization - Quoc Le, Alex Smola, Choon Hui Teo, and Vishwanathan S V N

A Spectral Clustering Approach to Optimally Combining Numerical Vectors with a Modular Network - Motoki Shiga, Ichigaku Takigawa, and Hiroshi Mamitsuka

Active Exploration for Learning Rankings from Clickthrough Data - Filip Radlinski and Thorsten Joachims

Applying Collaborative Filtering Techniques to Movie Search for Better Ranking and Browsing - Seung-Taek Park and David Pennock

Association Analysis-based Transformations for Protein Interaction Networks: A Function Prediction Case Study - Gaurav Pandey, Michael Steinbach, Rohit Gupta, Tushar Garg, and Vipin Kumar

Automatic Labeling of Multinomial Topic Models - Qiaozhu Mei, Xuehua Shen, and ChengXiang Zhai

BoostCluster: Boosting Clustering by Pairwise Constraints - Yi Liu, Rong Jin, Anil Jain, and Pavan Mallapragada

Canonicalization of Database Records using Adaptive Similarity Measures - Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli, and Andrew McCallum

Characterising the Difference - Jilles Vreeken, Matthijs van Leeuwen, and Arno Siebes

Co-clustering based Classification for Out-of-domain Documents - Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu

Constraint-Driven Clustering - Rong Ge, Martin Ester, Wen Jin, and Ian Davidson

Content-based Document Routing and Index Partitioning for Scalable Similarity based Searches in a Large Corpus - Deepavali Bhagwat, Kave Eshghi, and Pankaj Mehra

Correlation Search in Graph Databases - Yiping Ke, James Cheng, and Wilfred Ng

Cost-effective Outbreak Detection in Networks - Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance

Cross-language information retrieval using PARAFAC2 - Peter Chew, Brett Bader, Tamara Kolda, and Ahmed Abdelali

Density-Based Clustering of Real-Time Stream Data - Yixin Chen and Li Tu

Detecting Anomalous Records in Categorical Datasets - Kaustav Das and Jeff Schneider

Detecting Motifs Under Uniform Scaling - Dragomir Yankov, Eamonn Keogh, Jose Medina, Bill Chiu, and Victor Zordan

Detecting research topics via the correlation between graphs and texts - Yookyung Jo, Carl Lagoze, and C. Lee Giles

Development of NeuroElectroMagnetic Ontologies (NEMO): A Framework for Mining Brain Wave Ontologies - Dejing Dou, Gwen Frishkoff, Jiawei Rong, Robert Frank, Allen Malony, and Don Tucker

Discovering the Hidden Structure of House Prices with a Non-Parametric Latent Manifold Model - Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, and Yann LeCun

Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis - Frizo Janssens, Wolfgang Glänzel, and Bart De Moor

Efficient and Effective Explanation of Change in Hierarchical Summaries - Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos, Flip Korn, Divesh Srivastava, and Neal Young

Efficient Incremental Clustering with Constraints - Ian Davidson, S.S. Ravi, and Martin Ester

Efficient Mining of Iterative Patterns for Software Specification Discovery - David Lo, Siau-Cheng Khoo, and Chao Liu

SCAN: A Structural Clustering Algorithm for Networks - Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, and Thomas A. J. Schweiger

Enhanced Max Margin Learning on Multimodal Data Mining in a Multimedia Database - Zhen Guo, Zhongfei Zhang, Eric Xing, and Christos Faloutsos

Enhancing Semi-Supervised Clustering: A Feature Projection Perspective - Wei Tang, Hui Xiong, Shi Zhong, and Jie Wu

Estimating Rates of Rare Events at Multiple Resolutions - Deepak Agarwal, Andrei Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, and Mayssam Sayyadian

Evolutionary Spectral Clustering by Incorporating Temporal Smoothness - Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, and Belle Tseng

Expertise modeling for matching papers with reviewers - David Mimno and Andrew McCallum

Exploiting Duality in Summarization with Deterministic Guarantees - Panagiotis Karras, Dimitris Sacharidis, and Nikos Mamoulis

Exploiting Underrepresented Query Aspects for Automatic Query Expansion - Daniel Crabtree, Peter Andreae, and Xiaoying Gao

Extracting Semantic Relations from Query Logs - Ricardo Baeza-Yates and Alessandro Tiberi

Fast Best-Effort Pattern Matching in Large Attributed Graphs - Hanghang Tong, Brian Gallagher, Christos Faloutsos, and Tina Eliassi-Rad

Fast Direction-Aware Proximity for Graph Mining - Hanghang Tong, Yehuda Koren, and Christos Faloutsos

Feature Selection Methods for Text Classification - Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, and Michael Mahoney

Finding low-entropy sets and trees from binary data - Hannes Heikinheimo, Eino Hinkkanen, Heikki Mannila, Taneli Mielikäinen, and Jouni Seppänen

Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns - Lisa Friedland and David Jensen

From frequent itemsets to semantically meaningful visual patterns - Junsong Yuan, Ying Wu, and Ming Yang

Generalized Component Analysis for Text with Heterogeneous Attributes - Xuerui Wang, Chris Pal, and Andrew McCallum

Hierarchical Mixture Models: a Probabilistic Analysis - Mark Sandler

Information distance from a question to an answer - Xian Zhang, Yu Hao, Xiaoyan Zhu, and Ming Li

Information Genealogy: Uncovering the Flow of Ideas in Non-Hyperlinked Document Databases - Benyah Shaparenko and Thorsten Joachims

Joint Cluster Analysis of Attribute and Relationship Data Without A-Priori Specification of the Number of Clusters - Flavia Moser, Rong Ge, and Martin Ester

Joint Optimization of Wrapper Generation and Template Detection - Shuyi Zheng, Ruihua Song, Di Wu, and Ji-Rong Wen

Knowledge Discovery of Multiple-topic Document using Parametric Mixture Model with Dirichlet Prior - Issei Sato and Hiroshi Nakagawa

Learning the Kernel Matrix in Discriminant Analysis via Quadratically Constrained Quadratic Programming - Jieping Ye, Shuiwang Ji, and Jianhui Chen

Local Decomposition for Rare Class Analysis - Junjie Wu, Hui Xiong, Peng Wu, and Jian Chen

Making Generative Classifiers Robust to Selection Bias - Andrew Smith and Charles Elkan

Mining Correlated Bursty Topic Patterns from Coordinated Text Streams - Xuanhui Wang, ChengXiang Zhai, Xiao Hu, and Richard Sproat

Mining Favorable Facets - Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, and Ke Wang

Mining Optimal Decision Trees from Itemset Lattices - Siegfried Nijssen and Elisa Fromont

Mining Statistically Important Equivalence Classes - Jinyan Li, Guimei Liu, and Limsoon Wong

Mining Templates from Search Result Records of Search Engines - Hongkun Zhao, Weiyi Meng, and Clement Yu

Modeling Relationships at Multiple Scales to Improve Accuracy of Large Recommender Systems - Robert Bell, Yehuda Koren, and Chris Volinsky

Model-Shared Subspace Boosting for Multi-label Classification - Rong Yan, Jelena Tesic, and John Smith

Multiscale Topic Tomography - Ramesh Nallapati, William W. Cohen, Susan Ditmore, John Lafferty, and Kin Ung

Nestedness and segmented nestedness - Heikki Mannila and Evimaria Terzi

Nonlinear Adaptive Distance Metric Learning for Clustering - Jianhui Chen, Zheng Zhao, Jieping Ye, and Huan Liu

On String Classification in Data Streams - Charu Aggarwal and Philip S. Yu

Parameter-free Mining of Large Time-evolving Graphs - Jimeng Sun, Spiros Papadimitriou, Philip S. Yu, and Christos Faloutsos

Partial Example Acquisition in Cost-Sensitive Learning - Victor S. Sheng and Charles X. Ling

Practical Learning from One Sided Feedback - D. Sculley

Predictive Discrete Latent Factor Models for Large Scale Dyadic Data - Deepak Agarwal and Srujana Merugu

Privacy-Preservation for Gradient Descent Methods - Li Wan, Wee Keong Ng, Shuguo Han, and Vincent Lee

Real-time Ranking with Concept Drift Using Expert Advice - Hila Becker and Marta Arias

Scalable Look-Ahead Linear Regression Trees - David Vogel, Ognian Asparouhov, and Tobias Scheffer

Semi-Supervised Classification with Hybrid Generative/Discriminative Methods - Gregory Druck, Chris Pal, Xiaojin Zhu, and Andrew McCallum

Show me the money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews - Nikolay Archak, Anindya Ghose, and Panagiotis Ipeirotis

Statistical Change Detection for Multi-Dimensional Data - Xiuyao Song, Mingxi Wu, Chris Jermaine, and Sanjay Ranka

Stochastic Processes and Temporal Data Mining - Paul Cotofrei and Kilian Stoffel

Structural and Temporal Analysis of the Blogosphere Through Community Factorization - Yun Chi, Shenghuo Zhu, Xiaodan Song, Junichi Tatemura, and Belle Tseng

Support Feature Machine for Classification of Abnormal Brain Activity - W. Art Chaovalitwongse, Ya-Ju Fan, and Rajesh Sachdeo

Temporal Causal Modeling with Graphical Granger Methods - Andrew Arnold, Yan Liu, and Naoki Abe

The Minimum Consistent Subset Cover Problem and its Applications in Data Mining - Byron Gao, Martin Ester, Oliver Schulte, and Hui Xiong

Time-Dependent Event Hierarchy Construction - Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Huan Liu, and Philip S. Yu

Tracking Multiple Topics for Finding Interesting Articles - Raymond Pon, Alfonso Cardenas, David Buttler, and Terence Critchlow

Trajectory Pattern Mining - Fosca Giannotti, Mirco Nanni, Dino Pedreschi, and Fabio Pinelli

Upping the Baseline for High-Precision Text Classifiers - Aleksander Kolcz and Wen-Tau Yih

Use of Ranked Cross Document Evidence Trails for Hypothesis Generation - Rohini Srihari, Li Xu, and Tushar Saxena

Using Hierarchical Clustering for Learning - Vincent Schickel and Boi Faltings

Very Sparse Stable Random Projections for Dimension Reduction in $l_\alpha$ ($0<\alpha\leq 2$) Norm - Ping Li

Webpage Understanding: an Integrated Approach - Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, and Hsiao-Wuen Hon

Weighting versus Pruning in Rule Validation for Detecting Network and Host Anomalies - Gaurav Tandon and Philip Chan

XProj: A Framework for Projected Structural Clustering of XML Documents - Charu Aggarwal, Na Ta, Jianyong Wang, Jianhua Feng, and Mohammed Zaki


Extracting Relevant Named Entities for Automated Expense Reimbursement - Guangyu Zhu, Timothy Bethea, and Vikas Krishna

Cleaning Disguised Missing Data: A Heuristic Approach - Ming Hua, Jian Pei

Distributed Classification in Peer-to-Peer Networks - Ping Luo, Hui Xiong

Corroborate and Learn Facts from the Web - Shubin Zhao, Jonathan Betz

iLink: Search and Routing in Social Networks - Jeffrey Davitz, Jiye Yu, Sugato Basu, David Gutelius, Alexandra Harris

Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO - Ron Kohavi, Randal M Henne, Dan Sommerfield

Relational Data Pre-Processing Techniques for Improved Securities Fraud Detection - Andrew Fast, Lisa Friedland, Marc Maier, Brian Taylor, David Jensen, Henry Goldberg, John Komoroske

An Event-based Framework for Characterizing the Evolutionary Behavior of Interaction Graphs - Sitaram Asur, Srinivasan Parthasarathy, Duygu Ucar

High Quantile Modeling for Customer Wallet Estimation with Other Applications - Claudia Perlich, Saharon Rosset, Richard Lawrence, and Bianca Zadrozny

Mining complex power networks for blackout prevention - JunHua Zhao, ZhaoYang Dong, Pei Zhang

On-board Analysis of Uncalibrated Data for a Spacecraft at Mars - Rebecca Castano, Kiri Wagstaff, Steve Chien, Timothy Stough, Benyang Tang

Short Papers

Domain-Constrained Semi-Supervised Mining of Tracking Models in Sensor Networks - Rong Pan, Junhui Zhao, Wenchen Zheng, Jeffrey Junfeng Pan, Dou Shen, Jialin Pan, Qiang Yang

Framework for Classification and Segmentation of Massive Audio Data Streams - Charu Aggarwal

LungCAD: A Clinically Approved, Machine Learning System for Lung Cancer Detection - R Bharat Rao, Jinbo Bi, Glenn Fung, Marcos Salganicoff, Nancy Obuchowski, David Naidich

Truth Discovery with Multiple Conflicting Information Providers on the Web - Xiaoxin Yin, Jiawei Han, and Philip S. Yu

Detecting Changes in Large Data Sets of Payment Card Data: A Case Study - Robert Grossman, Joseph Bugajski, Chris Curry, David Locke, and Steve Vejcik

Event Summarization for System Management - Wei Peng, Charles Perng, Tao Li, and Haixun Wang

Machine Learning for Stock Selection - Robert Yan and Charles X. Ling

IMDS: Intelligent Malware Detection System - Yanfang Ye, Dingding Wang, Tao Li, Dongyi Ye
