Research Track Accepted Papers
15. Spectral Domain-Transfer Learning. Xiao Ling, Wenyuan Dai, Gui-Rong
Xue, Qiang Yang, Yong Yu.
35. Learning Classifiers from Only Positive and Unlabeled Data. Charles
Elkan, Keith Noto.
38. Automatic Record Linkage using Seeded Nearest Neighbour and Support Vector Machine
Classification. Peter Christen.
46. SPIRAL: Efficient and Exact Model Identification for Hidden Markov Models.
Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamuro.
50. Microscopic Evolution of Social Networks. Jure Leskovec, Lars Backstrom,
Ravi Kumar, Andrew Tomkins.
52. A Family of Dissimilarity Measures between Nodes Generalizing both the Shortest-Path
and the Commute-time Distances. Luh Yen, Amin Mantrach, Masashi Shimbo,
Marco Saerens.
62. Direct Mining of Discriminative and Essential Frequent Patterns via Model-based
Search Tree. Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei
Han, Philip Yu, Olivier Verscheure.
75. Mining Preferences from Superior and Inferior Examples. Bin Jiang,
Jian Pei, Xuemin Lin, David W. Cheung, Jiawei Han.
89. Structured Metric Learning for High Dimensional Problems. Jason V.
Davis, Inderjit S. Dhillon.
92. Permu-pattern: Discovery of Mutable Permutation Patterns with Proximity Constraint.
Meng Hu, Jiong Yang, Wei Su.
99. Partitioned Logistic Regression for Spam Filtering. Ming-wei Chang,
Wen-tau Yih, Christopher Meek.
105. Finding Non-Redundant, Statistically Significant Regions in High Dimensional
Data: a Novel Approach to Projected and Subspace Clustering. Gabriela Moise,
Jorg Sander.
106. Weighted graphs and disconnected components: Patterns and a generator.
Mary McGlohon, Leman Akoglu, Christos Faloutsos.
125. Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering
Model. Yehuda Koren.
127. Discrimination-aware Data Mining. Dino Pedreschi, Salvatore Ruggieri,
Franco Turini.
140. Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams.
Albert Bifet, Ricard Gavaldà.
142. The Cost of Privacy: Destruction of Data-Mining Utility in Anonymized Data
Publishing. Justin Brickell, Vitaly Shmatikov.
149. Colibri: Fast Mining of Large Static and Dynamic Graphs. Hanghang
Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos.
153. A Sequential Dual Method for Large Scale Multi-Class Linear SVMs.
S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin.
160. Efficient Computation of Personal Aggregate Queries on Blogs. Ka Cheung
Sia, Junghoo Cho, Yun Chi, Belle L. Tseng.
163. Feedback Effects between Similarity and Social Influence in Online Communities.
David Crandall, Dan Cosley, Daniel Huttenlocher, Jon Kleinberg, Siddharth Suri.
168. CutS3VM: A Fast Semi-Supervised SVM Algorithm. Bin Zhao, Fei Wang,
Changshui Zhang.
169. Probabilistic Latent Semantic Visualization: Topic Model for Visualizing Documents.
Tomoharu Iwata, Takeshi Yamada, Naonori Ueda.
181. Structured Entity Identification and Document Categorization: Two Tasks with
One Joint Model. Indrajit Bhattacharya, Shantanu Godbole, Sachindra Joshi.
220. Categorizing and Mining Concept Drifting Data Streams. Peng Zhang,
Xingquan Zhu, Yong Shi.
251. Efficient Semi-streaming Algorithms for Local Triangle Counting in Massive
Graphs. Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis.
269. Angle-Based Outlier Detection in High-dimensional Data. Hans-Peter
Kriegel, Matthias Schubert, Arthur Zimek.
276. Efficient Ticket Routing by Resolution Sequence Mining. Qihong Shao,
Yi Chen, Shu Tao, Xifeng Yan, Nikos Anerousis.
277. Building Semantic Kernels for Text Classification using Wikipedia.
Pu Wang, Carlotta Domeniconi.
289. Unsupervised Deduplication using Cross-Field Dependencies. Robert
Hall, Charles Sutton, Andrew McCallum.
290. Interpretable Nonnegative Matrix Decompositions. Saara Hyvönen, Pauli
Miettinen, Evimaria Terzi.
291. Constraint Programming for Itemset Mining. Luc De Raedt, Tias Guns,
Siegfried Nijssen.
296. On Updates that Constrain the Features' Connections During Learning.
Omid Madani, Jian Huang.
305. FastANOVA: an Efficient Algorithm for Genome-Wide Association Study.
Xiang Zhang, Fei Zou, Wei Wang.
307. Fast Logistic Regression for Text Categorization with Variable-Length N-grams.
Georgiana Ifrim, Goekhan Bakir, Gerhard Weikum.
318. Bridging Centrality: Graph Mining from Element Level to Group Level.
Woochang Hwang, Taehyong Kim, Murali Ramanathan, Aidong Zhang.
320. Banded Structure in Binary Matrices. Gemma C. Garriga, Esa Junttila,
Heikki Mannila.
325. Model-Based Document Clustering with a Collapsed Gibbs Sampler. Daniel
David Walker, Eric K. Ringger.
335. Constructing Comprehensive Summaries of Large Event Sequences. Jerry
Kiernan, Evimaria Terzi.
340. A Bayesian Mixture Model with Linear Regression Mixing Proportions.
Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums.
342. Training Structural SVMs with Kernels Using Sampled Cuts. Chun-Nam
John Yu, Thorsten Joachims.
347. Using Ghost Edges for Classification in Sparsely Labeled Networks.
Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos.
349. Mobile Call Graphs: Beyond Power-Law and Lognormal Distributions.
Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos,
Jure Leskovec.
362. Stable Feature Selection via Dense Feature Groups. Lei Yu, Chris Ding,
Steven Loscalzo.
372. Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy
Labelers. Victor Sheng, Foster Provost, Panagiotis G. Ipeirotis.
378. Fast Collapsed Gibbs Sampling For Latent Dirichlet Allocation. Ian
Porteous, David Newman, Alexander Ihler, Arthur Asuncion, Padhraic Smyth, Max Welling.
388. iSAX: Indexing and Mining Terabyte Sized Time Series. Jin Shieh, Eamonn
Keogh.
389. Active Learning with Direct Query Construction. Charles X. Ling, Jun
Du.
394. Local Peculiarity Factor and its Application in Outlier Detection.
Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang.
400. Locality Sensitive Hash Functions Based on Concomitant Rank Order Statistics.
Kave Eshghi, Shyamsundar Rajaram.
401. Composition Attacks and Auxiliary Information in Data Privacy. Srivatsava
Ranjit Ganta, Shiva Prasad Kasiviswanathan, Adam Smith.
402. Scaling Up Text Classification for Large File Systems. George Forman,
Shyamsundar Rajaram.
404. Entity Categorization over Large Document Collections. Venkatesh Ganti,
Arnd C. König, Rares Vernica.
413. The Structure of Information Pathways in Social Communication Networks.
Gueorgi Kossinets, Jon Kleinberg, Duncan Watts.
420. SAIL: Summation-based Incremental Learning for Information-Theoretic Clustering.
Junjie Wu, Hui Xiong, Jian Chen.
426. Stream Prediction Using a Generative Model Based on Frequent Episodes in Event
Sequences. Srivatsan Laxman, Vikram Tankasali, Ryen W. White.
434. Knowledge Transfer via Multiple Model Local Structure Mapping. Jing
Gao, Wei Fan, Jing Jiang, Jiawei Han.
439. Relational Learning via Collective Matrix Factorization. Ajit P. Singh,
Geoffrey J. Gordon.
440. Classification with Partial Labels. Nam Nguyen, Rich Caruana.
442. Volatile Correlation Computation: A Checkpoint View. Wenjun Zhou,
Hui Xiong.
455. Anonymizing Transaction Databases for Publication. Yabo Xu, Ke Wang,
Ada Wai-Chee Fu, Philip S. Yu.
456. Can Complex Network Metrics Predict the Behavior of NBA Teams? Pedro
O.S. Vaz de Melo, Virgilio A.F. Almeida, Antonio A.F. Loureiro.
460. Cut-And-Stitch: Efficient Parallel Learning of Linear Dynamical Systems on
SMPs. Lei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos Faloutsos.
463. Information Extraction from Wikipedia: Moving Down the Long Tail.
Fei Wu, Raphael Hoffmann, Daniel S. Weld.
469. Bypass Rates: Reducing Query Abandonment using Negative Inferences.
Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong.
472. Community Evolution in Dynamic Multi-Mode Networks. Lei Tang, Huan
Liu, Jianping Zhang, Zohreh Nazeri.
496. Knowledge Discovery of Semantic Relationships between Words Using Nonparametric
Bayesian Graph Model. Issei Sato, Minoru Yoshida, Hiroshi Nakagawa.
518. Reconstructing Chemical Reaction Networks: Data Mining meets System Identification.
Yong Ju Cho, Naren Ramakrishnan, Yang Cao.
537. Unsupervised Feature Selection for Principal Components Analysis.
Christos Boutsidis, Michael W. Mahoney, Petros Drineas.
548. Effective Label Acquisition for Collective Classification. Mustafa
Bilgic, Lise Getoor.
554. A Semi-Supervised Approach to Rapid and Reliable Labeling of Large Data Sets.
Gyorgy J. Simon, Vipin Kumar, Zhi-Li Zhang, Francesco Bonchi.
563. Data Mining Using High Performance Data Clouds: Experimental Studies Using
Sector and Sphere. Robert Grossman, Yunhong Gu.
569. Topical Query Decomposition. Francesco Bonchi, Carlos Castillo, Debora
Donato, Aristides Gionis.
571. Automatic Identification of Quasi-experimental Designs for Discovering Causal
Knowledge. David D. Jensen, Andrew S. Fast, Brian J. Taylor, Marc E. Maier.
576. Identifying Biologically Relevant Genes via Multiple Heterogeneous Data Sources.
Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Yung Chang.
577. Anomaly Pattern Detection in Categorical Datasets. Kaustav Das, Jeff
Schneider, Daniel B. Neill.
594. Semi-supervised Learning with Data Calibration for Long-Term Time Series Forecasting.
Haibin Cheng, Pang-Ning Tan.
611. De-duping URLs via Rewrite Rules. Anirban Dasgupta, Ravi Kumar, Amit
Sasturkar.
613. FAST: A ROC-based Feature Selection Metric for Small Samples and Imbalanced
Data Classification Problems. Xue-wen Chen, Mike Wasikowski.
623. Asymmetric Support Vector Machines: Low False-Positive Learning Under the User
Tolerance. Shan-Hung Wu, Keng-Pei Lin, Chung-Min Chen, Ming-Syan Chen.
632. Quantitative Evaluation of Approximate Frequent Pattern Mining Algorithms.
Rohit Gupta, Gang Fang, Blayne Field, Michael Steinbach, Vipin Kumar.
672. Partial Least Squares Regression for Graph Mining. Hiroto Saigo, Nicole
Krämer, Koji Tsuda.
681. Generating Succinct Titles for Web URLs. Deepayan Chakrabarti, Ravi
Kumar, Kunal Punera.
685. Succinct Summarization of Transactional Databases: An Overlapped Hyperrectangle
Scheme. Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan.
686. Influence and Correlation in Social Networks. Aris Anagnostopoulos,
Ravi Kumar, Mohammad Mahdian.
692. Extracting Shared Subspace for Multi-label Classification. Shuiwang
Ji, Lei Tang, Shipeng Yu, Jieping Ye.
695. Effective and Efficient Itemset Pattern Summarization: Regression-based Approaches.
Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan.
702. Learning Subspace Kernels for Classification. Jianhui Chen, Shuiwang
Ji, Betul Ceran, Qi Li, Mingrui Wu, Jieping Ye.
750. Mining Multi-Faceted Overviews of Arbitrary Topics in a Text Collection.
Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce Schatz.
751. Joint Latent Topic Models for Text and Citations. Ramesh M. Nallapati,
Amr Ahmed, Eric P. Xing, William W. Cohen.
758. Hypergraph Spectral Learning for Multi-label Classification. Liang
Sun, Shuiwang Ji, Jieping Ye.
769. Simultaneous Tensor Subspace Selection and Clustering: The Equivalence of High
Order SVD and K-Means Clustering. Heng Huang, Chris Ding, Dijun Luo.
773. A Unified Approach for Schema Matching, Coreference, and Canonicalization.
Michael L. Wick, Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum.
787. Multi-class Cost-sensitive Boosting with p-norm Loss Functions. Aurelie C.
Lozano, Naoki Abe.
836. Combinational Collaborative Filtering for Personalized Community Recommendation.
Wen-Yen Chen, Dong Zhang, Edward Y. Chang.
850. Structured Learning for Non-Smooth Ranking Losses. Rajiv Khanna, Uma
Sawant, Soumen Chakrabarti, Chiru Bhattacharyya.
Industrial/Government Applications Track Accepted Papers
65. Privacy Leaks Using Corpus-Based Association Rules. Richard Chow, Philippe
Golle, Jessica Staddon.
80. TagMark: Reliable Estimations of RFID Tags for Business Processes.
Leonardo Weiss Ferreira Chaves, Erik Buchmann, Klemens Böhm.
124. Spotting Out Emerging Artists Using Geo-Aware Analysis of P2P Query Strings.
Noam Koenigstein, Yuval Shavitt, Tomer Tankel.
128. Identifying Authoritative Actors in Question-Answering Forums - The Case of
Yahoo! Answers. Mohamed Bouguessa, Benoit Dumoulin, Shengrui Wang.
178. Text Classification, Business Intelligence, and Interactivity: Automating C-Sat
Analysis for Services Industry. Shantanu Godbole, Shourya Roy.
183. Context-Aware Query Suggestion by Mining Click-Through and Session Data.
Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li.
221. Identifying Domain Expertise of Developers from Source Code. Renuka
Sindhgatta.
265. Temporal Pattern Discovery for Trends and Transient Effects: Its Application
to Patient Records. G. Niklas Norén, Andrew Bate, Johan Hopstadius, Kristina
Star, I. Ralph Edwards.
328. Anticipating Annotations and Emerging Trends in Biomedical Literature.
Fabian Moerchen, Mathaeus Dejori, Dmitriy Fradkin, Julien Etienne, Bernd Wachmann,
Markus Bundschus.
330. A Visual-Analytic Toolkit for Dynamic Interaction Graphs. Xintian
Yang, Sitaram Asur, Srinivasan Parthasarathy, Sameep Mehta.
337. Using Predictive Analysis to Improve Invoice-to-Cash Collection. Sai
Zeng, Prem Melville, Christian A. Lang, Ioana Boier-Martin, Conrad Murphy.
368. Automated Cyclone Discovery and Tracking using Knowledge Sharing in Multiple
Heterogeneous Satellite Data. Shen-Shyang Ho, Ashit Talukder.
391. Land Cover Change Detection: A Case Study. Shyam Boriah, Vipin Kumar,
Michael Steinbach, Christopher Potter, Steven Klooster.
435. ArnetMiner: Extraction and Mining of Academic Social Networks. Jie
Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, Zhong Su.
466. Learning Methods for Lung Tumor Markerless Gating in Image-Guided Radiotherapy.
Ying Cui, Jennifer G. Dy, Gregory C. Sharp, Brian M. Alexander, Steve B. Jiang.
563. Data Mining Using High Performance Data Clouds: Experimental Studies Using
Sector and Sphere. Robert Grossman, Yunhong Gu.
593. Learning from Multi-Topic Web Documents for Contextual Advertising.
Yi Zhang, Arun C. Surendran, John C. Platt, Mukund Narasimhan.
625. Heterogeneous Data Fusion for Alzheimer's Disease Study. Jieping Ye,
Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan,
Huan Liu, Gene Alexander, Eric Reiman.
649. Scalable and Near Real-Time Burst Detection from eCommerce Queries.
Nish Parikh, Neel Sundaresan.
650. Privacy-Preserving Cox Regression for Survival Analysis. Shipeng Yu,
Glenn Fung, Romer Rosales, Sriram Krishnan, R. Bharat Rao, Cary Dehing-Oberije,
Philippe Lambin.
688. Customer Targeting Models Using Actively-Selected Web Content. Prem
Melville, Saharon Rosset, Richard D. Lawrence.
789. The Persuasive Phase of Visualization. Christine H. Chih, Douglass
S. Parker.
806. Experimental Comparison of Scalable Online Ad Serving. Gang Wu, Brendan
Kitts.