ACM SIGKDD

KDD-2002: Accepted Papers



308 paper submissions (35 industrial track), 13 workshop proposals, 14 tutorial proposals
44 full papers and 44 posters accepted, 6 workshops accepted, 6 tutorials accepted

Full Research Papers:

Bayesian analysis of massive datasets via particle filters
Greg Ridgeway, David Madigan

Scalable Robust Covariance and Correlation Estimates for Data Mining
Fatemah A. Alqallaf, Kjell P. Konis, R. Douglas Martin, Ruben H. Zamar

MARK: A Boosting Algorithm for Heterogeneous Kernel Models
Kristin P. Bennett, Michinari Momma, Mark J. Embrechts

Selecting the Right Interestingness Measure for Association Patterns
Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava

DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints
Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker White

Querying Multiple Sets of Discovered Rules
Alexander Tuzhilin, Bing Liu

Mining Knowledge-Sharing Sites for Viral Marketing
Matthew Richardson, Pedro Domingos

Efficiently Mining Frequent Trees in a Forest
Mohammed Zaki

ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs
Christopher R. Palmer, Phillip B. Gibbons, Christos Faloutsos

Bursty and Hierarchical Structure in Streams
Jon Kleinberg

On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration
Eamonn Keogh, Shruti Kasetty

Query, Analysis, and Visualization of Hierarchically Structured Data using Polaris
Christopher Stolte, Diane Tang, Pat Hanrahan

On Interactive Visualization of high-dimensional Data using the Hyperbolic Plane
Joerg Walter, Helge Ritter

Optimizing Search Engines Using Clickthrough Data
Thorsten Joachims

Relational Markov Models and their Application to Adaptive Web Navigation
Corin R. Anderson, Pedro Domingos, Daniel S. Weld

Pattern Discovery in Sequences under a Markov Assumption
Darya Chudova, Padhraic Smyth

On Effective Classification of Strings with Wavelets
Charu C. Aggarwal

Shrinkage Estimator Generalizations of Proximal Support Vector Machines
Deepak K. Agarwal

Hierarchical Model-Based Clustering of Large Datasets Through Fractionation and Refractionation
Jeremy Tantrum, Alejandro Murua, Werner Stuetzle

Enhanced Word Clustering for Hierarchical Text Classification
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Kumar

A Parallel Learning Algorithm for Text Classification
Canasai Kruengkrai, Chuleerat Jaruskulchai

A Refinement Approach to Handling Model Misfit in Text Categorization
Haoran Wu, Tong Heng Phang, Bing Liu, Xiaoli Li

Privacy Preserving Mining of Association Rules
Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, Johannes Gehrke

Mining Frequent Item Sets by Opportunistic Projection
Junqiang Liu, Yunhe Pan, Ke Wang, Jiawei Han

PEBL: Positive Example Based Learning for Web Page Classification Using SVM
Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang

Web Site Mining: A new way to spot Competitors, Customers and Suppliers in the World Wide Web
Martin Ester, Hans-Peter Kriegel, Matthias Schubert

Sequential Cost-Sensitive Decision Making with Reinforcement Learning
Edwin Pednault, Naoki Abe, Bianca Zadrozny

Interactive Deduplication using Active Learning
Sunita Sarawagi, Anuradha Bhamidipaty

Transforming Data to Satisfy Privacy Constraints
Vijay S. Iyengar

Exploiting Unlabeled Data in Ensemble Methods
Kristin P. Bennett, Ayhan Demiriz, Richard Maclin

Predicting Rare Classes: Can Boosting Make Any Weak Learner Strong?
Mahesh V. Joshi, Ramesh C. Agarwal, Vipin Kumar

Efficient Handling of High-Dimensional Feature Spaces by Randomized Classifier Ensembles
Aleksander Kolcz, Xiaomei Sun, Jugal Kalita

Full Industrial/Application Papers:

From Run-time Behavior to Usage Scenarios: An Interaction-pattern Mining Approach
Mohammad El-Ramly, Eleni Stroulia, Paul Sorenson

Exploiting Response Models - Optimizing Cross-Sell and Up-Sell Opportunities in Banking
Andrew Storey, Marc-David Cohen

Customer Lifetime Value Modeling and Its Use for Customer Retention Planning
Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Yizhak Idan

Mining Product Reputations on the Web
Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, Toshikazu Fukushima

Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification
Sheila Tejada, Craig A. Knoblock, Steven Minton

A System for Real-time Competitive Market Intelligence
Sholom M. Weiss, Naval K. Verma

Mining Intrusion Detection Alarms for Actionable Knowledge
Klaus Julisch, Marc Dacier

Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks
Matthew V. Mahoney, Philip K. Chan

ADMIT: Anomaly-based Data Mining for Intrusions
Karlton Sequeira, Mohammed Zaki

Handling Very Large Numbers of Association Rules in the Analysis of Microarray Data
Alexander Tuzhilin, Gediminas Adomavicius

On the potential of domain literature for clustering and for Bayesian network learning
Peter Antal, Patrick Glenisson, Geert Fannes

Mining Heterogeneous Gene Expression with Time Lagged Recurrent Neural Networks
Yulan Liang, Arpad Kelemen

Poster Papers:

Collaborative Crawling: Mining User Experiences for Topical Resource Discovery
Charu C. Aggarwal

Sequential PAttern Mining Using Bitmap Representation
Jay Ayres, Jason Flannick, Johannes Gehrke, Tomi Yiu

Frequent Term-Based Text Clustering
Florian Beil, Martin Ester, Xiaowei Xu

A Theoretical Framework for Learning from a Pool of Disparate Data Sources
Shai Ben-David, Johannes Gehrke, Reba Schuller

Topics in 0-1 data
Ella Bingham, Heikki Mannila, Jouni K. Seppanen

Extracting Decision Trees From Trained Neural Networks
Olcay Boz

A New Two-Phase Sampling Based Algorithm for Discovering Association Rules
Bin Chen, Peter Haas, Peter Scheuermann

CVS: A Correlation-Verification Based Smoothing Technique on Information Retrieval and Term Clustering
Christina Yip Chung, Bin Chen

Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration
William W. Cohen, Jacob Richman

SECRET: A Scalable Linear Regression Tree Algorithm
Alin Dobra, Johannes Gehrke

Statistical Modeling of Large-Scale Simulation Data
Tina Eliassi-Rad, Terence Critchlow, Ghaleb Abdulla

Tumor Cell Identification using Features Rules
Bin Fang, Wynne Hsu, Mong Li Lee

Integrating Feature and Instance Selection for Text Classification
Dimitris Fragoudis, Dimitris Meretakis, Spiros Likothanassis

SyMP: An Efficient Clustering Approach to Identify Clusters of Arbitrary Shapes in Large Data Sets
Hichem Frigui

Scaling multi-class Support Vector Machines using inter-class confusion
Shantanu Godbole, Sunita Sarawagi, Soumen Chakrabarti

Visualization Support for a User-Centered KDD Process
TuBao Ho, TrongDung Nguyen, DungDuc Nguyen

Mining Complex Models from Arbitrarily Large Databases in Constant Time
Geoff Hulten, Pedro Domingos

A Model for Discovering Customer Value for E-Content
Srinivasan Jagannathan, Jayanth Nayak, Kevin Almeroth, Markus Hofmann

SimRank: A Measure of Structural-Context Similarity
Glen Jeh, Jennifer Widom

Similarity Measure Based on Partial Information of Time series
Xiaoming Jin, Yuchang Lu, Chunyi Shi

Finding Surprising Patterns in a Time Series Database In Linear Time and Space
Eamonn Keogh, Stefano Lonardi, Bill Yuan-chi Chiu

Clustering Seasonality Patterns in the Presence of Errors
Mahesh Kumar, Nitin R. Patel, Jonathan Woo

Construct robust rule sets for classification
Jiuyong Li, Rodney Topor, Hong Shen

Instability of Decision Tree Classification Algorithms
Ruey-Hsia Li, Geneva G. Belford

Distributed Data Mining in a Chain Store Database of Short Transactions
Cheng-Ru Lin, Chang-Hung Lee, Ming-Syan Chen, Philip S. Yu

A Robust and Efficient Clustering Algorithm based on Cohesion Self-Merging
Cheng-Ru Lin, Ming-Syan Chen

Discovering Informative Content Blocks from Web Documents
Shian-Hua Lin, Jan-Ming Ho

Collusion in The U.S. Crop Insurance Program: Applied Data Mining
Bert B. Little, Walter L. Johnston, Ashley C. Lovell, Roderick M. Rejesus, Steve A. Steed

Incremental Context Mining for Adaptive Document Classification
Rey-Long Liu, Yun-Ling Lu

Evaluating Classifiers' Performance in a Constrained Environment
Anna Olecka

Discovering Word Senses from Text
Patrick Pantel, Dekang Lin

Combining Clustering and Co-training to Enhance Text Classification Using Unlabelled Data
Bhavani Raskutti, Herman Ferra, Adam Kowalczyk

Single-shot Detection of Multiple Categories of Text using Parametric Mixture Models
Naonori Ueda, Kazumi Saito

What's the Code? Automatic Classification of Source Code Archives
Secil Ugurel, Robert Krovetz, Lee Giles, David M. Pennock, Eric J. Glover, Hongyuan Zha

Privacy Preserving Association Rule Mining in Vertically Partitioned Data
Jaideep Vaidya, Chris Clifton

Non-Linear Dimensionality Reduction Techniques for Classification and Visualization
Michail Vlachos, Carlotta Domeniconi, Dimitris Gunopulos, George Kollios, Nick Koudas

Item Selection By "Hub-Authority" Profit Ranking
Ke Wang, Ming-Yen Thomas Su

Discovery Net: Towards a Grid of Knowledge Discovery
Vasa Curcin, Moustafa Ghanem, Yike Guo, Martin Kohler, Anthony Rowe, Jameel Syed, Patrick Wendel

Making every bit count: Fast nonlinear axis scaling
Leejay Wu, Christos Faloutsos

B-EM: A Classifier Incorporating Bootstrap with EM Approach for Data Mining
Xintao Wu, Jianping Fan, Kalpathi R. Subramanian

A Unifying Framework for Outlier Detection and Change Point Detection from Non-stationary Time Series Data
Kenji Yamanishi, Jun-ichi Takeuchi

CLOPE: A Fast and Effective Clustering Algorithm for Transactional Data
Yiling Yang, Xudong Guan, Jinyuan You

Topic-conditioned Novelty Detection
Yiming Yang, Jian Zhang, Jaime Carbonell, Chun Jin

Transforming classifier scores into accurate multiclass probability estimates
Bianca Zadrozny, Charles Elkan

Last modified: June 18th, 2002 by the KDD-2002 Webmaster (zaiane@cs.ualberta.ca)