KDD-2002: Accepted Papers
| 308 paper submissions (35 industrial track) | 13 workshop proposals | 14 tutorial proposals |
| 44 full papers and 44 posters accepted | 6 workshops accepted | 6 tutorials accepted |
Full Research Papers:
- Bayesian analysis of massive datasets via particle filters
- Greg Ridgeway, David Madigan
- Scalable Robust Covariance and Correlation Estimates for Data Mining
- Fatemah A. Alqallaf, Kjell P. Konis, R. Douglas Martin, Ruben H. Zamar
- MARK: A Boosting Algorithm for Heterogeneous Kernel Models
- Kristin P. Bennett, Michinari Momma, Mark J. Embrechts
- Selecting the Right Interestingness Measure for Association Patterns
- Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava
- DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints
- Cristian Bucila, Johannes Gehrke, Daniel Kifer, Walker White
- Querying Multiple Sets of Discovered Rules
- Alexander Tuzhilin, Bing Liu
- Mining Knowledge-Sharing Sites for Viral Marketing
- Matthew Richardson, Pedro Domingos
- Efficiently Mining Frequent Trees in a Forest
- Mohammed Zaki
- ANF: A Fast and Scalable Tool for Data Mining in Massive Graphs
- Christopher R. Palmer, Phillip B. Gibbons, Christos Faloutsos
- Bursty and Hierarchical Structure in Streams
- Jon Kleinberg
- On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration
- Eamonn Keogh, Shruti Kasetty
- Query, Analysis, and Visualization of Hierarchically Structured Data using Polaris
- Christopher Stolte, Diane Tang, Pat Hanrahan
- On Interactive Visualization of high-dimensional Data using the Hyperbolic Plane
- Joerg Walter, Helge Ritter
- Optimizing Search Engines Using Clickthrough Data
- Thorsten Joachims
- Relational Markov Models and their Application to Adaptive Web Navigation
- Corin R. Anderson, Pedro Domingos, Daniel S. Weld
- Pattern Discovery in Sequences under a Markov Assumption
- Darya Chudova, Padhraic Smyth
- On Effective Classification of Strings with Wavelets
- Charu C. Aggarwal
- Shrinkage Estimator Generalizations of Proximal Support Vector Machines
- Deepak K. Agarwal
- Hierarchical Model-Based Clustering of Large Datasets Through Fractionation and Refractionation
- Jeremy Tantrum, Alejandro Murua, Werner Stuetzle
- Enhanced Word Clustering for Hierarchical Text Classification
- Inderjit S. Dhillon, Subramanyam Mallela, Rahul Kumar
- A Parallel Learning Algorithm for Text Classification
- Canasai Kruengkrai, Chuleerat Jaruskulchai
- A Refinement Approach to Handling Model Misfit in Text Categorization
- Haoran Wu, Tong Heng Phang, Bing Liu, Xiaoli Li
- Privacy Preserving Mining of Association Rules
- Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, Johannes Gehrke
- Mining Frequent Item Sets by Opportunistic Projection
- Junqiang Liu, Yunhe Pan, Ke Wang, Jiawei Han
- PEBL: Positive Example Based Learning for Web Page Classification Using SVM
- Hwanjo Yu, Jiawei Han, Kevin Chen-Chuan Chang
- Web Site Mining: A new way to spot Competitors, Customers and Suppliers in the World Wide Web
- Martin Ester, Hans-Peter Kriegel, Matthias Schubert
- Sequential Cost-Sensitive Decision Making with Reinforcement Learning
- Edwin Pednault, Naoki Abe, Bianca Zadrozny
- Interactive Deduplication using Active Learning
- Sunita Sarawagi, Anuradha Bhamidipaty
- Transforming Data to Satisfy Privacy Constraints
- Vijay S. Iyengar
- Exploiting Unlabeled Data in Ensemble Methods
- Kristin P. Bennett, Ayhan Demiriz, Richard Maclin
- Predicting Rare Classes: Can Boosting Make Any Weak Learner Strong?
- Mahesh V. Joshi, Ramesh C. Agarwal, Vipin Kumar
- Efficient Handling of High-Dimensional Feature Spaces by Randomized Classifier Ensembles
- Aleksander Kolcz, Xiaomei Sun, Jugal Kalita
Full Industrial/Application Papers:
- From Run-time Behavior to Usage Scenarios: An Interaction-pattern Mining Approach
- Mohammad El-Ramly, Eleni Stroulia, Paul Sorenson
- Exploiting Response Models - Optimizing Cross-Sell and Up-Sell Opportunities in Banking
- Andrew Storey, Marc-David Cohen
- Customer Lifetime Value Modeling and Its Use for Customer Retention Planning
- Saharon Rosset, Einat Neumann, Uri Eick, Nurit Vatnik, Yizhak Idan
- Mining Product Reputations on the Web
- Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, Toshikazu Fukushima
- Learning Domain-Independent String Transformation Weights for High Accuracy Object Identification
- Sheila Tejada, Craig A. Knoblock, Steven Minton
- A System for Real-time Competitive Market Intelligence
- Sholom M. Weiss, Naval K. Verma
- Mining Intrusion Detection Alarms for Actionable Knowledge
- Klaus Julisch, Marc Dacier
- Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks
- Matthew V. Mahoney, Philip K. Chan
- ADMIT: Anomaly-based Data Mining for Intrusions
- Karlton Sequeira, Mohammed Zaki
- Handling Very Large Numbers of Association Rules in the Analysis of Microarray Data
- Alexander Tuzhilin, Gediminas Adomavicius
- On the potential of domain literature for clustering and for Bayesian network learning
- Peter Antal, Patrick Glenisson, Geert Fannes
- Mining Heterogeneous Gene Expression with Time Lagged Recurrent Neural Networks
- Yulan Liang, Arpad Kelemen
Poster Papers:
- Collaborative Crawling: Mining User Experiences for Topical Resource Discovery
- Charu C. Aggarwal
- Sequential PAttern Mining Using Bitmap Representation
- Jay Ayres, Jason Flannick, Johannes Gehrke, Tomi Yiu
- Frequent Term-Based Text Clustering
- Florian Beil, Martin Ester, Xiaowei Xu
- A Theoretical Framework for Learning from a Pool of Disparate Data Sources
- Shai Ben-David, Johannes Gehrke, Reba Schuller
- Topics in 0-1 data
- Ella Bingham, Heikki Mannila, Jouni K. Seppanen
- Extracting Decision Trees From Trained Neural Networks
- Olcay Boz
- A New Two-Phase Sampling Based Algorithm for Discovering Association Rules
- Bin Chen, Peter Haas, Peter Scheuermann
- CVS: A Correlation-Verification Based Smoothing Technique on Information Retrieval and Term Clustering
- Christina Yip Chung, Bin Chen
- Learning to Match and Cluster Large High-Dimensional Data Sets For Data Integration
- William W. Cohen, Jacob Richman
- SECRET: A Scalable Linear Regression Tree Algorithm
- Alin Dobra, Johannes Gehrke
- Statistical Modeling of Large-Scale Simulation Data
- Tina Eliassi-Rad, Terence Critchlow, Ghaleb Abdulla
- Tumor Cell Identification using Features Rules
- Bin Fang, Wynne Hsu, Mong Li Lee
- Integrating Feature and Instance Selection for Text Classification
- Dimitris Fragoudis, Dimitris Meretakis, Spiros Likothanassis
- SyMP: An Efficient Clustering Approach to Identify Clusters of Arbitrary Shapes in Large Data Sets
- Hichem Frigui
- Scaling multi-class Support Vector Machines using inter-class confusion
- Shantanu Godbole, Sunita Sarawagi, Soumen Chakrabarti
- Visualization Support for a User-Centered KDD Process
- TuBao Ho, TrongDung Nguyen, DungDuc Nguyen
- Mining Complex Models from Arbitrarily Large Databases in Constant Time
- Geoff Hulten, Pedro Domingos
- A Model for Discovering Customer Value for E-Content
- Srinivasan Jagannathan, Jayanth Nayak, Kevin Almeroth, Markus Hofmann
- SimRank: A Measure of Structural-Context Similarity
- Glen Jeh, Jennifer Widom
- Similarity Measure Based on Partial Information of Time series
- Xiaoming Jin, Yuchang Lu, Chunyi Shi
- Finding Surprising Patterns in a Time Series Database In Linear Time and Space
- Eamonn Keogh, Stefano Lonardi, Bill Yuan-chi Chiu
- Clustering Seasonality Patterns in the Presence of Errors
- Mahesh Kumar, Nitin R. Patel, Jonathan Woo
- Construct robust rule sets for classification
- Jiuyong Li, Rodney Topor, Hong Shen
- Instability of Decision Tree Classification Algorithms
- Ruey-Hsia Li, Geneva G. Belford
- Distributed Data Mining in a Chain Store Database of Short Transactions
- Cheng-Ru Lin, Chang-Hung Lee, Ming-Syan Chen, Philip S. Yu
- A Robust and Efficient Clustering Algorithm based on Cohesion Self-Merging
- Cheng-Ru Lin, Ming-Syan Chen
- Discovering Informative Content Blocks from Web Documents
- Shian-Hua Lin, Jan-Ming Ho
- Collusion in The U.S. Crop Insurance Program: Applied Data Mining
- Bert B. Little, Walter L. Johnston, Ashley C. Lovell, Roderick M. Rejesus, Steve A. Steed
- Incremental Context Mining for Adaptive Document Classification
- Rey-Long Liu, Yun-Ling Lu
- Evaluating Classifiers' Performance in a Constrained Environment
- Anna Olecka
- Discovering Word Senses from Text
- Patrick Pantel, Dekang Lin
- Combining Clustering and Co-training to Enhance Text Classification Using Unlabelled Data
- Bhavani Raskutti, Herman Ferra, Adam Kowalczyk
- Single-shot Detection of Multiple Categories of Text using Parametric Mixture Models
- Naonori Ueda, Kazumi Saito
- What's the Code? Automatic Classification of Source Code Archives
- Secil Ugurel, Robert Krovetz, Lee Giles, David M. Pennock, Eric J. Glover, Hongyuan Zha
- Privacy Preserving Association Rule Mining in Vertically Partitioned Data
- Jaideep Vaidya, Chris Clifton
- Non-Linear Dimensionality Reduction Techniques for Classification and Visualization
- Michail Vlachos, Carlotta Domeniconi, Dimitris Gunopulos, George Kollios, Nick Koudas
- Item Selection By "Hub-Authority" Profit Ranking
- Ke Wang, Ming-Yen Thomas Su
- Discovery Net: Towards a Grid of Knowledge Discovery
- Vasa Curcin, Moustafa Ghanem, Yike Guo, Martin Kohler, Anthony Rowe, Jameel Syed, Patrick Wendel
- Making every bit count: Fast nonlinear axis scaling
- Leejay Wu, Christos Faloutsos
- B-EM: A Classifier Incorporating Bootstrap with EM Approach for Data Mining
- Xintao Wu, Jianping Fan, Kalpathi R. Subramanian
- A Unifying Framework for Outlier Detection and Change Point Detection from Non-stationary Time Series Data
- Kenji Yamanishi, Jun-ichi Takeuchi
- CLOPE: A Fast and Effective Clustering Algorithm for Transactional Data
- Yiling Yang, Xudong Guan, Jinyuan You
- Topic-conditioned Novelty Detection
- Yiming Yang, Jian Zhang, Jaime Carbonell, Chun Jin
- Transforming classifier scores into accurate multiclass probability estimates
- Bianca Zadrozny, Charles Elkan