Take home any knowledge you find.


All tutorials will be held in the Pacific Ballroom at the Newport Beach Marriott Hotel and Tennis Club.
More detail on the tutorials is available.

8:00am-6:00pm Tutorials
Pacific Ballroom
8:00-10:00am Tutorial 1
Data Mining and KDD: An Overview
Usama Fayyad, Microsoft Research and Evangelos Simoudis, IBM
10:00-10:30am Break
10:30am-12:30pm Tutorial 2
Modelling Data and Discovering Knowledge
David Hand, Open University, UK
Tutorial 3
Text Mining - Theory and Practice
Ronen Feldman, Bar-Ilan University, Israel
12:30-1:30pm Lunch
1:30-3:30pm Tutorial 4
Exploratory Data Analysis Using Interactive Dynamic Graphics
Deborah Swayne, Bell Communications Research and Dianne Cook, Iowa State University
Tutorial 5
OLAP and Data Warehousing
Surajit Chaudhuri, Microsoft Research and Umesh Dayal, Hewlett Packard Laboratories
3:30-4:00pm Break
4:00-6:00pm Tutorial 6
Visual Techniques for Exploring Databases
Daniel Keim, University of Munich, Germany
Tutorial 7
Statistical Models for Categorical Response Data
William DuMouchel, AT&T Research
6:00-6:30pm Break
6:30-10:00pm Opening Reception
California Ballroom


All KDD-97 sessions are plenary sessions and will be held in the Pacific Ballroom at the Newport Beach Marriott Hotel and Tennis Club.
A listing of the posters is available.
Demonstrations and exhibits are being held today, from 12:30 pm to 5:00 pm.
8:30-10:00am Keynote Address and Invited Talk Session
Pacific Ballroom
8:30-8:45am Welcome and Introduction
Ramasamy Uthurusamy (KDD-97 General Conference Chair), David Heckerman, Heikki Mannila, and Daryl Pregibon (KDD-97 Program Cochairs)
8:45-9:30am Keynote Address: From Large to Huge: A Statistician's Reactions to KDD & DM
Peter J. Huber, University of Bayreuth, Germany
The statistics and AI communities are confronted by the same challenge, the onslaught of ever larger data collections, but the two communities have reacted independently and differently. What could they learn from each other if they looked over the fence? What is amiss on either side?
9:30-10:00am Invited Talk: Machine Learning and KDD: New Developments
Thomas Dietterich, Oregon State University
Machine learning research is pursuing many directions relevant to KDD. This talk will review two exciting lines of research: learning with ensembles (committees, bagging, boosting, etc.) and learning with stochastic models. I will also briefly mention current research in reinforcement learning and methods for scaling up machine learning algorithms.
10:00-10:30am Coffee Break
10:30am-12:15pm Keynote Address and Poster Previews Session
Pacific Ballroom and California Ballroom, respectively
10:30-11:15am Keynote Address: Searching for Causes and Predicting Interventions
Clark Glymour, University of California at San Diego and Carnegie Mellon University
Pacific Ballroom
The problem in many data mining tasks is to predict features of a new, unobserved sample from a recorded sample, assuming the sampling distribution does not change. In causal data mining the task is to predict the features of new unobserved samples from a recorded sample, assuming any unobserved sample results from an intervention which directly alters the underlying probability distribution of some variables in a known way, and alters others indirectly through the influences of the directly manipulated variables. Data mining in the sciences and for business, military, or public policy is often concerned with discovering causal structure, although that focus is sometimes obscured by the belief that causation is especially mysterious. Causation is not probability, but it is no more (and no less) mysterious.
This lecture will illustrate why data mining for causes using techniques designed for recognition or classification is unwise, and will illustrate principles of causal data mining with several interesting cases that use both Bayesian and constraint-based methods.
11:15am-12:15pm Poster Previews (Authors A-J) Session
California Ballroom
12:15-2:00pm Lunch
Program Committee Luncheon
Room Newport North, lobby level
2:00-3:00pm Poster Preview Session
California Ballroom
2:00-3:00pm Poster Previews (Authors K-Z) Session
3:00-7:00pm Exhibits, Demos, and Poster Sessions
California Ballroom
(Coffee and soft drinks will be available in the foyer from 3:00-7:00pm.)
3:00-4:20pm Poster Session 1 (Authors A-G)
4:20-5:40pm Poster Session 2 (Authors H-O)
5:40-7:00pm Poster Session 3 (Authors P-Z)



8:30-9:45am Paper Session: Database Methodology
Pacific Ballroom
8:30-8:55am Computing Optimized Rectilinear Regions for Association Rules
Kunikazu Yoda, Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, and Takeshi Tokuyama, IBM Tokyo Research Laboratory, Japan
8:55-9:20am Density-Connected Sets and their Application for Trend Detection in Spatial Databases
Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu, University of Munich, Germany
9:20-9:45am Mining Association Rules with Item Constraints
Ramakrishnan Srikant, Quoc Vu, and Rakesh Agrawal, IBM Almaden Research Center
9:45-10:15am Coffee Break
10:15am-12:30pm Keynote Address/Invited Talks/Panel Session
Pacific Ballroom
10:15-11:00am Keynote Address: Are Spatial Data Special: A Data Mining Perspective?
Raymond Ng, University of British Columbia, Canada
From a data mining standpoint, I will discuss in this talk how spatial data are similar to and different from conventional alphanumeric data. I will use characteristic and discriminant rules as examples, and discuss how the characteristics of spatial data affect the ways these rules can be found. Finally, I will comment on the generalization from spatial data to discuss whether temporal data or image data are special.
11:00-11:30am Invited Talk: Highlights from the Joint Statistical Meetings
Jim Hodges, University of Minnesota
Statisticians and data-miners have a lot in common but their paths tend not to cross. Thus, for example, many statisticians are missing out on computing and other ideas familiar to data-miners, and data-miners sometimes re-invent things that are well-known to statisticians. This talk is intended to toss a small bridge across the gap, by summarizing some talks and trends from the 1997 Joint Statistical Meetings, held in Anaheim, California just before KDD-97.
11:30am-12:30pm Panel Discussion: Validity of Data Mining Results
Moderator: David Hand, The Open University, United Kingdom
Panelists: Jim Hodges, University of Minnesota; Reza Nakhaeizadeh, Daimler-Benz AG, Germany; Gregory Piatetsky-Shapiro, Knowledge Stream Partners; Arno Siebes, CWI, The Netherlands; and Padhraic Smyth, University of California, Irvine
Data mining methods typically rely on large-scale search for patterns or models. How can statistically valid results be ensured in this search-intensive approach? What kinds of statistical tests are appropriate? How can one reduce overfitting? What techniques exist for efficient and data-adaptive sampling strategies for specific classes of algorithms? How can one combine statistical and non-statistical measures of the utility of discovered patterns? These and other questions from the audience will be discussed by the panelists.
12:30-2:00pm Lunch
2:00-3:30pm Paper Session: Applications
Pacific Ballroom
2:00-2:30pm Detecting Atmospheric Regimes Using Cross-Validated Clustering
Padhraic Smyth, University of California, Irvine; Michael Ghil and Kayo Ide, University of California, Los Angeles; Joe Roden, Jet Propulsion Laboratory, California Institute of Technology; Andrew Fraser, Portland State University
2:30-3:00pm Automated Discovery of Active Motifs in Three Dimensional Molecules
Xiong Wang and Jason T.L. Wang, New Jersey Institute of Technology; Dennis Shasha, New York University; Bruce Shapiro, National Cancer Institute; Sitaram Dikshitulu, New Jersey Institute of Technology; Isidore Rigoutsos, IBM T. J. Watson Research Center; Kaizhong Zhang, University of Western Ontario, Canada
3:00-3:30pm JAM: Java Agents for Meta-Learning over Distributed Databases
Salvatore Stolfo, Andreas L. Prodromidis, Shelley Tselepis, Wenke Lee, and Dave W. Fan, Columbia University; Philip K. Chan, Florida Institute of Technology
3:30-4:00pm Coffee Break
4:00-6:00pm Invited Talk/Paper Presentation/Awards Presentation Session
Pacific Ballroom
4:00-4:30pm Invited Talk: Highlights from the Graphical Modeling Workshop
David Madigan, University of Washington
This talk will provide a very brief tutorial introduction to basic concepts of graphical models. A survey of current research topics will follow, highlighting in particular the work presented at the June 1997 AMS-IMS-SIAM workshop on graphical models.
4:30-5:00pm Knowledge = Concepts: A Harmful Equation
Jan M. Zytkow, Wichita State University and Polish Academy of Sciences, Poland
5:00-6:00pm Awards Presentation


The Workshop on Issues in the Integration of Data Mining and Data Visualization is also being held today.

8:30-9:45am Paper Session: Methodology I
Pacific Ballroom
8:30-8:55am A Probabilistic Approach to Fast Pattern Matching in Time Series Databases
Eamonn Keogh and Padhraic Smyth, University of California, Irvine
8:55-9:20am Discriminative vs Informative Learning
Y. Dan Rubinstein and Trevor Hastie, Stanford University
9:20-9:45am Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Distributions
Foster Provost and Tom Fawcett, NYNEX Science and Technology
9:45-10:15am Coffee Break
10:15am-12:30pm Paper Session: Methodology II/Invited Panel
Pacific Ballroom
10:15-10:40am Development of Multi-Criteria Metrics for Evaluation of Data Mining Algorithms
Gholamreza Nakhaeizadeh, Daimler-Benz AG, Germany and Alexander Schnabl, Technical University Vienna, Austria
10:40-11:05am A Visual Interactive Framework for Attribute Discretization
Ramesh Subramonian, Ramana Venkata, and Joyce Chen, Intel Corporation
11:05-11:30am Using General Impressions to Analyze Discovered Classification Rules
Bing Liu, Wynne Hsu, and Shu Chen, National University of Singapore, Singapore
11:30am-12:30pm Panel Discussion: Data Mining and the Web
Moderator: Usama Fayyad, Microsoft Research
Panelists: Susan Dumais, Bellcore, NJ; Ronen Feldman, Bar-Ilan University, Israel; Marti Hearst, Xerox PARC; Haym Hirsh, Rutgers University; Kamran Parsaye; Michael Pazzani, University of California, Irvine; and Evangelos Simoudis, IBM Almaden
The Web is becoming the premier source of information for a growing number of people, and Web-based systems are increasingly used for a variety of applications, from commercial transactions to organizing and carrying out a group's collaboration. Two challenges are predominant for data mining on the Web. The first is to help users find useful information on the Web and discover knowledge about a domain that is represented by a collection of Web documents. The second is to analyse the transactions run in a Web-based system, whether to optimize the system or to learn about the clients using it. This panel will discuss the approaches, opportunities, and problems of applying data mining techniques to the Web.
12:30-2:00pm Lunch
2:00-3:15pm Paper Session: Tools/Environments for DM
Pacific Ballroom
2:00-2:25pm An Interactive Visualization Environment for Data Exploration
Mark Derthick, John Kolojejchick, and Steven F. Roth, Carnegie Mellon University
2:25-2:50pm Visualization Techniques to Explore Data Mining Results for Document Collections
Ronen Feldman, Bar-Ilan University, Israel; Willi Klösgen, German National Research Center for Information Technology, Germany; Amir Zilberstein, Bar-Ilan University, Israel
2:50-3:15pm Anytime Exploratory Data Analysis for Massive Data Sets
Padhraic Smyth, University of California, Irvine and David Wolpert, IBM Almaden Research Center
3:15-4:00pm Open Forum
3:15-4:00pm Open Forum: Comments on KDD-97
