Take home any knowledge you find.
|


All tutorials will be held in the Pacific Ballroom at the
Newport Beach Marriott Hotel and Tennis Club.
More detail on the
is available.

|
|
8:00am-6:00pm
|
Tutorials
Pacific Ballroom
|
|
8:00-10:00am
|
Tutorial 1
Data Mining and KDD: An Overview
Usama Fayyad, Microsoft Research and Evangelos Simoudis, IBM
|
|
10:00-10:30am
|
Break
|
|
10:30am-12:30pm
|
Tutorial 2
Modelling Data and Discovering Knowledge
David Hand, Open University, UK
|
Tutorial 3
Text Mining - Theory and Practice
Ronen Feldman, Bar-Ilan University, Israel
|
|
12:30-1:30pm
|
Lunch
|
|
1:30-3:30pm
|
Tutorial 4
Exploratory Data Analysis Using Interactive Dynamic Graphics
Deborah Swayne, Bell Communications Research and Dianne Cook, Iowa State
University
|
Tutorial 5
OLAP and Data Warehousing
Surajit Chaudhuri, Microsoft Research and Umesh Dayal, Hewlett Packard
Laboratories
|
|
3:30-4:00pm
|
Break
|
|
4:00-6:00pm
|
Tutorial 6
Visual Techniques for Exploring Databases
Daniel Keim, University of Munich, Germany
|
Tutorial 7
Statistical Models for Categorical Response Data
William DuMouchel, AT&T Research
|
|
6:00-6:30pm
|
Break
|
|
6:30-10:00pm
|
Opening Reception
California Ballroom
|
|
8:30-10:00am
|
Keynote Address and Invited Talk Session
Pacific Ballroom
|
|
8:30-8:45am
|
Welcome and Introduction
Ramasamy Uthurusamy (KDD-97 General Conference Chair),
David Heckerman, Heikki Mannila,
and Daryl Pregibon (KDD-97 Program Cochairs)
|
|
8:45-9:30am
|
Keynote Address: From Large to Huge: A Statistician's
Reactions to KDD & DM
Peter J. Huber, University of Bayreuth, Germany
The statistics and AI communities are confronted by the same challenge,
the onslaught of ever larger data collections, but the two communities
have reacted independently and differently. What could they learn from
each other if they looked over the fence? What is amiss on either side?
|
|
9:30-10:00am
|
Invited Talk: Machine Learning and KDD: New Developments
Thomas Dietterich, Oregon State University
Machine learning research is pursuing many directions relevant to KDD.
This talk will review two exciting lines of research: learning with
ensembles (committees, bagging, boosting, etc.) and learning with
stochastic models. I will also briefly mention current research in
reinforcement learning and methods for scaling up machine learning
algorithms.
|
|
10:00-10:30am
|
Coffee Break
|
|
10:30am-12:15pm
|
Keynote Address and Poster Previews Session
Pacific Ballroom and California Ballroom, respectively
|
|
10:30-11:15am
|
Keynote Address: Searching for Causes and Predicting Interventions
Clark Glymour, University of California at San Diego and Carnegie Mellon
University
Pacific Ballroom
The problem in many data mining tasks is to predict features of a new,
unobserved sample from a recorded sample, assuming the sampling distribution
does not change. In causal data mining the task is to predict the features
of new unobserved samples from a recorded sample, assuming any unobserved
sample results from an intervention which directly alters the underlying
probability distribution of some variables in a known way, and alters others
indirectly through the influences of the directly manipulated variables.
Data mining in the sciences and for business, military, or public policy is
often concerned with discovering causal structure, although that focus is
sometimes obscured by the belief that causation is especially mysterious.
Causation is not probability, but it is no more (and no less) mysterious.
This lecture will illustrate why data mining for causes using techniques
designed for recognition or classification is unwise, and will illustrate
principles of causal data mining with several interesting cases that use both
Bayesian and constraint-based methods.
|
|
11:15am-12:15pm
|
Poster Previews (Authors A-J) Session
California Ballroom
|
|
12:15-2:00pm
|
Lunch
|
Program Committee Luncheon
Room Newport North, lobby level
|
|
2:00-3:00pm
|
Poster Preview Session
California Ballroom
|
|
2:00-3:00pm
|
Poster Previews (Authors K-Z) Session
|
|
3:00-7:00pm
|
Exhibits, Demos, and Poster Sessions
California Ballroom
(Coffee and soft drinks will be available in the foyer from 3:00-7:00pm.)
|
|
3:00-4:20pm
|
Poster Session 1 (Authors A-G)
|
|
4:20-5:40pm
|
Poster Session 2 (Authors H-O)
|
|
5:40-7:00pm
|
Poster Session 3 (Authors P-Z)
|
|
8:30-9:45am
|
Paper Session: Database Methodology
Pacific Ballroom
|
|
8:30-8:55am
|
Computing Optimized Rectilinear Regions for Association Rules
Kunikazu Yoda, Takeshi Fukuda, Yasuhiko Morimoto, Shinichi
Morishita, and Takeshi Tokuyama, IBM Tokyo Research
Laboratory, Japan
|
|
8:55-9:20am |
Density-Connected Sets and their Application for Trend
Detection in Spatial Databases
Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei
Xu, University of Munich, Germany
|
|
9:20-9:45am |
Mining Association Rules with Item Constraints
Ramakrishnan Srikant, Quoc Vu, and Rakesh Agrawal, IBM
Almaden Research Center
|
|
9:45-10:15am |
Coffee Break
|
|
10:15am-12:30pm |
Keynote Address/Invited Talks/Panel Session
Pacific Ballroom
|
|
10:15-11:00am |
Keynote Address: Are Spatial Data Special: A Data Mining
Perspective?
Raymond Ng, University of British Columbia, Canada
From a data mining standpoint, I will discuss in this talk how spatial data
are similar to and different from conventional alphanumeric data. I will
use characteristic and discriminant rules as examples, and discuss how the
characteristics of spatial data affect the ways these rules can be found.
Finally, I will comment on the generalization from spatial data to discuss
whether temporal data or image data are special.
|
|
11:00-11:30am |
Invited Talk: Highlights from the Joint Statistical Meetings
Jim Hodges, University of Minnesota
Statisticians and data-miners have a lot in common but their paths tend not
to cross. Thus, for example, many statisticians are missing out on
computing and other ideas familiar to data-miners, and data-miners
sometimes re-invent things that are well-known to statisticians. This talk
is intended to toss a small bridge across the gap, by summarizing some
talks and trends from the 1997 Joint Statistical Meetings, held in Anaheim,
California just before KDD-97.
|
|
11:30am-12:30pm |
Panel Discussion: Validity of Data Mining Results
Moderator: David Hand, The Open University, United Kingdom
Panelists: Jim Hodges, University of Minnesota; Reza
Nakhaeizadeh, Daimler-Benz AG, Germany; Gregory
Piatetsky-Shapiro, Knowledge Stream Partners; Arno Siebes,
CWI, The Netherlands;
and Padhraic Smyth, University of California, Irvine
Data mining methods typically rely on large-scale search for patterns or
models. How can statistically valid results be ensured in this
search-intensive approach? What kind of statistical tests are appropriate?
How can one reduce overfitting? What techniques exist for efficient and
data-adaptive sampling strategies for specific classes of algorithms? How
can one combine statistical and non-statistical measures of the utility of
discovered patterns? These and other questions from the audience will be
discussed by the panelists.
|
|
12:30-2:00pm |
Lunch
|
|
2:00-3:30pm |
Paper Session: Applications
Pacific Ballroom
|
|
2:00-2:30pm |
Detecting Atmospheric Regimes Using Cross-Validated Clustering
Padhraic Smyth, University of California, Irvine; Michael
Ghil and Kayo Ide, University of California, Los Angeles;
Joe Roden, Jet Propulsion Laboratory, California Institute
of Technology; Andrew Fraser, Portland State University
|
|
2:30-3:00pm |
Automated Discovery of Active Motifs in Three Dimensional
Molecules
Xiong Wang and Jason T.L. Wang, New Jersey Institute of
Technology; Dennis Shasha, New York University;
Bruce Shapiro, National Cancer Institute; Sitaram
Dikshitulu, New Jersey Institute of Technology;
Isidore Rigoutsos, IBM T. J. Watson Research Center;
Kaizhong Zhang, University of Western Ontario, Canada
|
|
3:00-3:30pm |
JAM: Java Agents for Meta-Learning over Distributed Databases
Salvatore Stolfo, Andreas L. Prodromidis, Shelley Tselepis,
Wenke Lee, and Dave W. Fan, Columbia University; Philip K. Chan,
Florida Institute of Technology
|
|
3:30-4:00pm |
Coffee Break
|
|
4:00-6:00pm |
Invited Talk/Paper Presentation/Awards Presentation Session
Pacific Ballroom
|
|
4:00-4:30pm |
Invited Talk: Highlights from the Graphical Modeling Workshop
David Madigan, University of Washington
This talk will provide a very brief tutorial introduction to basic concepts
of graphical models. A survey of current research topics will follow,
highlighting in particular the work presented at the June 1997
AMS-IMS-SIAM workshop on graphical models.
|
|
4:30-5:00pm |
Knowledge = Concepts: A Harmful Equation
Jan M. Zytkow, Wichita State University and Polish Academy
of Sciences, Poland
|
|
5:00-6:00pm |
Awards Presentation
|
|
8:30-9:45am
|
Paper Session: Methodology I
Pacific Ballroom
|
|
8:30-8:55am |
A Probabilistic Approach to Fast Pattern Matching in Time
Series Databases
Eamonn Keogh and Padhraic Smyth, University of California,
Irvine
|
|
8:55-9:20am |
Discriminative vs Informative Learning
Y. Dan Rubinstein and Trevor Hastie, Stanford University
|
|
9:20-9:45am |
Analysis and Visualization of Classifier Performance:
Comparison under Imprecise Class and Cost Distributions
Foster Provost and Tom Fawcett, NYNEX Science and Technology
|
|
9:45-10:15am |
Coffee Break
|
|
10:15am-12:30pm |
Paper Session: Methodology II/Invited Panel
Pacific Ballroom
|
|
10:15-10:40am |
Development of Multi-Criteria Metrics for Evaluation of
Data Mining Algorithms
Gholamreza Nakhaeizadeh, Daimler-Benz AG, Germany and
Alexander Schnabl, Technical University Vienna, Austria
|
|
10:40-11:05am |
A Visual Interactive Framework for Attribute Discretization
Ramesh Subramonian, Ramana Venkata, and Joyce Chen, Intel
Corporation
|
|
11:05-11:30am |
Using General Impressions to Analyze Discovered
Classification Rules
Bing Liu, Wynne Hsu, and Shu Chen, National University of
Singapore, Singapore
|
|
11:30am-12:30pm |
Panel Discussion: Data Mining and the Web
Moderator: Usama Fayyad, Microsoft Research
Panelists: Susan Dumais, Bellcore, NJ;
Ronen Feldman, Bar-Ilan University, Israel; Marti Hearst, Xerox PARC; Haym
Hirsh, Rutgers University; Kamran Parsaye; Michael Pazzani,
University of California, Irvine; and Evangelos Simoudis,
IBM, Almaden.
The Web is becoming the premier source of information for a growing number
of people, and Web-based systems are increasingly used for a range of
applications, from commercial transactions to organizing and carrying out
group collaboration. Two goals are predominant for data mining on the Web.
The first is to help users find useful information on the Web and discover
knowledge about a domain that is represented by a collection of Web
documents. The second is to analyse the transactions run in a Web-based
system, whether to optimize the system or to find information about the
clients using it. This panel will discuss the approaches, opportunities,
and problems of applying data mining techniques to the Web.
|
|
12:30-2:00pm |
Lunch
|
|
2:00-3:15pm |
Paper Session: Tools/Environments for DM
Pacific Ballroom
|
|
2:00-2:25pm |
An Interactive Visualization Environment for Data Exploration
Mark Derthick, John Kolojejchick, and Steven F. Roth,
Carnegie Mellon University
|
|
2:25-2:50pm |
Visualization Techniques to Explore Data Mining Results for
Document Collections
Ronen Feldman, Bar-Ilan University, Israel; Willi Klösgen,
German National Research Center for Information Technology,
Germany; Amir Zilberstein, Bar-Ilan University,
Israel
|
|
2:50-3:15pm |
Anytime Exploratory Data Analysis for Massive Data Sets
Padhraic Smyth, University of California, Irvine and David
Wolpert, IBM Almaden Research Center
|
|
3:15-4:00pm |
Open Forum
|
|
3:15-4:00pm |
Open Forum: Comments on KDD-97
|