KDD-2000 Sixth ACM SIGKDD International Conference on High Performance Data Mining
Vipin Kumar, Mohammed Zaki
Abstract: A fundamental problem in data mining is to develop
algorithms and systems that scale with increases in the amount of data and
in the dimensionality and complexity of the data. Because of the huge size of the data and the amount of
computation involved in mining algorithms, parallel and distributed
processing is often considered an essential component of a successful data
mining solution. The goal of
this tutorial is to provide researchers, practitioners, and advanced students
with an introduction to high performance data mining. The focus will be on
algorithms, software tools, and system architectures appropriate for mining
massive data sets using techniques from scalable, parallel and distributed
computing. The tutorial will provide 1) an overview of fundamental parallel and distributed data mining algorithms covering common techniques like classification, associations, sequences, clustering, etc.; 2) an introduction to some of the basic architectural frameworks for high performance data mining systems; and 3) an understanding of some of the outstanding algorithmic and systems issues that arise when mining large data sets. With this knowledge, the audience should be better prepared to mine larger data sets in practice or undertake research in this area.

Biographies of Organizers:

Vipin Kumar is a Professor of Computer Science at the
University of Minnesota. His current research focuses on parallel computing
and data mining. His past research has produced highly efficient algorithms
and software such as Metis, hMetis, and PSPASES. He has authored over 100
research articles, and coedited or coauthored 5 books including the widely
used textbook ``Introduction to Parallel Computing''. Kumar serves on the editorial boards of
several prominent journals in parallel computing. He is a Fellow of IEEE and the Minnesota Supercomputer Institute,
and is a member of SIAM and ACM.

Mohammed J. Zaki is an Assistant Professor of
Computer Science at Rensselaer Polytechnic Institute. His research interests
include the design of efficient, scalable, and parallel algorithms and
systems for various data mining tasks.
He has published over 40 papers in this area, and he recently
co-edited the book, ``Large-scale Parallel Data Mining,'' LNAI
State-of-the-Art-Survey, Vol. 1759, Springer-Verlag, 2000. He was a co-chair
of the ACM SIGKDD Workshop on Large-scale Parallel KDD Systems (1999), and is a
co-chair of the IEEE IPDPS Workshop on High Performance Data Mining (2000). He
is a member of ACM and IEEE.