KDD 2007 Conference - Workshop Papers Submission Information

Workshops

KDD-2007 LIST OF ACCEPTED WORKSHOPS

The following workshops have been accepted for KDD 2007. A description of each workshop and submission details can be found by clicking on each workshop link.

Full Day Workshops

Data Mining and Audience Intelligence for Advertising (ADKDD'07)
Data Mining in Bioinformatics (BIOKDD'07)
KDD Cup and Workshop 2007
Knowledge Discovery from Sensor Data
Privacy, Security, and Trust in KDD (PinKDD'07)
Web Mining and Social Network Analysis (WebKDD/SNA-KDD '07)

Half Day Workshops (Morning)

Data Mining Case Studies and Data Mining Practice Prize
Multimedia Data Mining
Mining Multiple Information Sources
Workshop and Challenge on Time Series Classification

Half Day Workshops (Afternoon)

Data Mining Standards, Services and Platforms
Domain Driven Data Mining (DDDM2007)

The First International Workshop on Data Mining and Audience Intelligence for Advertising (ADKDD'07)

Advertising is a half-trillion-dollar business, of which online advertising is a small but rapidly growing part. The dramatic growth in the number of participants in the online advertising marketplace (users, advertisers, publishers and market makers) has brought about large volumes of data and exciting data mining problems. A successful advertising system should benefit end users, advertisers and market makers through solutions to a wide range of research topics, from basic user profiling to, ultimately, user intent understanding. Research output from traditional advertising channels such as television, radio and print will also aid the advancement of online advertising.

The goal of this workshop is to encourage data mining and knowledge discovery researchers to take on the numerous challenges faced by the rapidly changing advertising industry. It aims to serve as a forum for researchers and industry practitioners to exchange the latest research results and best practices and to encourage future breakthroughs that may contribute back to general data mining research. The workshop will feature invited talks from noted experts in the subject areas as well as contributed talks from researchers on the latest data mining research for advertising. We encourage papers that propose novel data mining techniques in (but not restricted to) the following areas:

Auction theory and design for advertising
Search intent discovery for advertising
Audience Intelligence
Opinion/sentiment mining
Mining social networks and blogs
Behavioral targeting
Analysis for content-targeted advertising
Multimedia online advertisement
Spam detection in online advertisements
Techniques used for analysis, e.g. text mining, including named entity extraction, query classification, keyword extraction, and other topics
Web scale information extraction for online advertisement
Consumer privacy and data use policy; privacy preserving data mining approaches
Tracking effectiveness of advertisement campaigns

The URL for the workshop is http://adlab.microsoft.com/adkdd2007/.

Please contact Arun Surendran at acsuren@microsoft.com if you have any further questions or if you are unable to access the website.

7th International Workshop on Data Mining in Bioinformatics (BIOKDD'07)

The 7th International Workshop on Data Mining in Bioinformatics (BIOKDD '07) is a premier forum for academic and industrial researchers to disseminate the latest knowledge discovery and data mining (KDD) theories and practices relevant to bioinformatics, the computational science of managing, mining, and interpreting biological information. Post-genome advances in microarrays and mass spectrometry continue to provide bioinformatics researchers with a steady influx of functional genomics and proteomics data sets. The gap between biological data collection and knowledge curation also provides good motivation for KDD researchers to develop and apply novel computational techniques, tools, and systems in bioinformatics.

BIOKDD '07 will be modeled on our past successful BIOKDD workshops (2001-2006), each held as a full-day workshop in conjunction with the annual ACM SIGKDD conference. The goal of the workshop is to encourage bioinformatics researchers to take on the numerous challenges that bioinformatics offers from the KDD perspective. While tremendous progress has been made over the years, many of the fundamental problems in bioinformatics, such as protein structure-function elucidation, gene-gene and gene-environment interactions, and signaling pathway mappings, are still open. We encourage submissions of original research papers that describe novel biology domain-specific KDD methods and application software that result in significant understanding of current biology.

For further details on workshop paper submission deadlines, please visit http://bio.informatics.iupui.edu/biokdd07/.

KDD Cup and Workshop 2007

Co-organized by ACM SIGKDD and Netflix

KDD Cup is the first and the oldest data mining competition, and is an integral part of the annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). This year's KDD Cup competition tasks are related to (but different from) the current Netflix Prize competition. There will also be a workshop at the KDD-2007 conference, where the participants of both the KDD Cup and the current Netflix Prize competition will present their papers and exchange ideas. We are looking forward to an interesting competition and your participation.

There are 2 parallel options for participating:

The KDD Cup competition (open to all)
Workshop paper submissions (open to Netflix Prize participants only).

Full details are available at the KDD Cup and Workshop website:

http://www.cs.uic.edu/Netflix-KDD-Cup-2007

Co-Chairs

Jim Bennett, Netflix, USA
Charles Elkan, University of California, San Diego, USA
Bing Liu (Chair), University of Illinois at Chicago, USA
Padhraic Smyth, University of California, Irvine, USA
Domonkos Tikk, Budapest University of Technology and Economics, Hungary

ACM Workshop on Knowledge Discovery from Sensor Data

Description

Wide-area sensor infrastructures, remote sensors, and wireless sensor networks yield massive volumes of disparate, dynamic, and geographically distributed data. As such sensors are becoming ubiquitous, a set of broad requirements is beginning to emerge across high-priority applications including disaster preparedness and management, adaptability to climate change, national or homeland security, and the management of critical infrastructures. The raw data from sensors need to be efficiently managed and transformed into usable information through data fusion, which in turn must be converted to predictive insights via knowledge discovery, ultimately facilitating automated or human-induced tactical decisions or strategic policy based on decision sciences and decision support systems. The challenges for the Knowledge Discovery community are expected to be immense. On the one hand, dynamic data streams or events require real-time analysis methodologies and systems, while on the other hand centralized processing through high-end computing is required for generating offline predictive insights, which in turn can facilitate real-time analysis. In addition, emerging societal problems require knowledge discovery solutions that are designed to investigate anomalies, change, extremes and nonlinear processes, and departures from the normal.

This workshop seeks to bring together researchers from academia, government and the private sector who can contribute to the following broad areas:

A. Data Mining Techniques

Sensor data preprocessing, transformation, and sampling techniques
Scalable and distributed classification, prediction, and clustering algorithms

B. Offline Knowledge Discovery

Predictive analysis from geographically distributed heterogeneous data
Computationally efficient approaches for mining unusual patterns from massive and disparate spatio-temporal data

C. Online Knowledge Discovery

Real-time analysis of dynamic and distributed data
Mining continuous streams and ubiquitous data
Resource-aware algorithms for distributed mining
Real-time event detection and alarm generation algorithms

D. Decision and Policy Aids

Coordinated offline discovery and online analysis with feedback loops
Combination of knowledge discovery and decision scientific processes
Facilitation of faster and reliable tactical and strategic decisions

E. Case Studies

Success stories in national or global priority applications
Real-world problem design and knowledge discovery requirements

Sponsors:

Computational Sciences and Engineering Division (http://computing.ornl.gov/cse_home/), Oak Ridge National Laboratory (http://www.ornl.gov/)
European Project KDUbiq-WG3 (http://www.kdubiq.org/), Information Society Technology, European Union

Workshop URL:

http://www.ornl.gov/sci/knowledgediscovery/KDD-2007-Workshop/index.htm

The First ACM SIGKDD International Workshop on Privacy, Security, and Trust in KDD (PinKDD'07)

Abstract:

Given the vast amount of data that is collected by service providers and system administrators, and that is available in public information systems, data mining provides an ideal framework to assist in computer security and surveillance-related endeavors. However, the application of data mining to person-specific data raises serious concerns regarding data confidentiality and citizens' privacy rights. The problems are global, and many governments are struggling to set national and international policies on privacy and security for data mining endeavors. Ensuring privacy and security, as well as establishing trust, is essential for the provision of electronic and knowledge-based services in modern e-business, e-commerce, e-government, and e-health environments. Discussion often focuses on ethical and policy aspects of the problem, neglecting technology, and the resolution is usually polarized; e.g. an organization either (1) can share databases of personal information for data mining purposes or (2) cannot. Fortunately, computer scientists, and data mining researchers in particular, have recognized that technology can be constructed to support flexible solutions that enable data mining goals without sacrificing the privacy and security of the individuals to whom the data corresponds. To inject privacy into security and surveillance data mining projects, it is necessary to understand the goals of the latter. It is the goal of this workshop to cross-fertilize research on privacy, security, and how to resolve trust issues within a data mining framework that addresses technical and social viewpoints. We hope to attract interest from a wide range of data mining subareas, including: web mining, biomedical data mining, spatio-temporal data mining, ubiquitous knowledge discovery, and privacy-preserving data mining.
By supporting the development of privacy-aware data mining technology, this workshop will enable a wider social acceptance of a multitude of new services and applications based on the knowledge discovery process.

Link: http://www-kdd.isti.cnr.it/pinkdd07/

Joint 9th WebKDD and 1st SNA-KDD Workshop on Web Mining and Social Network Analysis 2007 (WebKDD/SNA-KDD '07)

In recent years, social network research has advanced significantly, thanks to the prevalence of online social websites and instant messaging systems. People increasingly perceive the Web as a social medium that fosters interaction among people, sharing of experiences and knowledge, group activities, and community formation and evolution. The social flair of the Web poses new challenges for the data mining community. The joint 9th WebKDD and 1st SNA-KDD workshop 2007 on Web Mining and Social Network Analysis aims to bring together practitioners and researchers with a specific focus on the emerging trends and industry needs associated with the traditional Web, the social Web, and other forms of social networking systems. The workshop solicits experimental and theoretical work on Web mining and social network analysis, including (1) data mining advances on the discovery and analysis of communities, on personalization for solitary activities (like search) and social activities (like discovery of potential friends), on the analysis of user behavior in open fora (like conventional sites, blogs and fora) and in commercial platforms (like e-auctions), and on the associated security and privacy-preservation challenges; (2) social network modeling, scalable and customizable social network infrastructure construction, temporal analysis of social network topologies, contextual social network analysis, large-scale graph algorithms, and identification and discovery of dynamic growth and evolution patterns using machine learning approaches or multi-agent based simulation.

In addition to paper presentations and depending on time limitations, we will solicit an invited talk or a panel that will stress the interdisciplinary challenges of the social Web.

Publication of the Workshop Proceedings: The workshop notes will be published in hard copy and distributed during the workshop. The full extended versions of the accepted papers will be published, pending approval, by Springer-Verlag after a second round of reviews.

For more information, please visit:

http://workshops.socialnetworkanalysis.info/websnakdd2007/

Second Workshop on Data Mining Case Studies and Data Mining Practice Prize

The Data Mining Case Studies Workshop was established to showcase the very best in data mining deployments. We continue Data Mining Case Studies with its second workshop, to be held at KDD2007. Data Mining Case Studies will highlight data mining implementations that have been responsible for a significant and measurable improvement in business operations, or an equally important scientific discovery, or some other benefit to humanity.

Examples of Data Mining Case Studies from previous years have included: (a) a medical application that has saved hundreds of lives by mining through hundreds of thousands of patient records to identify patients who show all the signs of heart disease, yet have not been prescribed heart medication, (b) a system which has uncovered hundreds of millions in sheltered tax evasion rings, (c) a system which has raised revenue by improved cross-selling of computer peripherals and equipment.

Data Mining Case Studies will allow papers greater latitude in (a) range of topics, (b) page length, (c) scope, (d) allowance for prior publication, (e) novelty. Papers on unsuccessful data mining systems that describe lessons learned and "war stories" will also be assessed.

The Data Mining Practice Prize (sponsored by Microsoft and Elder Research Inc) will be awarded for the best Data Mining Case Study submission. The prize will be awarded for work that has had a significant and quantifiable impact on the application to which it was applied, or has significantly benefited humanity.

Data Mining Case Studies Program Website: http://www.dataminingcasestudies.com/DMCS2007_Program.doc

Data Mining Standards, Services and Platforms

The DM-SSP Workshops provide a forum for those who are system developers, application developers, or integrators of data mining platforms and services. This year we will focus on emerging architectures for data mining platforms and services, with a particular emphasis on the following topics:

interoperability of data mining systems and services
distributed platforms for data mining
web service-based architectures for data mining
grid service-based architectures for data mining
real time data mining architectures
integration of data mining and workflow systems

This year's Workshop is half a day and will consist of a mixture of invited and contributed papers on these topics.

Developers of data mining systems, services and applications; researchers interested in data mining systems and services; and integrators and others who wish to deploy data mining systems in operational environments are all encouraged to participate.

This is the fourth year that the DM-SSP Workshop has been associated with the KDD conference.

Please submit contributed papers to dmsssp07 at opendatagroup.com by May 31, 2007. Papers should be 10 pages or fewer in length and formatted following the same guidelines that are used by the KDD 2007 Conference.

The workshop's web site is: http://www.opendatagroup.com/dmssp07.

KDD Workshop on Domain Driven Data Mining (DDDM2007)

1 Abstract

Existing data mining methodology and methods are predominantly data- and academia-oriented. As continuously pointed out by KDD panelists, it is a great challenge to mine real-world domain problems and discover actionable knowledge satisfying business needs. Domain-driven data mining targets the development of methodology and approaches tackling such challenges. Domain-driven data mining generally involves the following factors targeting actionable knowledge discovery in complex domain problems: (i) utilizing and mining domain expert intelligence, human involvement, domain intelligence, in-depth data intelligence, process, environment and social intelligence, and web intelligence; (ii) conducting meta-synthesis of the above intelligence for actionable knowledge discovery; (iii) developing and enhancing knowledge actionability and reliability, and seamless migration and methodology/system/interaction support into business. The 2007 International Workshop on Domain Driven Data Mining (DDDM2007) aims to provide a premier forum for sharing advanced innovation and smart information use in developing domain-driven data mining methodologies, techniques, and case studies, as well as insights on trends and controversies. DDDM2007 welcomes theoretical and applied disseminations that make efforts (1) to expose next-generation data mining methodology for actionable knowledge discovery, identifying how KDD techniques can better contribute to critical domain problems in theory and practice; (2) to uncover domain-driven data mining techniques, identifying how KDD can better strengthen business intelligence in complex enterprise applications; (3) to disclose the applications of domain-driven data mining, identifying how KDD can be effectively deployed to solve complex practical problems; and (4) to identify challenges and directions for future research and development in the dialogue between academia and business and seamless migration into the business world.
DDDM2007 will promote a paradigm shift in KDD research from data-driven hidden pattern mining to domain-driven actionable knowledge discovery. It will boost the seamless, reliable and actionable capability of research findings when deployed to tackle real-world challenges, so that they can satisfy business needs and support decision actions.

2 Workshop website

URL: http://datamining.it.uts.edu.au/dddm/

3 Contact

Dr. Longbing Cao (lbcao@it.uts.edu.au)

International Workshop on Multimedia Data Mining

Abstract:

"Multimedia information is ubiquitous and essential in many applications from homeland security to medicine and bioinformatics. As evidenced by the success of the previous editions of MDM/KDD, there is an increasing need for new techniques and tools that can detect and discover patterns in multimedia data that can lead to new knowledge. For example, tools are needed for discovering relationships between objects or segments within images, classifying images based on their content, extracting patterns in sound, categorizing speech and music, and recognizing and tracking objects in video streams. There is also an increasing interest in the real-time analysis of multimedia data generated by distributed sensory applications and ambient intelligence environments. MDM is a leading venue where researchers from both academia and industry can exchange and compare both relatively mature and greenhouse theories, methodologies, algorithms and frameworks for multimedia data mining. To address this aim, the workshop brings together experts in the analysis of digital media content, multimedia databases, multimedia information retrieval, and domain experts from different applied disciplines with potential in multimedia data mining and knowledge discovery. Like the previous editions, MDM 2007 will aim to facilitate cross-disciplinary exchange of ideas. Topics of interest include (but are not limited to) integrated mining of different data formats (text, speech, video, images, relational data); combining mining results from different sources; theoretical frameworks for multimedia data mining; topic and event detection in multimedia data; extracting semantics from multimedia databases; mining scientific multimedia data; man-machine interfaces for multimedia data mining; complexity, efficiency and scalability of multimedia data mining algorithms; data mining virtual communities and virtual worlds; and real-time multimedia data mining systems."

The URL for the MDM07 workshop is: http://aria.asu.edu/mdm07/.

Mining Multiple Information Sources

Abstract:

Recent developments in storage technology and network architectures have made it possible and affordable for scientific institutes, commercial enterprises, and government agencies to gather and store data from multiple sources. Increasing globalization has also demanded that many business applications store information at geographically distributed locations for analysis. Although the capability of distributed data storage brings us opportunities to improve the quality of data management and decision making, the nature of these distributed data repositories also generates significant challenges for inter-repository pattern discovery. Here, we list three major ones: (1) how to efficiently identify quality knowledge from a single data source, where patterns reveal local knowledge for each particular data repository, commonly referred to as local patterns; (2) how to integrate and unify multiple information sources into one single view such that previously unseen patterns can be discovered, commonly referred to as global patterns; and (3) how to discover the relationships of the patterns hidden across multiple information sources, where the features of the patterns (such as pattern frequencies and their utilities) across different data repositories define inter-repository relationships, which we refer to as inter patterns.

The aim of this workshop is to bring together data mining experts to revisit the problem of pattern discovery from multiple information sources, and to identify and synthesize current needs for such purposes. Representative topics to be addressed include but are not limited to: (1) mining from heterogeneous information sources; (2) local pattern analysis and fusion; (3) global pattern synthesizing and assessment; (4) inter pattern discovery and comparison; (5) security and privacy issues in multiple information sources; and (6) interactive data mining systems.

Workshop website: http://www.cse.fau.edu/~xqzhu/mmis/kdd07_mmis.html

Workshop and Challenge on Time Series Classification

Time series classification is useful in its own right in medical, scientific and business domains. Time series classification is also a useful subroutine in other algorithms, such as novelty detection and motif discovery. Each year, there are many papers on time series classification in SIGKDD, ICDM, ICML, SIGMOD and other venues. There are now at least 100 different methods for time series classification out there. But which method is best for what kind of data? This question is still an area of confusion and contention.

This workshop/competition hopes to go some way toward answering the above question and to stimulate new research in this important area.

The format of the workshop will be short paper presentations, followed by an extensive debriefing and open discussion on the competition results.

Please visit the link http://www.cs.ucr.edu/~eamonn/SIGKDD2007TimeSeries.html for more information on this workshop.

An archive of the KDD 2007 Workshops Call for Proposals can be found here.