This Call for Proposals invites industrial or academic institutions and non-profit organizations to submit proposals for organizing the 2022 KDD Cup competition. Since 1997, KDD Cup has been the premier annual data mining competition, held in conjunction with the ACM SIGKDD Conference on Knowledge Discovery and Data Mining. The competition is anticipated to last 3 months, and the winners will be notified by mid-June 2022. The winners will be honored at the KDD conference opening ceremony and will present their solutions at the KDD Cup workshop during the conference. We are looking for strong proposals that offer a novel and motivated goal, an interesting challenge, broad outreach to the data science community, a rigid and fair setup, a challenging yet manageable task, and accessibility to the KDD community. A broad, positive societal impact of the proposed problem is encouraged. Proposals should meet the following requirements:
- A novel and motivated problem. Of particular interest are tasks that address real-world problems, are novel to the data science community, and call for novel approaches. We are particularly looking for problems that differ from the typical machine learning challenges of recent years. Please include a specific section justifying how your proposed problem will encourage new research and push the field forward.
- A rigid and fair setup. The organizers should guarantee the availability of the data and the confidentiality of the test set (to prevent information leakage at all costs). The evaluation metrics should be both meaningful for the application at hand and statistically sound for objective comparison. A baseline should be established to show that non-trivial results can be achieved. An estimate of what constitutes a significant difference in performance will be much appreciated. We are interested in evaluation that speaks to how ML performance aligns with real-world impact. We would be particularly interested in how the winning solutions could be further evaluated in more realistic settings, and how the success of these approaches in practice could be discussed.
- A challenging yet manageable task. The task should be challenging in the sense that there is enough room for improvement over basic solutions, and novel ideas are required to succeed in the competition. The task should be manageable in about 3 months' time.
- Accessibility. The problem description should make the competition accessible to the majority of machine learning and data mining practitioners, who might not have significant prior domain knowledge or access to a large amount of computational infrastructure. The proposal should discuss how domain expertise can be factored in, or what simplifications are made to reduce the need for it.
- Proposals should cover all the important details, such as dates and the submission and evaluation of results, and describe the competition rules clearly. As a rule of thumb, prepare a proposal as close as possible to the version you would publish on the competition’s webpage.
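One way to produce the estimate of a significant performance difference requested under the rigid and fair setup requirement is a paired bootstrap over the test set. The sketch below is only an illustration and assumes a classification task scored by accuracy. The function name `paired_bootstrap_diff`, its parameters, and the sample data are hypothetical and are not part of any KDD Cup requirement.

```python
import random


def paired_bootstrap_diff(y_true, pred_a, pred_b,
                          n_resamples=2000, alpha=0.05, seed=0):
    """Estimate accuracy(A) - accuracy(B) with a percentile bootstrap interval.

    Returns (observed_diff, lower, upper). If the interval excludes 0,
    the gap between the two submissions is unlikely to be test-set noise.
    """
    rng = random.Random(seed)
    n = len(y_true)
    # Per-example score gap: +1 if only A is correct, -1 if only B is, else 0.
    per_example = [int(a == t) - int(b == t)
                   for t, a, b in zip(y_true, pred_a, pred_b)]
    observed = sum(per_example) / n

    # Resample test examples with replacement, keeping A and B paired.
    diffs = []
    for _ in range(n_resamples):
        sample = rng.choices(per_example, k=n)
        diffs.append(sum(sample) / n)
    diffs.sort()

    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return observed, lower, upper


if __name__ == "__main__":
    # Hypothetical labels and predictions, for illustration only.
    labels = [0, 1, 1, 0, 1, 1, 0, 0]
    submission_a = [0, 1, 1, 0, 1, 0, 0, 0]
    submission_b = [0, 1, 0, 0, 0, 1, 1, 0]
    print(paired_bootstrap_diff(labels, submission_a, submission_b))
```

The width of the resulting interval gives a rough sense of how large a gap between two submissions must be before it counts as meaningful. A real competition would substitute its own metric and a realistically sized test set.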
Please keep the proposal concise and strictly confidential. Please send your proposal in PDF format to email@example.com by the submission deadline, and follow the updates provided on the website.
Please use the following template for your proposal submission:
- Problem description. Describe the problem. Justify why this is an important and novel problem. In particular, please elaborate on how your proposed problem differs from the competitions of recent years. Additionally, please include a discussion of the broader impact of this problem. Please prepare some data samples or scenarios for your proposed problem. If you plan to include more than one track, please describe the unique value of each track.
- Evaluation. Describe how you plan to evaluate the submissions. We encourage you to think about how the evaluation aligns with real-world impact. We are particularly interested in whether additional evaluation of the winning submissions can be conducted in the real world after the competition.
- Timeline. Give the dates for the start of the competition, user registration, team formation, submission, evaluation, and notification. You can consider two rounds of submissions if suitable.
- Awards. We encourage you to think about awards beyond monetary prizes.
- Implementation Details:
  - Competition infrastructure. Which competition infrastructure do you plan to use (e.g., Kaggle, or your own)? Is the competition platform you chose equally accessible to participants all over the world?
  - Team work. Explain how the host will organize a team dedicated to the KDD Cup. For each team member, please list their roles, responsibilities, and commitment.
  - Are there any privacy concerns for the released data? Have you obtained the rights to release the data for the competition from your legal counsel? What type of report, presentation, and code will you require the winners to submit for the final winning solutions?
  - How would you handle Q&A and possible revisions during the competition?
  - To what extent have you explored this problem, and what is the baseline solution?
  - How do you plan to promote the competition?
- Host information:
  - Names, affiliations, email addresses, phone numbers, and short biographies of the organizers.
KDD Cup Chairs
Tim Weninger, University of Notre Dame
Jieping Ye, Beike
Karthik Subbian, Amazon