KDD Cup 2010: Student performance evaluation

Competition Rules

Conditions of participation: Anybody who complies with the rules of the challenge (KDD Cup 2010) is welcome to participate. Only the organizers are excluded from participating. The KDD Cup 2010 is part of the competition program of the Knowledge Discovery in Databases conference (KDD 2010), July 25-28 in Washington, DC. Participants are not required to attend the KDD Cup 2010 workshop, which will be held at the conference, and the workshop is open to anyone who registers. The proceedings of the competition will be published in a volume of the JMLR: Workshop and Conference Proceedings series.

Anonymity: All entrants must identify themselves by registering on the KDD Cup 2010 website. However, they may elect to remain anonymous by choosing a nickname and checking the box “Make my profile anonymous”. If this box is checked, only the nickname will appear in the Leaderboard instead of the real name. Participant emails will not appear anywhere on the website and will be used only by the organizers to communicate with the participants. To be eligible for prizes, the participants will have to publicly reveal their identity and uncheck the box “Make my profile anonymous”.

Teams: To register a team, register only the team leader and choose a nickname for your team. We’ll let you know later how to disclose the members of your team. We limit each team to one final entry. As an individual, you cannot enter under multiple names (this would be considered cheating and would disqualify you), nor can you participate under multiple teams. Multiple teams from the same organization are allowed, however, as long as each team leader is a different person and the teams do not intersect. During the development period, each team must have a different registered team leader. To be ranked in the challenge and qualify for prizes, each registered participant (individual or team leader) must disclose the names of any team members before the final results of the challenge are released. Hence, at the end of the challenge, you will have to choose the one team to which you want to belong. If a person participates in multiple teams, those teams will be disqualified. After the results are released, no change in team composition will be allowed. The declared team composition must match the list of co-authors in the proceedings, if the team decides to publish its results. Hence a professor cannot have his/her name on all of his/her students’ papers (but can be thanked in the acknowledgments).

A team can be either a student team (eligible for student-team prizes) or not a student team (eligible for travel awards). In a student team, a professor should be cited appropriately, but in the spirit of the competition, student teams should consist primarily of student work. We will ask participants to state whether they are a student team before the end of the competition.

Data: Data are available for download from the Data page to registered participants. Each data set is available as a separate archive to facilitate downloading. For viewing accuracy on the Leaderboard, participants may enter results on either or both development and challenge data sets, but results on the development data sets will not count toward the final evaluation.

Challenge duration: The challenge is about 2 months in duration (April 1 - June 8, 2010). To be eligible for prizes, final submissions must be received by June 8 11:59pm EDT (GMT-4).

On-line feedback: On-line feedback is available through the upload results page and Leaderboard.
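For participants outside the Eastern time zone, the final-submission deadline stated above can be converted to UTC. The following is a minimal sketch, not part of the official rules: the helper name `is_before_deadline` is hypothetical, and treating 11:59:00pm as the exact cutoff is an assumption, since the rules do not specify seconds.

```python
from datetime import datetime, timedelta, timezone

# EDT is four hours behind GMT/UTC, as stated in the rules.
EDT = timezone(timedelta(hours=-4), name="EDT")

# Final-submission deadline from the rules: June 8, 2010, 11:59pm EDT.
# Assumption: the cutoff is taken as 23:59:00 exactly.
DEADLINE = datetime(2010, 6, 8, 23, 59, tzinfo=EDT)


def is_before_deadline(submitted_at: datetime) -> bool:
    """Return True if a timezone-aware timestamp is at or before the deadline."""
    if submitted_at.tzinfo is None:
        raise ValueError("submitted_at must be timezone-aware")
    return submitted_at <= DEADLINE


# The same instant expressed in UTC.
deadline_utc = DEADLINE.astimezone(timezone.utc)
print(deadline_utc.isoformat())  # 2010-06-09T03:59:00+00:00
```

In other words, the deadline falls at 03:59 on June 9 in UTC. The time recorded by the submission server is what counts, so this conversion is only a planning aid.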

Submission method: Submissions are made via the form on the Upload page. To be ranked, submissions must include results on only the test portion of the challenge or development data sets. Results on the development data sets will not count as part of the competition. Multiple submissions are allowed.

Evaluation and ranking: For each team, only the last valid entry made by the team leader will count towards determining the winner. Valid entries must include results on both challenge data sets. The method of scoring is described on this page.
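The selection rule above (the last valid entry made by the team leader, covering both challenge data sets) can be sketched as follows. This is an illustration only, not the organizers' implementation: the `Submission` record, the `final_entry` function, and the data set labels `challenge_a` and `challenge_b` are all hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, Optional

# Hypothetical labels for the two challenge data sets.
CHALLENGE_SETS = frozenset({"challenge_a", "challenge_b"})


@dataclass(frozen=True)
class Submission:
    team: str                 # team nickname
    submitted_by: str         # who uploaded the entry
    submitted_at: datetime    # upload time
    datasets: FrozenSet[str]  # data sets the entry has results for


def final_entry(
    submissions: Iterable[Submission], team: str, leader: str
) -> Optional[Submission]:
    """Return the team's last valid entry, or None if there is none.

    An entry is valid here if the team leader made it and it includes
    results on both challenge data sets.
    """
    valid = [
        s
        for s in submissions
        if s.team == team
        and s.submitted_by == leader
        and CHALLENGE_SETS <= s.datasets
    ]
    # The most recent valid entry is the one that counts.
    return max(valid, key=lambda s: s.submitted_at, default=None)
```

Under this reading, a later entry that covers only one challenge data set does not replace an earlier complete one, so the last complete entry from the team leader is the one that is ranked.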

Reproducibility: Participation is not conditioned on delivering code or publishing methods. However, we will ask the top-ranking participants to voluntarily fill out a fact sheet about their methods, contribute papers to the proceedings, and help in reproducing their results.

Prizes: Thanks to our sponsors, Facebook, Elsevier, and IBM Research, we will be offering the following prizes to student teams:

Prize amounts increased on April 23, 2010:

First place: $5500

Second place: $3000

Third place: $1500

The Pittsburgh Science of Learning Center (PSLC) will provide the following travel awards to cover expenses related to attending the KDD Cup 2010 workshop at the KDD conference:

Overall first place: $1700

Overall second place: $1150

Overall third place: $650

Student first place: $1700

Student second place: $1150

Student third place: $650

