KDD Cup 2005: Internet user search query categorization

Can we submit a separate solution for each evaluation criterion (award)?

Yes. Your compressed submission file may contain multiple files, one for each solution. In this case, you must clearly specify which file is for which award. However, your solution dedicated to precision must be among the top 10 F1 scores to be a candidate for the "Query Categorization Precision Award". In the algorithm description, you must also clearly specify which algorithm was used for which award. Note that each participating team may submit only one solution for each of the "Query Categorization Precision Award" and the "Query Categorization Performance Award". The "Query Categorization Creativity Award" will be based on the description of the algorithm(s) used for the other two awards.
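The eligibility rule for the precision award can be illustrated with a minimal sketch. It assumes F1 is the standard harmonic mean of precision and recall; how the organizers aggregate scores across queries or labelers is not stated here, and the function names, the input format, and the tie handling are hypothetical.

```python
def f1_score(precision, recall):
    """Standard F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_award_candidates(submissions, top_n=10):
    """Return submissions eligible for the precision award, best precision first.

    submissions: dict mapping a submission name to a (precision, recall) pair.
    Only the top_n submissions by F1 are eligible (top_n=10 per the rule above).
    Ties are broken by sort order, which is an assumption of this sketch.
    """
    ranked_by_f1 = sorted(
        submissions, key=lambda name: f1_score(*submissions[name]), reverse=True
    )
    eligible = ranked_by_f1[:top_n]
    return sorted(eligible, key=lambda name: submissions[name][0], reverse=True)
```

A submission with very high precision but very low recall can therefore fall outside the top F1 scores and not be considered for the precision award at all.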

Are we allowed to use external sources (e.g. documents from directories) to increase the knowledge of the classifier?

Yes. You decide which methodology or resources to use to classify the queries. There is no restriction on what data you can or cannot use to build your models.

How should we label trash queries or non-English queries?

The evaluation set contains only valid English queries, so non-English and trash queries will not be evaluated. Participants may have their systems return no labels for such queries.

Do I have to submit an algorithm description?

Yes, you need to submit an algorithm description. If you do not want to share the details of your techniques, you can give just a high-level description of your approach; please indicate "a brief summary" at the beginning of your description. In this case, you will not be considered for the "Query Categorization Creativity Award".

How will the evaluation query set be selected?

1. The queries for evaluation will be selected randomly.
2. Foreign-language queries, trash queries, and queries with improper content will be dropped from the evaluation set during the selection process.

