KDD Cup 2016: Whose papers are accepted the most: towards measuring the impact of research institutions

Finding influential nodes in a social network, whether to identify patterns or to maximize information diffusion, has been an actively researched area with many practical applications. In addition to the obvious value to the advertising industry, the research community has long sought mechanisms to effectively disseminate new scientific discoveries and technological breakthroughs so as to advance our collective knowledge and elevate our civilization. For students and parents planning their academic pursuits, and for funding agencies evaluating grant proposals, having an objective picture of the institutions in question is essential. Partly against this backdrop, releasing a yearly Research Institution or University Ranking has become a tradition for many popular newspapers, magazines and academic institutes. Such rankings not only attract attention from governments, universities, students and parents, but also spark debate about the scientific soundness of the rankings. The most criticized aspect is that the data used and the methodology employed are mostly unknown to the public.

The 2016 KDD Cup will address this very important problem through publicly available datasets such as the Microsoft Academic Graph (MAG), a freely available dataset that includes information on academic publications and citations. Being a heterogeneous graph, this dataset can be used to study influential nodes of various types, including authors, affiliations and venues; we choose to focus on affiliations in this competition. In effect, given a research field, we are challenging the KDD Cup community to jointly develop data mining techniques to identify the best research institutions based on their publications and how they are cited in research articles.
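
To make the task concrete, the following is a minimal sketch of a naive baseline: for a given research field, count each affiliation's papers and sum their citations, then sort. It is not the competition's evaluation metric, and it does not use the real MAG schema; the record layout, the `PAPERS` toy data and the `rank_affiliations` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (paper_id, field, affiliations, citation_count).
# These values are made up and do not come from the Microsoft Academic Graph.
PAPERS = [
    ("p1", "data mining", ["Univ A", "Univ B"], 10),
    ("p2", "data mining", ["Univ A"], 3),
    ("p3", "data mining", ["Univ C"], 25),
    ("p4", "databases", ["Univ B"], 7),
]


def rank_affiliations(papers, field):
    """Rank affiliations in one field by paper count, then by total citations."""
    stats = defaultdict(lambda: [0, 0])  # affiliation -> [paper count, citations]
    for _paper_id, paper_field, affiliations, citations in papers:
        if paper_field != field:
            continue
        # Count each affiliation once per paper, even if listed several times.
        for affiliation in set(affiliations):
            stats[affiliation][0] += 1
            stats[affiliation][1] += citations
    # Sort by papers (descending), citations (descending), then name for stability.
    return sorted(
        ((aff, count, cites) for aff, (count, cites) in stats.items()),
        key=lambda row: (-row[1], -row[2], row[0]),
    )


ranking = rank_affiliations(PAPERS, "data mining")
for affiliation, paper_count, citation_total in ranking:
    print(affiliation, paper_count, citation_total)
```

A simple count like this gives every co-authoring affiliation full credit for a paper; whether credit should instead be split among affiliations or authors is a design choice left open here.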

