KDD Papers

Large-scale Collaborative Ranking in Near-Linear Time

Liwei Wu (University of California, Davis); Cho-Jui Hsieh (University of California, Davis); James Sharpnack (University of California, Davis)


Abstract

In this paper, we consider the Collaborative Ranking (CR) problem for recommendation systems. Given a set of pairwise preferences between items for each user, collaborative ranking can be used to rank un-rated items for each user, and this ranking can be naturally used for recommendation. It has been observed that collaborative ranking algorithms usually achieve better recommendations since they directly minimize the ranking loss; however, they are rarely used in practice due to poor scalability. All existing CR algorithms have time complexity at least O(|\Omega|r) per iteration, where r is the target rank and |\Omega| is the number of pairs, which grows quadratically with the number of ratings per user. For example, the Netflix data contains a total of 20 billion rating pairs, and at this scale all current algorithms have to work on subsamples, resulting in poor prediction on testing data. In this paper, we propose a new collaborative ranking algorithm called Primal-CR that reduces the time complexity to O(|\Omega|+d_1 \bar{d}_2 r), where d_1 is the number of users and \bar{d}_2 is the average number of items rated by a user. Note that d_1 \bar{d}_2 is strictly smaller, and often much smaller, than |\Omega|. Furthermore, by exploiting the fact that most data is in the form of numerical ratings instead of pairwise comparisons, we propose Primal-CR++ with O(d_1\bar{d}_2 (r+ \log \bar{d}_2 )) time complexity. Both algorithms have better theoretical time complexity than existing approaches and also outperform existing approaches in terms of NDCG and pairwise error on real data sets. To the best of our knowledge, we are the first in the collaborative ranking setting to apply an algorithm to the full Netflix dataset using all 20 billion rating pairs, and this leads to a model with much better recommendations than previous models trained on subsamples.
Finally, compared with the classical matrix factorization algorithm, which also requires O(d_1 \bar{d}_2 r) time, our algorithm has almost the same efficiency while making much better recommendations since it considers the ranking loss.
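The scalability gap between |\Omega| and d_1 \bar{d}_2 can be seen with a quick back-of-the-envelope sketch. The user and item counts below are illustrative placeholders, not the paper's exact figures: each user's m ratings induce m(m-1)/2 pairwise comparisons, so the pair count grows quadratically in m while the rating count grows only linearly.

```python
# Illustrative sketch (numbers are hypothetical, not from the paper):
# why |Omega| grows quadratically with ratings per user, while the
# d_1 * d_bar_2 term in Primal-CR grows only linearly.

def num_pairs(m):
    """Pairwise comparisons induced by one user's m ratings: m choose 2."""
    return m * (m - 1) // 2

# Hypothetical Netflix-like scale: d_1 users, d_bar_2 ratings per user.
d1, d2_bar = 480_000, 200

omega = d1 * num_pairs(d2_bar)   # total comparison pairs |Omega|
linear_term = d1 * d2_bar        # total ratings, d_1 * d_bar_2

print(f"|Omega|        = {omega:,}")        # 9,552,000,000 pairs
print(f"d_1 * d_bar_2  = {linear_term:,}")  # 96,000,000 ratings
```

With these illustrative numbers, the pair count is roughly 100x the rating count, which is why an O(|\Omega|r)-per-iteration method becomes impractical while an O(d_1 \bar{d}_2 r) method stays comparable to plain matrix factorization.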


Comments