Embedding-based News Recommendation for Millions of Users
Shumpei Okura (Yahoo! JAPAN); Yukihiro Tagami (Yahoo Japan Corporation); Shingo Ono (Yahoo Japan Corporation); Akira Tajima (Yahoo! Japan)
For effective news recommendation, it is necessary to understand the content of articles and the preferences of users. While ID-based methods such as collaborative filtering and low-rank factorization are well-known approaches to recommendation, they are not suitable for news recommendation, because candidate articles expire quickly and are replaced by new ones within a short time span. Word-based approaches, often used in information retrieval settings, are good candidates in terms of system performance, but they face challenges such as coping with synonyms and orthographical variants and defining "queries" from users' historical activities. In this paper, we propose an embedding-based approach that uses distributed representations in an end-to-end manner: (i) start with distributed representations of articles based on a variant of a denoising autoencoder, (ii) generate user representations using a recurrent neural network (RNN) with browsing histories as input sequences, and (iii) match and list articles for each user based on inner-product operations, chosen with system performance in mind. The proposed method performed well in an offline evaluation using past access data from Yahoo! JAPAN's homepage. Based on this experimental result, we deployed it in the production system and compared its online performance with that of the word-based approach that had previously been used in the system. As a result, click-through rate (CTR) and total duration improved by 23% and 10%, respectively, compared with the word-based approach. Services incorporating the proposed method are already open to all users and provide recommendations to over ten million unique users per day, handling billions of accesses per month.
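The three steps above can be illustrated with a minimal sketch, which is not the system described in this paper. It assumes that article vectors are already available (here they are hand-written stand-ins for the output of the denoising autoencoder), replaces the trained RNN with a toy recurrent update that uses fixed scalar weights, and ranks candidates by inner product. All names (`rnn_user_vector`, `recommend`, the article IDs, and the weights `w_in` and `w_rec`) are hypothetical and chosen only for illustration.

```python
import math

def rnn_user_vector(history, w_in=0.5, w_rec=0.5):
    """Fold a browsing history (article vectors, oldest first) into one user vector.

    Toy recurrent update with fixed scalar weights (an assumption, not a trained model):
        h_t = tanh(w_in * a_t + w_rec * h_{t-1})
    """
    state = [0.0] * len(history[0])
    for article in history:
        state = [math.tanh(w_in * a + w_rec * h) for a, h in zip(article, state)]
    return state

def inner_product(u, v):
    """Relevance score between a user vector and an article vector."""
    return sum(x * y for x, y in zip(u, v))

def recommend(user_vec, candidates, k=2):
    """Return the IDs of the k candidate articles with the highest inner product."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: inner_product(user_vec, item[1]),
        reverse=True,
    )
    return [article_id for article_id, _ in ranked[:k]]

# Hand-written stand-ins for article embeddings (step i is not implemented here).
history = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]  # two previously read sports articles
candidates = {
    "sports_3": [0.7, 0.2, 0.1],
    "politics_1": [0.1, 0.9, 0.0],
    "tech_1": [0.0, 0.1, 0.9],
}

user_vec = rnn_user_vector(history)             # step (ii)
top_articles = recommend(user_vec, candidates)  # step (iii)
print(top_articles)
```

Because the final matching step is only an inner product, user vectors and article vectors can be computed separately and the scoring itself stays cheap, which is the property the abstract refers to when it mentions system performance.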