KDD Papers

Retrospective Higher-Order Markov Processes for User Trails

Tao Wu (Purdue University);David Gleich (Purdue University)


Users form trails as they browse the web, checkin with a geolocation, rate items, or consume media. A common problem is to estimate what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been one of the most widely used methods to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains suffer from overfitting due to their large numbers of parameters and the sparsity in the training data. Regularized fitting only offers mild improvements to the accuracy. In this paper we propose the retrospective higher-order Markov process (RHOMP) as a low-parameter model for such sequences. This model is a special case of a higher-order Markov chain where the transitions depend retrospectively on a single history state instead of an arbitrary combination of history states. There are two immediate computational advantages: the model complexity only grows linear with the order of the Markov chains and such model scales to large state spaces. Furthermore, by providing a specific structure to the higher-order chain, RHOMPs improve the model accuracy by efficiently utilizing history states without risks of overfitting the data. We demonstrate how to estimate a RHOMP from data and we demonstrate the effectiveness of our method on various real application datasets spanning geolocation data, review sequences, and locations. The RHOMP model uniformly outperforms higher-order Markov chains, Kneser-Ney regularization, and tensor factorizations in terms of prediction accuracy.