E-tail product return prediction via hypergraph-based local graph cut
Jianbo Li (Three Bridges Capital); Jingrui He (Arizona State University); Yada Zhu (IBM)
Recent decades have witnessed the rapid growth of E-commerce. In particular, E-tail has provided customers with great convenience by allowing them to purchase retail products anywhere without visiting the actual stores. A recent trend in E-tail is to allow free shipping and hassle-free returns to further attract online customers. However, a downside of such a customer-friendly policy is the rapidly increasing return rate as well as the associated costs of handling returned online orders. Therefore, it has become imperative to take proactive measures for reducing the return rate and the associated cost. Despite the large amount of data available from historical purchase and return records, up until now, the problem of E-tail product return prediction has not attracted much attention from the data mining community.
To address this problem, in this paper, we propose a generic framework for E-tail product return prediction named HyperGo. It aims to predict the customer’s intention to return after s/he has put together the shopping basket. For baskets with a high return intention, the E-tailers can then take appropriate measures to incentivize the customer not to issue a return and/or to prepare for reverse logistics. The proposed HyperGo is based on a novel hypergraph representation of historical purchase and return records, which effectively leverages the rich information of basket composition. For a given basket, we propose a local graph cut algorithm using truncated random walk on the hypergraph to identify similar historical baskets. Based on these baskets, HyperGo is able to estimate the return intention at two levels, the basket level and the product level, which provides the E-tailers with detailed information regarding the reason for a potential return (e.g., duplicate products with different colors). One major benefit of the proposed local algorithm lies in its time complexity, which is linearly dependent on the size of the output cluster and polylogarithmically dependent on the volume of the hypergraph. This makes HyperGo particularly suitable for processing large-scale data sets. The experimental results on multiple real-world E-tail data sets demonstrate the effectiveness and efficiency of HyperGo.
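To make the idea of a truncated random walk on a hypergraph concrete, the following is a minimal Python sketch, not the HyperGo algorithm itself. It assumes, purely for illustration, that products are vertices and historical baskets are hyperedges, that the walk is lazy with a fixed number of steps, and that similar baskets are picked by a simple top-k score rather than a graph cut. All function names, parameters (`steps`, `eps`, `k`), and the toy data are hypothetical.

```python
from collections import defaultdict


def build_incidence(baskets):
    """Map each product (vertex) to the set of baskets (hyperedges) containing it."""
    incidence = defaultdict(set)
    for basket_id, products in baskets.items():
        for product in products:
            incidence[product].add(basket_id)
    return incidence


def truncated_walk(baskets, incidence, seed_products, steps=3, eps=1e-3):
    """Lazy random walk from the query basket's products, dropping small mass.

    Assumed walk: stay put with probability 1/2; otherwise pick an incident
    basket uniformly, then a product in that basket uniformly.
    """
    seeds = [p for p in seed_products if p in incidence]
    if not seeds:
        return {}
    mass = {p: 1.0 / len(seeds) for p in seeds}
    for _ in range(steps):
        nxt = defaultdict(float)
        for product, m in mass.items():
            nxt[product] += 0.5 * m
            share = 0.5 * m / len(incidence[product])
            for basket_id in incidence[product]:
                members = baskets[basket_id]
                for other in members:
                    nxt[other] += share / len(members)
        # Truncation: discard vertices whose mass is small relative to degree,
        # which keeps the walk confined to a local neighborhood.
        mass = {p: m for p, m in nxt.items() if m >= eps * len(incidence[p])}
    return mass


def similar_baskets(baskets, mass, k=2):
    """Rank historical baskets by the average walk mass on their products."""
    scores = {}
    for basket_id, products in baskets.items():
        score = sum(mass.get(p, 0.0) for p in products) / len(products)
        if score > 0:
            scores[basket_id] = score
    return sorted(scores, key=scores.get, reverse=True)[:k]


def return_intention(returned, neighbors):
    """Basket-level estimate: fraction of similar baskets that were returned."""
    if not neighbors:
        return None
    return sum(returned[b] for b in neighbors) / len(neighbors)


# Hypothetical toy data: two baskets with duplicate shirts were returned.
baskets = {
    "b1": {"shirt_red", "shirt_blue"},
    "b2": {"shirt_red", "shirt_green"},
    "b3": {"socks", "belt"},
    "b4": {"belt", "hat"},
}
returned = {"b1": True, "b2": True, "b3": False, "b4": False}

incidence = build_incidence(baskets)
mass = truncated_walk(baskets, incidence, {"shirt_red", "shirt_blue"})
neighbors = similar_baskets(baskets, mass, k=2)
score = return_intention(returned, neighbors)
```

In this toy example the walk never leaves the shirt products, so only the two shirt baskets are candidates, which illustrates why the cost of a local method can depend on the size of the output cluster rather than on the whole hypergraph.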