Tensors are a natural representation for multidimensional data. In recent years, CANDECOMP/PARAFAC (CP) decomposition, one of the most popular tools for analyzing multi-way data, has been extensively studied and widely applied. However, today’s datasets are often dynamically changing over time. Tracking the CP decomposition for such dynamic tensors is a crucial but challenging task, due to the large scale of the tensor and the velocity of new data arriving. Traditional techniques, such as Alternating Least Squares (ALS), cannot be directly applied to this problem because of their poor scalability in terms of time and memory. Additionally, existing online approaches have only partially addressed this problem and can only be deployed on third-order tensors. To fill this gap, we propose an efficient online algorithm that can incrementally track the CP decompositions of dynamic tensors with an arbitrary number of dimensions. In terms of effectiveness, our algorithm demonstrates comparable results with the most accurate algorithm, ALS, whilst being computationally much more efficient. Specifically, on small and moderate datasets, our approach is tens to hundreds of times faster than ALS, while for large-scale datasets, the speedup can be more than 3,000 times. Compared to other state-of-the-art online approaches, our method shows not only significantly better decomposition quality, but also better performance in terms of stability, efficiency and scalability.

Filed under: Dimensionality Reduction | Big Data