Content to Node: Self-translation Network Embedding
Jie Liu (Nankai University); Zhicheng He (Nankai University); Lai Wei (Nankai University); Yalou Huang (Nankai University)
This paper concerns the problem of network embedding (NE), whose aim is to learn low-dimensional representations for nodes in networks. Such dense vector representations offer great promises for many network analysis problems. However, existing NE approaches are still faced with challenges posed by the characteristics of complex networks in real-world applications. First, for many real-world networks associated with rich content information, previous NE methods tend to learn separated content and structure representations for each node, which requires a post-processing of combination. The empirical and simple combination strategies often make the final vector suboptimal. Second, the existing NE methods preserve the structure information by considering short and fixed neighborhood scope, such as the first- and/or the second-order proximities. However, it is hard to decide the scope of the neighborhood when facing a complex problem. To this end, we propose a novel sequence-to-sequence model based NE framework which is referred to as Self-Translation Network Embedding (STNE) model. With the sequences generated by random walks on a network, STNE learns the mapping that translates each sequence itself from the content sequence to the node sequence. On the one hand, the bi-directional LSTM encoder of STNE fuses the content and structure information seamlessly from the raw input. On the other hand, high-order proximity can be flexibly learned with the memories of LSTM to capture long-range structural information. By such self-translation from content to node, the learned hidden representations can be adopted as node embeddings. Extensive experimental results based on three real-world datasets demonstrate that the proposed STNE outperforms the state-of-the-art NE approaches. To facilitate reproduction and further study, we provide Internet access to the code and datasets\footnotehttp://dm.nankai.edu.cn/code/STNE.rar.