Abstract
Information network embedding is an important way to enable efficient graph analytics. However, it still faces computational challenges in problems such as link prediction and node recommendation, particularly as networks grow in scale. Both hashing and quantization are promising approaches for accelerating these problems by orders of magnitude. In our preliminary work, we proposed learning binary codes for information networks, but graph analytics may suffer from large accuracy degradation under binarization. To reduce information loss while retaining memory and search efficiency, we further propose learning quantized codes for information networks. In particular, each node is represented by composing multiple latent vectors, each of which is optimally selected from a distinct set. Since (generalized) matrix factorization unifies several well-known embedding methods while preserving high-order proximity, we propose a Network Representation Lightening framework based on Matrix Factorization (NRL-MF) to learn binary and quantized codes. We also propose an alternating optimization algorithm for efficient parameter learning, even in the generalized matrix factorization case. Finally, we evaluate NRL-MF on four real-world information network datasets on the tasks of node classification and node recommendation. The results show that NRL-MF significantly outperforms competing baselines in both tasks, and that quantized representations indeed incur much smaller information loss than binarized codes.
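To make the quantized-code idea concrete, the sketch below illustrates how a node embedding can be composed from multiple latent vectors, each selected from a distinct codebook, in the product-quantization style the abstract describes. All names and sizes (`num_codebooks`, `codebook_size`, `dim`) are illustrative assumptions, not the paper's actual formulation or configuration.

```python
import numpy as np

# Minimal sketch of quantized node representations, assuming a
# product-quantization-style setup; the additive composition below
# is an assumption, not necessarily NRL-MF's exact operator.
num_codebooks = 4      # number of distinct sets of latent vectors
codebook_size = 256    # candidate vectors per set
dim = 128              # embedding dimensionality

rng = np.random.default_rng(0)
# Each codebook holds `codebook_size` latent vectors; exactly one
# vector is selected from each codebook per node.
codebooks = rng.standard_normal((num_codebooks, codebook_size, dim))

def node_embedding(codes: np.ndarray) -> np.ndarray:
    """Compose a node's embedding from its quantized codes.

    `codes` holds one index per codebook; the node is represented by
    the sum of the selected latent vectors, so only a few small
    integers need to be stored per node instead of a dense vector.
    """
    return sum(codebooks[m, codes[m]] for m in range(num_codebooks))

# A node is stored as 4 one-byte codes instead of 128 floats.
codes = rng.integers(0, codebook_size, size=num_codebooks)
embedding = node_embedding(codes)
print(codes, embedding.shape)
```

This storage layout is what yields the memory and search efficiency claimed above: lookups reduce to indexing small codebooks rather than scanning full-precision embeddings.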
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 5119-5131 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Knowledge and Data Engineering |
| Volume | 35 |
| Issue number | 5 |
| DOIs | |
| State | Published - May 1, 2023 |
Keywords
- Lightweight representation
- learning to hash
- matrix factorization
- network embedding
- quantization
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics