TY - GEN
T1 - High-order proximity preserving information network hashing
AU - Lian, Defu
AU - Zheng, Kai
AU - Cao, Longbing
AU - Zheng, Vincent W.
AU - Tsang, Ivor W.
AU - Ge, Yong
AU - Xie, Xing
N1 - Funding Information:
Defu Lian is supported by the National Natural Science Foundation of China (Grant No. 61502077 and 61631005) and the Fundamental Research Funds for the Central Universities (Grant No. ZYGX2016J087). Kai Zheng is supported by the National Natural Science Foundation of China (Grant No. 61532018 and 61502324). Ivor Tsang is supported by the Australian Research Council (Grant No. FT130100746, DP180100106 and LP150100671). Vincent Zheng is supported by the National Research Foundation, Prime Minister's Office, Singapore, under its Campus for Research Excellence and Technological Enterprise (CREATE) programme, and by the Alibaba Innovative Research program.
Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/7/19
Y1 - 2018/7/19
N2 - Information network embedding is an effective way for efficient graph analytics. However, it still faces computational challenges in problems such as link prediction and node recommendation, particularly as networks grow in scale. Hashing is a promising approach for accelerating these problems by orders of magnitude. However, no prior studies have focused on seeking binary codes for information networks that preserve high-order proximity. Since matrix factorization (MF) unifies and outperforms several well-known embedding methods with high-order proximity preserved, we propose an MF-based Information Network Hashing (INH-MF) algorithm to learn binary codes that preserve high-order proximity. We also suggest Hamming subspace learning, which updates only part of the binary codes at a time, to scale up INH-MF. We finally evaluate INH-MF on four real-world information network datasets with respect to the tasks of node classification and node recommendation. The results demonstrate that INH-MF performs significantly better than competing learning-to-hash baselines in both tasks and, surprisingly, outperforms network embedding methods, including DeepWalk, LINE and NetMF, in the task of node recommendation. The source code of INH-MF is available online.
AB - Information network embedding is an effective way for efficient graph analytics. However, it still faces computational challenges in problems such as link prediction and node recommendation, particularly as networks grow in scale. Hashing is a promising approach for accelerating these problems by orders of magnitude. However, no prior studies have focused on seeking binary codes for information networks that preserve high-order proximity. Since matrix factorization (MF) unifies and outperforms several well-known embedding methods with high-order proximity preserved, we propose an MF-based Information Network Hashing (INH-MF) algorithm to learn binary codes that preserve high-order proximity. We also suggest Hamming subspace learning, which updates only part of the binary codes at a time, to scale up INH-MF. We finally evaluate INH-MF on four real-world information network datasets with respect to the tasks of node classification and node recommendation. The results demonstrate that INH-MF performs significantly better than competing learning-to-hash baselines in both tasks and, surprisingly, outperforms network embedding methods, including DeepWalk, LINE and NetMF, in the task of node recommendation. The source code of INH-MF is available online.
KW - Hamming Subspace Learning
KW - Information Network Hashing
KW - Matrix Factorization
UR - http://www.scopus.com/inward/record.url?scp=85051461356&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051461356&partnerID=8YFLogxK
U2 - 10.1145/3219819.3220034
DO - 10.1145/3219819.3220034
M3 - Conference contribution
AN - SCOPUS:85051461356
SN - 9781450355520
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1744
EP - 1753
BT - KDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
Y2 - 19 August 2018 through 23 August 2018
ER -