Abstract
Knowledge graphs have important applications in many computational tasks, such as personalized recommendation, information search, and natural language processing. Knowledge graph embedding, which learns representations of nodes and relations, is critical to these applications and has therefore been studied extensively in the literature. Most existing knowledge graphs (e.g., Freebase) suffer from data scarcity, i.e., the number of observed triplets is far smaller than the number of all possible pairs of nodes. While data augmentation has been widely applied to address data scarcity in other domains (e.g., image data), few prior studies have explored it for knowledge graph embedding, probably because the discrete structure of knowledge graphs rules out most data augmentation methods. To fill this research gap, this paper introduces a novel data augmentation framework, knowledge graph mixup (KG Mixup), to enhance knowledge graph embedding. Based on the proposed framework, we develop two specific methods: vanilla mixup and influence mixup. Both approaches generate virtual mixup triplets and incorporate them into the learning process through a new mixup loss function. While vanilla mixup generates virtual triplets based on a uniform distribution, influence mixup employs the influence function to guide the generation of mixup samples. Experiments on multiple datasets show that both approaches significantly outperform knowledge graph embedding models trained with the ordinary training framework.
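The abstract does not spell out how virtual triplets are built, so the following is only a minimal sketch of the general mixup idea applied to embedded triplets, not the paper's method. It assumes that mixing happens in embedding space as a convex combination, that the partner triplet and the mixing weight are both drawn uniformly, and that a TransE-style score is used for illustration. All function names here are hypothetical, and the paper's mixup loss and influence-guided variant are not reproduced.

```python
import random


def mix_vectors(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def mixup_triplet(triplet_a, triplet_b, lam):
    """Mix two embedded triplets (head, relation, tail) component by component."""
    return tuple(mix_vectors(a, b, lam) for a, b in zip(triplet_a, triplet_b))


def transe_score(triplet):
    """Illustrative TransE-style score: negative L1 distance between h + r and t.

    Assumption: the scoring model is chosen only for this example.
    """
    h, r, t = triplet
    return -sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def vanilla_mixup_batch(embedded_triplets, rng):
    """Build one virtual triplet per input triplet.

    Assumption: the partner is picked uniformly at random from the batch and
    the mixing weight is drawn from Uniform(0, 1).
    """
    virtual = []
    for triplet in embedded_triplets:
        partner = rng.choice(embedded_triplets)
        lam = rng.random()
        virtual.append(mixup_triplet(triplet, partner, lam))
    return virtual


# Two toy embedded triplets in which h + r equals t exactly.
triplet_a = ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
triplet_b = ([0.0, 2.0], [2.0, 0.0], [2.0, 2.0])

rng = random.Random(0)
virtual_triplets = vanilla_mixup_batch([triplet_a, triplet_b], rng)
```

In a training loop, such virtual triplets would be scored alongside the observed ones and fed into an additional loss term; the exact form of that term is defined in the paper and is left out here.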
Original language | English (US) |
---|---|
Pages (from-to) | 569-580 |
Number of pages | 12 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 36 |
Issue number | 2 |
State | Published - Feb 1 2024 |
Keywords
- Knowledge graph
- Data augmentation
- Knowledge graph completion
- Knowledge representation
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics