Revisiting Catastrophic Forgetting in Continual Knowledge Graph Embedding
Pith reviewed 2026-05-10 03:14 UTC · model grok-4.3
The pith
Entity interference from newly added knowledge graph entities causes the performance of current continual embedding methods to be overestimated by up to 25 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that entity interference, in which newly introduced entity embeddings displace previously correct predictions on old queries, is a distinct source of degradation in continual knowledge graph embedding that current methods and evaluation protocols ignore. Accounting for it demonstrates that existing CKGE techniques overestimate their resistance to forgetting, with the gap reaching 25 percent under substantial entity growth, and that different embedding models and mitigation strategies are affected unequally by interference versus conventional forgetting.
What carries the argument
Entity interference: the process by which embeddings trained on newly added entities cause the model to output those new entities as answers for queries involving only previously learned entities.
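The mechanism can be illustrated with a toy ranking, offered as a minimal sketch rather than anything taken from the paper. The entity names, the scores, and the `rank_of` helper are all hypothetical. The point is only that the correct old answer can lose rank when a new entity enters the candidate set with a higher score, even though no old score has changed.

```python
def rank_of(target, scores):
    """1-based rank of `target` when candidates are ordered by descending score."""
    target_score = scores[target]
    return 1 + sum(1 for e, s in scores.items() if e != target and s > target_score)

# Hypothetical scores for one old query (h, r, ?) whose correct answer is "old_1".
old_scores = {"old_1": 0.90, "old_2": 0.70, "old_3": 0.40}

# Hypothetical scores assigned to two newly added entities for the same query.
# The old scores above are left untouched on purpose.
new_scores = {"new_1": 0.95, "new_2": 0.30}
all_scores = {**old_scores, **new_scores}

rank_before = rank_of("old_1", old_scores)  # candidates: old entities only
rank_after = rank_of("old_1", all_scores)   # candidates: old and new entities

print(rank_before, rank_after)
```

In this sketch the drop in rank is caused entirely by `new_1`, which is the kind of effect that constraining updates to existing embeddings would not prevent.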
If this is right
- CKGE methods that only constrain updates to existing embeddings leave models vulnerable to interference from new entities.
- Evaluation protocols for continual KG embedding must test whether new entities are wrongly predicted on old queries.
- The size of reported performance gaps between methods can shift once interference is measured separately from standard forgetting.
- A forgetting metric that distinguishes entity interference from other sources gives a more accurate picture of method quality.
Where Pith is reading between the lines
- Future work could design mitigation strategies that explicitly regularize interactions between new and old entity embeddings rather than treating all new learning uniformly.
- The same interference pattern may appear in other continual embedding tasks where the set of items or labels expands over time, such as word embeddings or recommendation systems.
- Benchmarks that grow entities at different rates could be used to quantify how much interference scales with growth speed.
Load-bearing premise
The performance drops seen after new entities arrive are driven primarily by interference from their embeddings rather than by other unmeasured factors, and the corrected protocol isolates this effect without introducing fresh biases.
What would settle it
Re-running the benchmarks with the corrected protocol and finding that performance estimates for existing CKGE methods change by far less than 25 percent, or that the interference effect disappears, would show the claim does not hold.
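One way such a check could be scored is sketched below. This is an assumption-laden illustration: the summary does not say whether the 25 percent figure is relative or absolute, nor which metric it is computed on, so the sketch assumes mean reciprocal rank (MRR) and a relative gap measured against the corrected value. The function names and the rank lists are hypothetical.

```python
def mrr(ranks):
    """Mean reciprocal rank over a list of 1-based ranks."""
    return sum(1.0 / r for r in ranks) / len(ranks)

def relative_overestimation(uncorrected_ranks, corrected_ranks):
    """Relative gap between the two protocols (assumed definition, not the paper's)."""
    uncorrected = mrr(uncorrected_ranks)
    corrected = mrr(corrected_ranks)
    return (uncorrected - corrected) / corrected

# Hypothetical ranks for the same four old queries under each protocol.
uncorrected_ranks = [1, 1, 2, 4]
corrected_ranks = [1, 2, 2, 4]

gap = relative_overestimation(uncorrected_ranks, corrected_ranks)
print(f"{gap:.1%}")
```

Under this assumed definition, a gap that stays close to zero across benchmarks would count against the claim, while a gap that grows with the number of added entities would support it.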
Abstract
Knowledge Graph Embeddings (KGEs) support a wide range of downstream tasks over Knowledge Graphs (KGs). In practice, KGs evolve as new entities and facts are added, motivating Continual Knowledge Graph Embedding (CKGE) methods that update embeddings over time. Current CKGE approaches address catastrophic forgetting (i.e., the performance degradation on previously learned tasks) primarily by limiting changes to existing embeddings. However, we show that this view is incomplete. When new entities are introduced, their embeddings can interfere with previously learned ones, causing the model to predict them in place of previously correct answers. This phenomenon, which we call entity interference, has been largely overlooked and is not accounted for in current CKGE evaluation protocols. As a result, the assessment of catastrophic forgetting becomes misleading, and the performance of CKGE methods is systematically overestimated. To address this issue, we introduce a corrected CKGE evaluation protocol that accounts for entity interference. Through experiments on multiple benchmarks, we show that ignoring this effect can lead to performance overestimation of up to 25%, particularly in scenarios with significant entity growth. We further analyze how different CKGE methods and KGE models are affected by the different sources of forgetting, and introduce a catastrophic forgetting metric tailored to CKGE.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that current CKGE methods and evaluations focus narrowly on catastrophic forgetting via embedding changes to existing entities, but overlook 'entity interference' in which new-entity embeddings degrade predictions for old entities. This leads to systematic overestimation of CKGE performance (up to 25%) under standard link-prediction protocols, especially with entity growth. The authors introduce a corrected evaluation protocol that accounts for this interference, a CKGE-specific forgetting metric, and empirical results across multiple benchmarks showing how different methods and base KGE models are affected by the two sources of forgetting.
Significance. If the central claim and protocol hold, the work would meaningfully advance continual learning for knowledge graphs by identifying an additional, previously unmeasured source of performance degradation distinct from standard forgetting. The empirical scope across benchmarks and the introduction of a tailored metric are positive; reproducible code or parameter-free derivations are not mentioned.
major comments (2)
- [Abstract / Evaluation Protocol] Abstract and Evaluation section: the claim that the corrected protocol isolates entity interference (distinct from standard catastrophic forgetting) is load-bearing for the 25% overestimation result, yet the manuscript provides no explicit description of how the protocol modifies rankings or candidate sets to avoid conflating interference with metric artifacts from larger entity pools and shifted answer distributions in growing KGs.
- [Experiments] Experiments section: no ablation is described that demonstrates the performance gap vanishes under fixed-entity controls (no new entities) but reappears only when new-entity embeddings are present; without this, it remains possible that observed differences arise from test-distribution shifts rather than embedding-space interference.
minor comments (2)
- [Abstract] The abstract states experiments on 'multiple benchmarks' but gives no dataset names, sizes, or growth rates, making it difficult to assess the scope of the entity-growth scenarios.
- [Metric Definition] Notation for the new forgetting metric is introduced without an explicit equation or comparison to existing CKGE forgetting measures (e.g., average accuracy drop).
Simulated Author's Rebuttal
We thank the referee for their constructive and insightful comments, which help clarify key aspects of our evaluation protocol and experimental validation. We address each major comment point by point below and will incorporate the suggested clarifications and additions in the revised manuscript.
- Referee: [Abstract / Evaluation Protocol] Abstract and Evaluation section: the claim that the corrected protocol isolates entity interference (distinct from standard catastrophic forgetting) is load-bearing for the 25% overestimation result, yet the manuscript provides no explicit description of how the protocol modifies rankings or candidate sets to avoid conflating interference with metric artifacts from larger entity pools and shifted answer distributions in growing KGs.
  Authors: We agree that the current description of the corrected protocol in the Evaluation section could be more explicit to fully substantiate the isolation of entity interference. In the revised manuscript, we will add a dedicated subsection that precisely details how the protocol modifies candidate sets and rankings, specifically by restricting or reweighting predictions to exclude interference from newly introduced entities while preserving the original entity pool for old-entity evaluation. This will distinguish the interference effect from artifacts due to larger entity pools or shifted answer distributions, thereby strengthening the attribution of the overestimation of up to 25% to entity interference rather than evaluation mechanics.
  Revision: yes
- Referee: [Experiments] Experiments section: no ablation is described that demonstrates the performance gap vanishes under fixed-entity controls (no new entities) but reappears only when new-entity embeddings are present; without this, it remains possible that observed differences arise from test-distribution shifts rather than embedding-space interference.
  Authors: We acknowledge the value of this ablation for ruling out alternative explanations. While our experiments already compare CKGE performance across benchmarks with varying entity growth rates, we did not include an explicit fixed-entity control condition. In the revision, we will add a new ablation experiment that holds the entity set fixed (no new entities introduced) and demonstrates that performance gaps are negligible in this case but re-emerge when new-entity embeddings are allowed to interfere. This will provide direct evidence that the observed differences arise from embedding-space interference rather than test-distribution shifts alone.
  Revision: yes
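The fixed-entity control discussed in this exchange can be sketched on a single query. This is one possible reading of the control, not the authors' protocol: the dot-product scorer, the two-dimensional embeddings, the entity names, and the `rank` helper are all hypothetical. The sketch ranks the same old answer under three conditions so that the loss from drifting old embeddings can be read separately from the loss caused by new-entity candidates.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank(query_vec, target, embeddings, candidates):
    """1-based rank of `target` among `candidates` under a dot-product score."""
    target_score = dot(query_vec, embeddings[target])
    return 1 + sum(
        1 for e in candidates
        if e != target and dot(query_vec, embeddings[e]) > target_score
    )

# Hypothetical vector standing in for an old query (h, r, ?) with answer "a".
query = [1.0, 0.0]

# Snapshot 0: only old entities exist.
emb_t0 = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.2, 0.8]}
# Snapshot 1: old embeddings have drifted and a new entity "n" has been added.
emb_t1 = {"a": [0.6, 0.2], "b": [0.7, 0.4], "c": [0.2, 0.8], "n": [0.65, 0.0]}

old_entities = ["a", "b", "c"]
all_entities = old_entities + ["n"]

baseline = rank(query, "a", emb_t0, old_entities)    # before the update
fixed_pool = rank(query, "a", emb_t1, old_entities)  # after the update, old candidates only
grown_pool = rank(query, "a", emb_t1, all_entities)  # after the update, all candidates

drift_loss = fixed_pool - baseline          # attributable to changed old embeddings
interference_loss = grown_pool - fixed_pool # attributable to new-entity candidates

print(baseline, fixed_pool, grown_pool)
```

A control of this shape only separates the two effects at the level of candidate sets. It does not by itself rule out the larger-pool artifact raised in the first major comment, since any added candidate can lower a rank regardless of how its embedding was learned.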
Circularity Check
No significant circularity; empirical protocol and metric are independently motivated
The paper is an empirical study that identifies entity interference via benchmark experiments, proposes a corrected evaluation protocol to account for it, and reports up to 25% overestimation when ignored. No mathematical derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps exist in the provided text. The central claims rest on observable performance gaps across multiple KGs and methods rather than reducing to definitions or prior self-citations by construction. The evaluation protocol is presented as a practical correction derived from the observed phenomenon, not as a tautological re-expression of inputs.
Axiom & Free-Parameter Ledger
invented entities (1)
- entity interference: no independent evidence
G. Pons et al.: Preprint submitted to Elsevier.