
arxiv: 2604.19401 · v1 · submitted 2026-04-21 · 💻 cs.LG · cs.AI

Revisiting Catastrophic Forgetting in Continual Knowledge Graph Embedding

Pith reviewed 2026-05-10 03:14 UTC · model grok-4.3

keywords: catastrophic forgetting, continual learning, knowledge graph embeddings, entity interference, evaluation protocols, knowledge graphs, machine learning, benchmarking

The pith

Entity interference from newly added knowledge graph entities causes the performance of current continual embedding methods to be overestimated by up to 25 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Continual knowledge graph embedding updates models as facts and entities are added over time, but standard approaches focus only on stopping changes to old embeddings to avoid forgetting. The paper identifies a separate problem: embeddings for newly added entities can interfere with old ones, causing the model to incorrectly predict new entities on queries that should return old answers. Existing evaluation protocols do not check for this interference, so reported performance on prior tasks looks better than it is. Experiments across benchmarks show the overestimation reaches 25 percent when entity growth is high. The authors supply a corrected protocol plus a forgetting metric designed specifically for this setting.

Core claim

The central claim is that entity interference, in which newly introduced entity embeddings displace previously correct predictions on old queries, is a distinct source of degradation in continual knowledge graph embedding that current methods and evaluation protocols ignore. Accounting for it demonstrates that existing CKGE techniques overestimate their resistance to forgetting, with the gap reaching 25 percent under substantial entity growth, and that different embedding models and mitigation strategies are affected unequally by interference versus conventional forgetting.

What carries the argument

Entity interference: the process by which embeddings trained on newly added entities cause the model to output those new entities as answers for queries involving only previously learned entities.
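
A toy numerical sketch may help make this definition concrete. It assumes TransE-style scoring (the distance ||h + r - t||, the model used in Figure 2) and hand-picked 2-D embeddings that are purely hypothetical; the helper names `transe_scores` and `rank_of` are illustrative and do not come from the paper. The old embeddings are never modified, yet the rank of the correct answer still drops once a new entity is added to the candidate pool.

```python
import numpy as np

def transe_scores(head, relation, candidates):
    """TransE distance ||h + r - t|| for every candidate tail (lower is better)."""
    return np.linalg.norm(head + relation - candidates, axis=1)

def rank_of(target_index, scores):
    """1-based rank of the target among all scored candidates."""
    return int(np.sum(scores < scores[target_index])) + 1

# Hypothetical embeddings, chosen by hand for illustration only.
head = np.array([0.0, 0.0])
relation = np.array([1.0, 0.0])
old_entities = np.array([
    [1.0, 0.2],    # index 0: correct tail for the old query (h, r, ?)
    [0.0, 1.0],    # index 1: unrelated old entity
    [-1.0, 0.0],   # index 2: unrelated old entity
])
# A new entity learned in a later snapshot that lands closer to h + r.
new_entity = np.array([[1.0, 0.05]])

# Rank of the correct tail when only old entities are candidates.
rank_before = rank_of(0, transe_scores(head, relation, old_entities))

# Same old embeddings, but the new entity now competes as a candidate.
all_entities = np.vstack([old_entities, new_entity])
rank_after = rank_of(0, transe_scores(head, relation, all_entities))

print(rank_before, rank_after)
```

In this sketch the degradation comes entirely from the added candidate, which is why constraining updates to old embeddings alone would not prevent it.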

If this is right

  • CKGE methods that only constrain updates to existing embeddings leave models vulnerable to interference from new entities.
  • Evaluation protocols for continual KG embedding must test whether new entities are wrongly predicted on old queries.
  • The size of reported performance gaps between methods can shift once interference is measured separately from standard forgetting.
  • A forgetting metric that distinguishes entity interference from other sources gives a more accurate picture of method quality.
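
The second bullet can be illustrated with a minimal sketch. This is not the paper's protocol, whose exact mechanics are not described on this page; it only assumes that one evaluation ranks old answers against old entities alone, while the other ranks them against every entity the model currently knows. The score matrix, the `mrr` helper, and the old/new split are all hypothetical, and the usual filtering of other known true triples is omitted for brevity.

```python
import numpy as np

def mrr(score_matrix, answer_idx, candidate_mask):
    """Mean reciprocal rank where only masked-in candidates compete.

    score_matrix[q, e] is a distance-style score (lower is better)
    for query q and candidate entity e.
    """
    reciprocal_ranks = []
    for q, answer in enumerate(answer_idx):
        scores = score_matrix[q]
        # Count allowed candidates that score strictly better than the answer.
        better = (scores < scores[answer]) & candidate_mask
        reciprocal_ranks.append(1.0 / (1 + int(better.sum())))
    return float(np.mean(reciprocal_ranks))

# Hypothetical scores for 3 old queries over 5 entities.
# Entities 0-2 are old; entities 3 and 4 arrived in a later snapshot.
scores = np.array([
    [0.10, 0.90, 0.80, 0.05, 0.70],  # answer 0, outscored by new entity 3
    [0.90, 0.20, 0.80, 0.60, 0.70],  # answer 1, no interference
    [0.90, 0.80, 0.30, 0.10, 0.20],  # answer 2, outscored by new entities 3 and 4
])
answers = [0, 1, 2]
is_old = np.array([True, True, True, False, False])
everyone = np.ones(5, dtype=bool)

mrr_old_only = mrr(scores, answers, is_old)      # new entities cannot be predicted
mrr_all = mrr(scores, answers, everyone)         # new entities compete as candidates
print(mrr_old_only, mrr_all, mrr_old_only - mrr_all)
```

The difference between the two numbers is the part of the degradation that an old-candidates-only evaluation would never see.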

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Future work could design mitigation strategies that explicitly regularize interactions between new and old entity embeddings rather than treating all new learning uniformly.
  • The same interference pattern may appear in other continual embedding tasks where the set of items or labels expands over time, such as word embeddings or recommendation systems.
  • Benchmarks that grow entities at different rates could be used to quantify how much interference scales with growth speed.

Load-bearing premise

The performance drops seen after new entities arrive are driven primarily by interference from their embeddings rather than by other unmeasured factors, and the corrected protocol isolates this effect without introducing fresh biases.

What would settle it

Re-running the benchmarks with the corrected protocol and finding that performance estimates for existing CKGE methods change by far less than 25 percent, or that the interference effect disappears, would show the claim does not hold.

Figures

Figures reproduced from arXiv: 2604.19401 by Anna Queralt, Besim Bilalli, Carlos Escolano, Gerard Pons.

Figure 1: UMAP representation of the embedding space of the ENTITY dataset [3] for the entities introduced in the first (blue) and second (red) snapshots.
Excerpt from the surrounding text: … this phenomenon for a widely used CKGE benchmark dataset [3], where the embeddings for previously learned entities (blue) and for the newly introduced ones (red) are in the same regions of the embedding space. This is an expected behavior, since an embedding model…
Figure 2: In (a) an initial correct TransE Link Prediction example. In (b), after a continual learning process the embeddings have shifted and the result of link prediction is incorrect. In (c), a new entity interferes with the previously learned knowledge and causes an incorrect prediction.
Excerpt from the surrounding text: … data distribution changes over time while the task remains unchanged; class-incremental learning, where the model incrementall…
Figure 3: Illustration of where different catastrophic forgetting mitigation techniques are implemented in the learning cycle of CKGE.
Excerpt from the surrounding text: Replay. This strategy consists in allocating a memory buffer to store triples from past snapshots (i.e., episodic memory), which are then combined with new triples during training [13, 22, 23]. By replaying past triples, the objective is to preserve previously learned representations …
Abstract

Knowledge Graph Embeddings (KGEs) support a wide range of downstream tasks over Knowledge Graphs (KGs). In practice, KGs evolve as new entities and facts are added, motivating Continual Knowledge Graph Embedding (CKGE) methods that update embeddings over time. Current CKGE approaches address catastrophic forgetting (i.e., the performance degradation on previously learned tasks) primarily by limiting changes to existing embeddings. However, we show that this view is incomplete. When new entities are introduced, their embeddings can interfere with previously learned ones, causing the model to predict them in place of previously correct answers. This phenomenon, which we call entity interference, has been largely overlooked and is not accounted for in current CKGE evaluation protocols. As a result, the assessment of catastrophic forgetting becomes misleading, and the performance of CKGE methods is systematically overestimated. To address this issue, we introduce a corrected CKGE evaluation protocol that accounts for entity interference. Through experiments on multiple benchmarks, we show that ignoring this effect can lead to performance overestimation of up to 25%, particularly in scenarios with significant entity growth. We further analyze how different CKGE methods and KGE models are affected by the different sources of forgetting, and introduce a catastrophic forgetting metric tailored to CKGE.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that current CKGE methods and evaluations focus narrowly on catastrophic forgetting via embedding changes to existing entities, but overlook 'entity interference' in which new-entity embeddings degrade predictions for old entities. This leads to systematic overestimation of CKGE performance (up to 25%) under standard link-prediction protocols, especially with entity growth. The authors introduce a corrected evaluation protocol that accounts for this interference, a CKGE-specific forgetting metric, and empirical results across multiple benchmarks showing how different methods and base KGE models are affected by the two sources of forgetting.

Significance. If the central claim and protocol hold, the work would meaningfully advance continual learning for knowledge graphs by identifying an additional, previously unmeasured source of performance degradation distinct from standard forgetting. The empirical scope across benchmarks and the introduction of a tailored metric are positive; reproducible code or parameter-free derivations are not mentioned.

major comments (2)
  1. [Abstract / Evaluation Protocol] Abstract and Evaluation section: the claim that the corrected protocol isolates entity interference (distinct from standard catastrophic forgetting) is load-bearing for the 25% overestimation result, yet the manuscript provides no explicit description of how the protocol modifies rankings or candidate sets to avoid conflating interference with metric artifacts from larger entity pools and shifted answer distributions in growing KGs.
  2. [Experiments] Experiments section: no ablation is described that demonstrates the performance gap vanishes under fixed-entity controls (no new entities) but reappears only when new-entity embeddings are present; without this, it remains possible that observed differences arise from test-distribution shifts rather than embedding-space interference.
minor comments (2)
  1. [Abstract] The abstract states experiments on 'multiple benchmarks' but gives no dataset names, sizes, or growth rates, making it difficult to assess the scope of the entity-growth scenarios.
  2. [Metric Definition] Notation for the new forgetting metric is introduced without an explicit equation or comparison to existing CKGE forgetting measures (e.g., average accuracy drop).
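
For orientation on the second minor comment, the following is a generic average-forgetting measure of the kind commonly used in the continual learning literature. It is an assumption made for illustration only and is not the paper's CKGE-specific metric, whose equation is not given on this page. The symbols are hypothetical: a_{i,j} denotes performance (for example MRR) on the test set of snapshot j after training through snapshot i.

```latex
% Generic continual-learning forgetting after snapshot k (illustrative assumption,
% NOT the paper's CKGE-specific metric).
% a_{i,j}: performance on the test set of snapshot j after training through snapshot i.
F_k \;=\; \frac{1}{k-1} \sum_{j=1}^{k-1}
      \Bigl( \max_{j \le l \le k-1} a_{l,j} \;-\; a_{k,j} \Bigr),
\qquad k \ge 2
```

A measure of this form only registers interference from new entities if each a_{k,j} is computed with the new entities allowed as candidate answers.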

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify key aspects of our evaluation protocol and experimental validation. We address each major comment point by point below and will incorporate the suggested clarifications and additions in the revised manuscript.

Point-by-point responses
  1. Referee: [Abstract / Evaluation Protocol] Abstract and Evaluation section: the claim that the corrected protocol isolates entity interference (distinct from standard catastrophic forgetting) is load-bearing for the 25% overestimation result, yet the manuscript provides no explicit description of how the protocol modifies rankings or candidate sets to avoid conflating interference with metric artifacts from larger entity pools and shifted answer distributions in growing KGs.

    Authors: We agree that the current description of the corrected protocol in the Evaluation section could be more explicit to fully substantiate the isolation of entity interference. In the revised manuscript, we will add a dedicated subsection that precisely details how the protocol modifies candidate sets and rankings—specifically by restricting or reweighting predictions to exclude interference from newly introduced entities while preserving the original entity pool for old-entity evaluation. This will distinguish the interference effect from artifacts due to larger entity pools or shifted answer distributions, thereby strengthening the attribution of the up to 25% overestimation to entity interference rather than evaluation mechanics. revision: yes

  2. Referee: [Experiments] Experiments section: no ablation is described that demonstrates the performance gap vanishes under fixed-entity controls (no new entities) but reappears only when new-entity embeddings are present; without this, it remains possible that observed differences arise from test-distribution shifts rather than embedding-space interference.

    Authors: We acknowledge the value of this ablation for ruling out alternative explanations. While our experiments already compare CKGE performance across benchmarks with varying entity growth rates, we did not include an explicit fixed-entity control condition. In the revision, we will add a new ablation experiment that holds the entity set fixed (no new entities introduced) and demonstrates that performance gaps are negligible in this case but re-emerge when new-entity embeddings are allowed to interfere. This will provide direct evidence that the observed differences arise from embedding-space interference rather than test-distribution shifts alone. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical protocol and metric are independently motivated

full rationale

The paper is an empirical study that identifies entity interference via benchmark experiments, proposes a corrected evaluation protocol to account for it, and reports up to 25% overestimation when ignored. No mathematical derivation chain, fitted parameters renamed as predictions, or self-citation load-bearing steps exist in the provided text. The central claims rest on observable performance gaps across multiple KGs and methods rather than reducing to definitions or prior self-citations by construction. The evaluation protocol is presented as a practical correction derived from the observed phenomenon, not as a tautological re-expression of inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The central claim rests on empirical demonstration of entity interference in CKGE benchmarks; no mathematical derivations, free parameters, or background axioms are invoked in the abstract.

invented entities (1)
  • entity interference (no independent evidence)
    purpose: Describes the disruption where new entity embeddings cause incorrect predictions for previously learned facts in CKGE
    Newly introduced concept to explain an overlooked source of performance issues separate from standard catastrophic forgetting.



Reference graph

Works this paper leans on

35 extracted references · 35 canonical work pages

  [1] Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J., 2008. Freebase: a collaboratively created graph database for structuring human knowledge, in: Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pp. 1247–1250.

  [2] Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O., 2013. Translating embeddings for modeling multi-relational data. Advances in neural information processing systems 26.

  [3] Cui, Y., Wang, Y., Sun, Z., Liu, W., Jiang, Y., Han, K., Hu, W., 2023. Lifelong embedding learning and transfer for growing knowledge graphs, in: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 4217–4224.

  [4] Ebisu, T., Ichise, R., 2018. Toruse: Knowledge graph embedding on a lie group, in: Proceedings of the AAAI conference on artificial intelligence.

  [5] Grossberg, S., 2013. Adaptive resonance theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural networks 37, 1–47.

  [6] Kazemi, S.M., Poole, D., 2018. Simple embedding for link prediction in knowledge graphs. Advances in neural information processing systems 31.

  [7] Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A.A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., et al., 2017. Overcoming catastrophic forgetting in neural networks. Proceedings of the national academy of sciences 114, 3521–3526.

  [8] Lai, N., Jin, K., Long, Y., Yu, W., Huang, J., 2025. A hybrid learning approach for continual knowledge graph embedding: Contrastive masking and joint anti-forgetting, in: International Conference on Artificial Neural Networks, Springer. pp. 77–88.

  [9] Li, Y., Zhang, L., Yan, H., Zhao, T., Ma, Z., Huang, M., Liu, J., 2025. Sage: Scale-aware gradual evolution for continual knowledge graph embedding, in: Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 2, pp. 1600–1611.

  [10] Lin, Y., Liu, Z., Sun, M., Liu, Y., Zhu, X., 2015. Learning entity and relation embeddings for knowledge graph completion, in: Proceedings of the AAAI conference on artificial intelligence.

  [11] Liu, J., Ke, W., Wang, P., Shang, Z., Gao, J., Li, G., Ji, K., Liu, Y., 2024a. Towards continual knowledge graph embedding via incremental distillation, in: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 8759–8768.

  [12] Liu, J., Ke, W., Wang, P., Wang, J., Gao, J., Shang, Z., Li, G., Xu, Z., Ji, K., Li, Y., 2024b. Fast and continual knowledge graph embedding via incremental lora, in: IJCAI.

  [13] Lopez-Paz, D., Ranzato, M., 2017. Gradient episodic memory for continual learning. Advances in neural information processing systems 30.

  [14] Polleres, A., Pernisch, R., Bonifati, A., Dell’Aglio, D., Dobriy, D., Dumbrava, S., Etcheverry, L., Ferranti, N., Hose, K., Jiménez-Ruiz, E., et al., 2023. How does knowledge evolve in open knowledge graphs? Transactions on Graph Data and Knowledge 1, 11–1

  [15] Shi, X., Mu, C., Tian, L., Yan, B., Xiao, W., Wang, J., 2025. A continual knowledge graph embedding method based on local-global distillation, in: 2025 8th International Symposium on Big Data and Applied Statistics (ISBDAS), IEEE. pp. 798–802.

  [16] Song, A., Chen, Y., Wang, Y., Zhong, S., Xu, M., 2024. Orchestrating plasticity and stability: A continual knowledge graph embedding framework with bio-inspired dual-mask mechanism, in: The 16th Asian Conference on Machine Learning (Conference Track).

  [17] Sun, Z., Deng, Z.H., Nie, J.Y., Tang, J., 2019. Rotate: Knowledge graph embedding by relational rotation in complex space, in: International Conference on Learning Representations.

  [18] Toutanova, K., Chen, D., 2015. Observed versus latent features for knowledge base and text inference, in: Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pp. 57–66.

  [19] Trouillon, T., Dance, C.R., Gaussier, É., Welbl, J., Riedel, S., Bouchard, G., 2017. Knowledge graph completion via complex tensor factorization. Journal of Machine Learning Research 18, 1–38.

  [20] van de Ven, G.M., Tuytelaars, T., Tolias, A.S., 2022. Three types of incremental learning. Nature Machine Intelligence 4, 1185–1197. URL: https://doi.org/10.1038/s42256-022-00568-3, doi:10.1038/s42256-022-00568-3

  [21] Vrandečić, D., Krötzsch, M., 2014. Wikidata: a free collaborative knowledgebase. Communications of the ACM 57, 78–85.

  [22] Wang, H., Xiong, W., Yu, M., Guo, X., Chang, S., Wang, W.Y., 2019. Sentence embedding alignment for lifelong relation extraction, in: Burstein, J., Doran, C., Solorio, T. (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), ...

  [23] Wang, L., Liu, Y., Lu, J., Guo, X., Li, Z., Huang, N., 2026. Taming non-stationary knowledge growth: Dynamic global memory framework for lifelong knowledge graph embedding, in: Proceedings of the Nineteenth ACM International Conference on Web Search and Data Mining, pp. 714–723.

  [24] Wang, L., Zhang, X., Su, H., Zhu, J., 2024. A comprehensive survey of continual learning: Theory, method and application. IEEE transactions on pattern analysis and machine intelligence 46, 5362–5383.

  [25] Wang, Q., Mao, Z., Wang, B., Guo, L., 2017. Knowledge graph embedding: A survey of approaches and applications. IEEE transactions on knowledge and data engineering 29, 2724–2743.

  [26] Wang, X., Liu, J., Xie, K., Wang, M., Bi, C., Deng, J., Ji, D., Pan, J.Z., 2025. Stckge: Continual knowledge graph embedding based on spatial transformation. Knowledge-Based Systems 329, 114337. doi:https://doi.org/10.1016/j.knosys.2025.114337

  [27] Wang, Z., Zhang, J., Feng, J., Chen, Z., 2014. Knowledge graph embedding by translating on hyperplanes, in: Proceedings of the AAAI conference on artificial intelligence.

  [28] Yang, B., Yih, W., He, X., Gao, J., Deng, L., 2015. Embedding entities and relations for learning and inference in knowledge bases, in: Bengio, Y., LeCun, Y. (Eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings.

  [29] Yang, J., Jiang, X., Jiang, X., Gao, Y., Yang, L.T., Zou, S., Yang, S., 2025. From knowledge forgetting to accumulation: Evolutionary relation path passing for lifelong knowledge graph embedding, in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1197–1206.

  [30] Zenke, F., Poole, B., Ganguli, S., 2017. Continual learning through synaptic intelligence, in: International conference on machine learning, PMLR. pp. 3987–3995.

  [31] Zhang, S., Tay, Y., Yao, L., Liu, Q., 2019. Quaternion knowledge graph embeddings. Advances in neural information processing systems 32.

  [32] Zhao, T., Chen, J., Ru, Y., Lin, Q., Geng, Y., Zhu, H., Pan, Y., Liu, J., 2025. Rethinking continual knowledge graph embedding: Benchmarks and analysis, in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 138–147.

  [33] Zhu, J., Fu, B., Duan, G., 2025a. Debiasedkge: Towards mitigating spurious forgetting in continual knowledge graph embedding, in: Proceedings of the 34th ACM International Conference on Information and Knowledge Management, pp. 4519–4528.

  [34] Zhu, L., Jeon, D.H., Sun, W., Yang, L., Xie, Y., Niu, S., 2024. Flexible memory rotation (fmr): Rotated representation with dynamic regularization to overcome catastrophic forgetting in continual knowledge graph learning, in: 2024 IEEE International Conference on Big Data (BigData), IEEE. pp. 6180–6189.

  [35] Zhu, L., Lan, Q., Tian, Q., Sun, W., Yang, L., Xia, L., Xie, Y., Xiao, X., Duan, T., Tao, C., et al., 2025b. Ett-ckge: Efficient task-driven tokens for continual knowledge graph embedding, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer. pp. 481–496.

G. Pons et al.: Preprint submitted to Elsevier.