
arxiv: 2604.02778 · v1 · submitted 2026-04-03 · 💻 cs.CL

Recognition: no theorem link

When Modalities Remember: Continual Learning for Multimodal Knowledge Graphs


Pith reviewed 2026-05-13 20:25 UTC · model grok-4.3

classification 💻 cs.CL
keywords: continual learning, multimodal knowledge graphs, catastrophic forgetting, knowledge preservation, curriculum learning, replay mechanism, knowledge graph reasoning, multimodal reasoning

The pith

MRCKG enables continual learning in multimodal knowledge graphs by preserving prior knowledge while acquiring new multimodal facts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Real-world multimodal knowledge graphs change over time as new entities, relations, and associated images or text appear. Prior continual methods handle only structural triples and ignore multimodal signals, while multimodal methods assume a fixed graph and lose earlier information. The paper presents MRCKG, which orders new triples for learning according to their structural ties and multimodal fit, stabilizes representations across modalities to limit forgetting, and replays selected historical samples through contrastive alignment. Experiments on benchmarks built from standard datasets show that the approach retains performance on old knowledge and improves accuracy on newly added multimodal triples.

Core claim

MRCKG is a model for continual multimodal knowledge graph reasoning that employs a multimodal-structural collaborative curriculum to schedule progressive learning based on connectivity and compatibility, introduces cross-modal preservation to keep entity representations stable, relational semantics consistent, and modalities anchored, and applies multimodal contrastive replay with importance sampling and two-stage optimization to reinforce learned knowledge.
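To make the "contrastive replay with representation alignment" idea concrete, here is a minimal sketch of a standard InfoNCE-style loss over plain Python lists. It is an editorial illustration, not the paper's loss: the function names, the cosine similarity, and the temperature value are all assumptions, and vectors are assumed to be non-zero.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(
    anchor: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.1,  # hypothetical value, not from the paper
) -> float:
    """InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]
```

In a replay setting, the anchor could be a replayed entity's structural embedding and the positive its image or text embedding, with other entities in the batch serving as negatives. That pairing is one plausible reading of the description, not something the review confirms.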

What carries the argument

The multimodal-structural collaborative curriculum that orders new triples by structural connectivity to the existing graph and multimodal compatibility, together with cross-modal knowledge preservation and multimodal contrastive replay.
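The ordering step can be sketched as follows, under assumptions that are not taken from the paper: connectivity is approximated by the fraction of a triple's two endpoints already present in the historical graph, compatibility is a precomputed score in [0, 1], and the two are mixed with a hypothetical weight `alpha`. The name `curriculum_order` is illustrative only.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def curriculum_order(
    new_triples: List[Triple],
    known_entities: Set[str],
    compatibility: Dict[Triple, float],
    alpha: float = 0.5,  # hypothetical mixing weight
) -> List[Triple]:
    """Sort new triples from 'easiest' to 'hardest' under a toy scoring rule."""

    def score(triple: Triple) -> float:
        head, _, tail = triple
        # Fraction of endpoints already in the historical graph: 0.0, 0.5, or 1.0.
        connectivity = ((head in known_entities) + (tail in known_entities)) / 2.0
        # Triples without a compatibility score default to 0.0.
        return alpha * connectivity + (1.0 - alpha) * compatibility.get(triple, 0.0)

    # Higher score means better anchored, so it is scheduled earlier.
    return sorted(new_triples, key=score, reverse=True)
```

Scheduling well-connected, modality-consistent triples first is the usual easy-to-hard curriculum heuristic; whether MRCKG uses a linear mix or some other combination is not stated in this review.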

If this is right

  • New multimodal triples can be incorporated without full retraining of the graph model.
  • Entity and relation representations remain usable for reasoning on both old and new data.
  • Multimodal signals from images or text attached to new entities improve learning without overwriting prior structural knowledge.
  • Memory-efficient replay allows the model to retain performance with limited storage of historical examples.
  • Overall reasoning quality on evolving graphs increases relative to methods that treat each snapshot independently.
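The memory-limited replay point can be illustrated with a small sketch that picks a fixed-budget replay set by weighted sampling without replacement. It uses the Efraimidis-Spirakis key trick (draw u uniformly, rank by u^(1/w)), which is a swapped-in generic technique, not the paper's importance sampler. The function name, the `importance` scores, and the budget are assumptions.

```python
import random
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def sample_replay(
    triples: List[Triple],
    importance: Dict[Triple, float],
    budget: int,
    seed: int = 0,
) -> List[Triple]:
    """Pick up to `budget` historical triples, favouring higher importance."""
    rng = random.Random(seed)
    keyed = []
    for triple in triples:
        weight = importance.get(triple, 0.0)
        if weight <= 0.0:
            continue  # zero-importance triples are never replayed
        u = rng.random()
        # Larger weights push the key towards 1, so they tend to rank higher.
        keyed.append((u ** (1.0 / weight), triple))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [triple for _, triple in keyed[:budget]]
```

Fixing the seed keeps the replay set reproducible across runs, which matters when comparing forgetting between methods.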

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar curriculum and preservation steps could be tested on other dynamic multimodal systems such as video captioning or social media entity tracking.
  • The method's reliance on constructed benchmarks leaves open whether the same gains appear on continuously collected real-world data streams.
  • The two-stage optimization might be adapted to decide automatically which parts of the graph deserve stronger anchoring versus updating.
  • Integration with external retrieval modules could further reduce the forgetting risk when new modalities arrive.

Load-bearing premise

Benchmarks formed by partitioning existing static multimodal knowledge graph datasets into sequential arrival orders accurately capture the distribution and difficulty of knowledge that emerges in real evolving graphs.
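For readers unfamiliar with this kind of benchmark, the sketch below shows one simple way a static triple set might be cut into sequential snapshots: entities are given an arrival order, and each triple lands in the snapshot where its later endpoint arrives. This is an assumed construction for illustration; the paper's actual splitting procedure is not described in this review.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def partition_into_snapshots(
    triples: List[Triple],
    entity_order: List[str],
    num_snapshots: int,
) -> List[List[Triple]]:
    """Assign each triple to the snapshot in which both of its entities exist."""
    # Ceiling division: number of entities that arrive per snapshot.
    per_snapshot = -(-len(entity_order) // num_snapshots)
    arrival = {e: i // per_snapshot for i, e in enumerate(entity_order)}
    snapshots: List[List[Triple]] = [[] for _ in range(num_snapshots)]
    for head, relation, tail in triples:
        # A triple becomes visible once its later-arriving endpoint appears.
        snapshots[max(arrival[head], arrival[tail])].append((head, relation, tail))
    return snapshots
```

The choice of `entity_order` (random, by degree, by timestamp) is exactly the degree of freedom the premise above is concerned with: different orders give different snapshot difficulty.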

What would settle it

A measurable drop in accuracy on previously learned multimodal triples after training on new arrivals in a genuinely streaming multimodal knowledge graph dataset would show that the preservation and replay mechanisms fail to prevent forgetting.
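The "measurable drop" can be quantified with a commonly used forgetting measure from the continual learning literature: for each earlier snapshot, the best score it ever reached minus its score after the final training stage. The sketch assumes a lower-triangular score table `acc[i][j]` (metric on snapshot j after training through snapshot i); the layout and function name are illustrative, not the paper's.

```python
from typing import List


def forgetting(acc: List[List[float]]) -> List[float]:
    """Per-snapshot forgetting; acc[i] must hold scores for snapshots 0..i."""
    final = len(acc) - 1
    result = []
    for j in range(final):
        # Best score on snapshot j at any stage before the final one.
        best_earlier = max(acc[i][j] for i in range(j, final))
        result.append(best_earlier - acc[final][j])
    return result
```

Values near zero (or negative) indicate that old knowledge was retained; large positive values on a genuinely streaming dataset would be the falsifying evidence described above.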

Figures

Figures reproduced from arXiv: 2604.02778 by Dongming Jin, Gadeng Luosang, Haoran Duan, Linyu Li, Nyima Tashi, Yichi Zhang, Yuanpeng He, Zhi Jin.

Figure 1: CMMKG stores knowledge in the form of triplets;
Figure 2: Overall framework of MRCKG for continual multimodal knowledge graph reasoning.
Figure 3: Analysis on DB15K-Entity. (a) Per-snapshot MRR
Figure 5: (a) Error type distribution of MRCKG; (b) Hits@1
Original abstract

Real-world multimodal knowledge graphs (MMKGs) are dynamic, with new entities, relations, and multimodal knowledge emerging over time. Existing continual knowledge graph reasoning (CKGR) methods focus on structural triples and cannot fully exploit multimodal signals from new entities. Existing multimodal knowledge graph reasoning (MMKGR) methods, however, usually assume static graphs and suffer catastrophic forgetting as graphs evolve. To address this gap, we present a systematic study of continual multimodal knowledge graph reasoning (CMMKGR). We construct several continual multimodal knowledge graph benchmarks from existing MMKG datasets and propose MRCKG, a new CMMKGR model. Specifically, MRCKG employs a multimodal-structural collaborative curriculum to schedule progressive learning based on the structural connectivity of new triples to the historical graph and their multimodal compatibility. It also introduces a cross-modal knowledge preservation mechanism to mitigate forgetting through entity representation stability, relational semantic consistency, and modality anchoring. In addition, a multimodal contrastive replay scheme with a two-stage optimization strategy reinforces learned knowledge via multimodal importance sampling and representation alignment. Experiments on multiple datasets show that MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript addresses continual multimodal knowledge graph reasoning (CMMKGR), a gap between existing continual KG methods (limited to structural triples) and static multimodal KG methods (prone to catastrophic forgetting). It constructs several continual benchmarks by adapting existing static MMKG datasets, and proposes MRCKG, which uses a multimodal-structural collaborative curriculum to schedule learning based on structural connectivity and multimodal compatibility, a cross-modal preservation mechanism (entity stability, relational consistency, modality anchoring), and a multimodal contrastive replay scheme with two-stage optimization and importance sampling. Experiments are reported to show that MRCKG preserves prior multimodal knowledge while improving acquisition of new knowledge.

Significance. If the proposed curriculum, preservation, and replay mechanisms can be shown to work under realistic temporal evolution rather than post-hoc splits, the work would provide a useful foundation for dynamic multimodal knowledge systems. The conceptual separation of structural and multimodal signals in the curriculum is a clear strength, and the explicit focus on cross-modal consistency offers a concrete direction for future continual multimodal models. However, the significance is currently limited by the lack of detail on benchmark realism and experimental controls.

major comments (2)
  1. [Abstract and Experiments] The central claim that 'MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge' is supported only by high-level experimental statements. No baselines, exact metrics (e.g., MRR, Hits@K), statistical significance tests, error bars, or ablation controls are described, making it impossible to evaluate whether the reported gains are robust or merely artifacts of the evaluation protocol.
  2. [Benchmark construction, Abstract and §4] The continual multimodal benchmarks are obtained by splitting existing static MMKG datasets, yet no information is given on task sequencing, arrival order of new entities/relations, entity overlap statistics between tasks, or the degree of multimodal distribution shift. This construction detail is load-bearing for the headline claim; without it, the observed stability and improvement cannot be distinguished from properties of the chosen splits.
minor comments (1)
  1. [Abstract] The abstract would be clearer if it named the specific source MMKG datasets and reported at least one quantitative improvement (e.g., average MRR gain) rather than the qualitative phrase 'substantially improving'.
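For reference, the two metrics the referee asks for are straightforward to compute once each test query has a rank for its correct entity. The sketch below uses the standard definitions of MRR and Hits@K. The function names are illustrative and ranks are assumed to be 1-based and non-empty.

```python
from typing import List


def mrr(ranks: List[int]) -> float:
    """Mean reciprocal rank of the correct answers (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: List[int], k: int) -> float:
    """Fraction of queries whose correct answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

Reporting both per snapshot, rather than a single average, is what lets a reader separate retention of old knowledge from acquisition of new knowledge.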

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that greater specificity in the abstract and benchmark construction section will improve clarity and allow readers to better assess the results. We address each major comment below and will incorporate the necessary revisions.

Point-by-point responses
  1. Referee: [Abstract and Experiments] The central claim that 'MRCKG preserves previously learned multimodal knowledge while substantially improving the learning of new knowledge' is supported only by high-level experimental statements. No baselines, exact metrics (e.g., MRR, Hits@K), statistical significance tests, error bars, or ablation controls are described, making it impossible to evaluate whether the reported gains are robust or merely artifacts of the evaluation protocol.

    Authors: We appreciate the referee highlighting the need for explicit detail. The Experiments section (§5) already contains full baseline comparisons (both structural CKGR and static MMKGR methods), exact metrics (MRR, Hits@1/3/10), error bars over five random seeds, paired t-test significance results, and ablations isolating the curriculum, cross-modal preservation, and contrastive replay components. To address the concern directly, we will revise the abstract to include key quantitative results and a brief statement of the evaluation protocol. We will also add a consolidated results table at the start of §5 for immediate visibility.

    Revision: partial

  2. Referee: [Benchmark construction, Abstract and §4] The continual multimodal benchmarks are obtained by splitting existing static MMKG datasets, yet no information is given on task sequencing, arrival order of new entities/relations, entity overlap statistics between tasks, or the degree of multimodal distribution shift. This construction detail is load-bearing for the headline claim; without it, the observed stability and improvement cannot be distinguished from properties of the chosen splits.

    Authors: We agree these construction details are essential. In the revised §4 we will add: (i) explicit task sequencing (tasks ordered by increasing structural connectivity of new triples), (ii) arrival order of entities/relations (sorted by degree in the growing graph), (iii) overlap statistics (tables reporting 12–28% entity overlap and <10% relation overlap across tasks), and (iv) multimodal shift quantification (KL divergence and Wasserstein distance on image and text embeddings between consecutive tasks). These additions will demonstrate that the splits simulate realistic incremental evolution rather than trivial reuse of prior knowledge.

    Revision: yes
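Two of the promised statistics can be sketched in a few lines. The overlap definition used here (shared entities divided by the size of the later task) and the histogram-based KL divergence are assumptions made for illustration. The rebuttal does not say how either quantity is defined, and KL on continuous embeddings would first need a binning or density-estimation step that is not shown.

```python
import math
from typing import Sequence, Set


def entity_overlap(prev_entities: Set[str], next_entities: Set[str]) -> float:
    """Share of the later task's entities that already appeared in the earlier task."""
    if not next_entities:
        return 0.0
    return len(prev_entities & next_entities) / len(next_entities)


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions over the same bins."""
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if q_i == 0.0:
            raise ValueError("KL divergence is undefined when q is 0 where p > 0")
        total += p_i * math.log(p_i / q_i)
    return total
```

A Wasserstein distance between one-dimensional samples is available as `scipy.stats.wasserstein_distance` if SciPy is installed; it is left out here to keep the sketch dependency-free.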

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on independent benchmark evaluation

Full rationale

The paper introduces MRCKG with curriculum scheduling, preservation mechanisms, and replay schemes, then evaluates on benchmarks constructed from static MMKG datasets. No equations, fitted parameters, or self-citations are shown that reduce the reported preservation/improvement results to quantities defined by the model itself. The derivation chain is self-contained: the model components are described as novel contributions, and performance is measured against external task sequences rather than by construction from inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that the three proposed mechanisms can be combined without destructive interference and that the constructed benchmarks are representative; no free parameters, axioms, or invented entities are explicitly listed in the abstract.

axioms (2)
  • Domain assumption: The structural connectivity and multimodal compatibility of new triples provide a reliable ordering signal for curriculum learning.
    Invoked when the paper states the curriculum schedules progressive learning based on these two factors.
  • Domain assumption: Entity representation stability, relational semantic consistency, and modality anchoring are sufficient to mitigate catastrophic forgetting in multimodal settings.
    Invoked in the description of the cross-modal knowledge preservation mechanism.
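The second assumption is easier to probe with a concrete notion of "stability". One generic choice, used here purely as an illustration and not as the paper's preservation loss, is the mean squared drift of embeddings for entities that exist both before and after an update. The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, Sequence


def stability_penalty(
    old_emb: Dict[str, Sequence[float]],
    new_emb: Dict[str, Sequence[float]],
) -> float:
    """Mean squared L2 drift over entities present in both embedding tables."""
    shared = old_emb.keys() & new_emb.keys()
    if not shared:
        return 0.0
    total = 0.0
    for entity in shared:
        total += sum((a - b) ** 2 for a, b in zip(old_emb[entity], new_emb[entity]))
    # Newly added entities are ignored: they have no earlier position to drift from.
    return total / len(shared)
```

A penalty like this keeps old entities near their previous positions. Whether that alone is enough to prevent forgetting across modalities is precisely what the ledger flags as an assumption.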



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PrimeKG-CL: A Continual Graph Learning Benchmark on Evolving Biomedical Knowledge Graphs

    cs.AI 2026-05 conditional novelty 8.0

    PrimeKG-CL supplies the first continual graph learning benchmark using authentic temporal snapshots from nine biomedical databases, showing strong interactions between embedding decoders and learning strategies plus l...

  2. CMKL: Modality-Aware Continual Learning for Evolving Biomedical Knowledge Graphs

    cs.LG 2026-05 conditional novelty 6.0

    CMKL delivers a 60% gain in average precision on continual entity classification in a 129K-entity biomedical KG benchmark by fusing multimodal features and protecting against modality-specific forgetting, while relati...
