Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining

Carl Yang; Hang Lv; Pengxiang Zhan; Shiping Wang; Yanchao Tan

arxiv: 2403.04780 · v3 · pith:J7YRM4TMnew · submitted 2024-03-02 · 💻 cs.CL · cs.AI

Yanchao Tan, Hang Lv, Pengxiang Zhan, Shiping Wang, Carl Yang

keywords: graph · tasks · datasets · llms · instruction · language · across · mining

Graphs with abundant attributes are essential in modeling interconnected entities and enhancing predictions across various real-world applications. Traditional Graph Neural Networks (GNNs) often require re-training for different graph tasks and datasets. Although the emergence of Large Language Models (LLMs) has introduced new paradigms in natural language processing, their potential for generic graph mining, that is, training a single model to simultaneously handle diverse tasks and datasets, remains under-explored. To this end, our novel framework, MuseGraph, seamlessly integrates the strengths of GNNs and LLMs into one foundation model for graph mining across tasks and datasets. This framework first features a compact graph description to encapsulate key graph information within language token limitations. Then, we propose a diverse instruction generation mechanism with Chain-of-Thought (CoT)-based instruction packages to distill the reasoning capabilities from advanced LLMs like GPT-4. Finally, we design a graph-aware instruction tuning strategy to facilitate mutual enhancement across multiple tasks and datasets while preventing catastrophic forgetting of LLMs' generative abilities. Our experimental results demonstrate significant improvements across five graph tasks and ten datasets, showcasing the potential of MuseGraph in enhancing the accuracy of graph-oriented downstream tasks while improving the generation abilities of LLMs.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Revisiting Graph-Tokenizing Large Language Models: A Systematic Evaluation of Graph Token Understanding
cs.CL · 2026-05 · unverdicted · novelty 6.0

GTokenLLMs do not fully understand graph tokens, exhibiting over-sensitivity or insensitivity to instruction changes and relying heavily on text for reasoning even when graph information is preserved.