CKG-LLM: LLM-Assisted Detection of Smart Contract Access Control Vulnerabilities Based on Knowledge Graphs
Pith reviewed 2026-05-17 00:46 UTC · model grok-4.3
The pith
CKG-LLM builds knowledge graphs from smart contract code and uses large language models to turn natural-language vulnerability patterns into executable graph queries that locate access-control flaws.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CKG-LLM leverages the reasoning and code-generation capabilities of large language models to translate natural-language vulnerability patterns into executable queries over contract knowledge graphs, thereby automatically locating vulnerable code elements and achieving superior performance in detecting access-control vulnerabilities compared to existing tools.
What carries the argument
A contract knowledge graph serves as the intermediate representation: it encodes code entities and their relations, can be searched with a graph query language to find rule-violating elements, and is queried with statements that an LLM generates from natural-language patterns.
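To make the idea of a contract knowledge graph concrete, here is a minimal sketch in Python that stores a graph as subject-predicate-object triples. It is not the paper's schema, which is not given here. The entity and relation names (Function, Modifier, StateVariable, appliesModifier, accessesState, checksSender) are borrowed from the examples in the simulated rebuttal further down this page, and the toy contract description, `build_graph`, and `match` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical, hand-written description of a toy contract.
# A real pipeline would extract this from source code instead.
TOY_CONTRACT = {
    "withdraw": {"modifiers": ["onlyOwner"], "writes": ["balance"], "checks_sender": False},
    "setOwner": {"modifiers": [], "writes": ["owner"], "checks_sender": False},
    "deposit": {"modifiers": [], "writes": ["balance"], "checks_sender": False},
}


def build_graph(contract: dict) -> set:
    """Turn the toy description into a set of triples (the instance data)."""
    triples = set()
    for fn, info in contract.items():
        triples.add(Triple(fn, "type", "Function"))
        for modifier in info["modifiers"]:
            triples.add(Triple(modifier, "type", "Modifier"))
            triples.add(Triple(fn, "appliesModifier", modifier))
        for variable in info["writes"]:
            triples.add(Triple(variable, "type", "StateVariable"))
            triples.add(Triple(fn, "accessesState", variable))
        if info["checks_sender"]:
            triples.add(Triple(fn, "checksSender", "msg.sender"))
    return triples


def match(graph: set, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None) -> list:
    """Return all triples matching the given pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


graph = build_graph(TOY_CONTRACT)
state_writes = match(graph, predicate="accessesState")
```

Pattern matching with wildcards is the basic operation that graph query languages build on, so even this small store is enough to ask questions such as "which functions touch `owner`?" via `match(graph, predicate="accessesState", obj="owner")`.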
If this is right
- Automatic identification of vulnerable code elements becomes possible through graph queries derived from natural-language patterns.
- Detection performance exceeds that of existing tools relying on abstract syntax trees, control-flow graphs, or static single assignment form.
- The framework supports richer intermediate abstractions of contract logic than purely syntactic representations.
- Future extensions can build on the same knowledge-graph-plus-LLM pipeline for additional analysis tasks.
Where Pith is reading between the lines
- The same translation step from natural language to graph queries could apply to vulnerability classes other than access control if suitable patterns are supplied.
- Hybrid systems might combine CKG-LLM queries with conventional static analyzers to increase overall coverage.
- The method suggests that semantic graph representations can serve as a general bridge between human-readable security rules and automated code search.
Load-bearing premise
Knowledge graphs extracted from contract source code encode semantic structures and control logic well enough that LLM-generated queries can reliably flag access-control violations without excessive false positives or missed cases.
What would settle it
A controlled test on a dataset of smart contracts containing documented access-control vulnerabilities where CKG-LLM produces detection rates or false-positive rates no better than the strongest current static-analysis tools.
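Such a test comes down to comparing each tool's reported findings against labeled ground truth. The sketch below shows one way to compute precision, recall, and F1 from two sets of findings. The function name `detection_metrics` and the contract and function labels are made up for illustration and are not results from the paper. A false-positive rate would additionally need the number of non-vulnerable elements examined, which this sketch does not model.

```python
def detection_metrics(reported, ground_truth):
    """Compare reported findings with labeled vulnerabilities.

    Both arguments are iterables of hashable finding identifiers,
    for example (contract, function) pairs.
    """
    reported, ground_truth = set(reported), set(ground_truth)
    tp = len(reported & ground_truth)   # correctly flagged
    fp = len(reported - ground_truth)   # flagged but not vulnerable
    fn = len(ground_truth - reported)   # vulnerable but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


# Made-up labels, purely to exercise the function.
truth = {("A.sol", "setOwner"), ("B.sol", "kill"),
         ("C.sol", "mint"), ("D.sol", "withdraw")}
tool_output = {("A.sol", "setOwner"), ("B.sol", "kill"), ("E.sol", "pause")}

metrics = detection_metrics(tool_output, truth)
```

Running the same function over the output of each baseline on the same labeled dataset gives directly comparable numbers, which is the kind of evidence the referee report below asks for.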
Original abstract
Traditional approaches for smart contract analysis often rely on intermediate representations such as abstract syntax trees, control-flow graphs, or static single assignment form. However, these methods face limitations in capturing both semantic structures and control logic. Knowledge graphs, by contrast, offer a structured representation of entities and relations, enabling richer intermediate abstractions of contract code and supporting the use of graph query languages to identify rule-violating elements. This paper presents CKG-LLM, a framework for detecting access-control vulnerabilities in smart contracts. Leveraging the reasoning and code generation capabilities of large language models, CKG-LLM translates natural-language vulnerability patterns into executable queries over contract knowledge graphs to automatically locate vulnerable code elements. Experimental evaluation demonstrates that CKG-LLM achieves superior performance in detecting access-control vulnerabilities compared to existing tools. Finally, we discuss potential extensions of CKG-LLM as part of future research directions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents CKG-LLM, a framework for detecting access-control vulnerabilities in smart contracts. It builds knowledge graphs from contract code to capture entities and relations beyond what ASTs, CFGs or SSA forms provide, then uses LLMs to translate natural-language vulnerability patterns into executable graph queries that locate rule-violating elements. The central claim is that this pipeline yields superior detection performance relative to existing tools.
Significance. If the knowledge-graph representation and LLM-generated queries prove reliable, the work could offer a more semantic and flexible alternative to rigid static-analysis pipelines for smart-contract security, with potential impact on automated auditing of access-control logic in DeFi and other blockchain applications.
major comments (3)
- [Experimental Evaluation] Experimental Evaluation section: the claim that CKG-LLM 'achieves superior performance' is presented without any reported metrics (precision, recall, F1), dataset size, number of contracts or vulnerabilities tested, baseline tools, or statistical tests. This absence makes the central empirical claim unverifiable.
- [Knowledge Graph Construction] Knowledge Graph Construction (or equivalent methodology section): no schema, entity/relation definitions, or extraction rules are supplied for modeling access-control constructs such as modifier applications, msg.sender checks, or state-variable accesses. Without these, it is impossible to judge whether the KG actually overcomes the stated limitations of AST/CFG/SSA.
- [LLM Query Generation] LLM Query Generation subsection: the paper provides neither a worked example of translating a canonical natural-language pattern (e.g., 'unprotected critical function') into a graph query nor any discussion of query correctness, hallucination mitigation, or false-positive handling. This is load-bearing for the claimed automation advantage.
minor comments (2)
- [Abstract] Abstract: the summary of results is entirely qualitative; inclusion of at least one key quantitative indicator would improve informativeness.
- [Introduction / System Overview] Overall manuscript: a high-level architecture diagram illustrating the KG extraction, LLM query translation, and execution pipeline would aid reader comprehension.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript accordingly to improve clarity, completeness, and verifiability of the claims.
Referee: [Experimental Evaluation] Experimental Evaluation section: the claim that CKG-LLM 'achieves superior performance' is presented without any reported metrics (precision, recall, F1), dataset size, number of contracts or vulnerabilities tested, baseline tools, or statistical tests. This absence makes the central empirical claim unverifiable.
Authors: We agree that the Experimental Evaluation section as currently written does not provide the quantitative details necessary to substantiate the performance claim. In the revised manuscript we will expand this section to include precision, recall, and F1 scores; the exact dataset composition (number of contracts and labeled vulnerabilities); the specific baseline tools used for comparison; and any statistical tests performed. These additions will make the empirical results fully verifiable.
Revision: yes
Referee: [Knowledge Graph Construction] Knowledge Graph Construction (or equivalent methodology section): no schema, entity/relation definitions, or extraction rules are supplied for modeling access-control constructs such as modifier applications, msg.sender checks, or state-variable accesses. Without these, it is impossible to judge whether the KG actually overcomes the stated limitations of AST/CFG/SSA.
Authors: The referee is correct that explicit schema information is missing. We will add a new subsection (or appendix) that defines the entity types (e.g., Function, Modifier, StateVariable, MsgSenderCheck), relation types (e.g., appliesModifier, checksSender, accessesState), and the precise extraction rules applied to Solidity source code. This will allow readers to assess how the knowledge graph captures semantic and control-flow information beyond AST, CFG, or SSA representations.
Revision: yes
Referee: [LLM Query Generation] LLM Query Generation subsection: the paper provides neither a worked example of translating a canonical natural-language pattern (e.g., 'unprotected critical function') into a graph query nor any discussion of query correctness, hallucination mitigation, or false-positive handling. This is load-bearing for the claimed automation advantage.
Authors: We acknowledge that a concrete example and discussion of the query-generation process are absent. In the revision we will insert a worked example that shows the step-by-step translation of a natural-language pattern such as 'unprotected critical function' into an executable graph query. We will also add a paragraph addressing query correctness, prompt-engineering techniques used to reduce hallucinations, and post-processing steps for false-positive mitigation.
Revision: yes
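For readers wondering what the requested worked example might look like, the following Python sketch translates the pattern "unprotected critical function" by hand into a query over a toy triple graph. It is an illustration under stated assumptions, not the paper's query or schema: the `ckg:` prefix, the `isCritical` flag, the toy graph, and both helper functions are hypothetical. The SPARQL text is shown only as the kind of output an LLM might be asked to produce, and the Python function evaluates the same logic without a SPARQL engine.

```python
# Illustrative SPARQL for the pattern "a function that accesses a critical
# state variable without any modifier or msg.sender check".
# The ckg: vocabulary is an assumption, not taken from the paper.
UNPROTECTED_CRITICAL_FUNCTION_SPARQL = """
PREFIX ckg: <http://example.org/ckg#>
SELECT ?fn WHERE {
  ?fn a ckg:Function ;
      ckg:accessesState ?var .
  ?var ckg:isCritical true .
  FILTER NOT EXISTS { ?fn ckg:appliesModifier ?m }
  FILTER NOT EXISTS { ?fn ckg:checksSender ?c }
}
"""

# Toy graph as (subject, predicate, object) triples, written by hand.
GRAPH = {
    ("setOwner", "type", "Function"),
    ("setOwner", "accessesState", "owner"),
    ("withdraw", "type", "Function"),
    ("withdraw", "accessesState", "balance"),
    ("withdraw", "appliesModifier", "onlyOwner"),
    ("transferOwnership", "type", "Function"),
    ("transferOwnership", "accessesState", "owner"),
    ("transferOwnership", "checksSender", "msg.sender"),
    ("deposit", "type", "Function"),
    ("deposit", "accessesState", "balance"),
    ("owner", "isCritical", "true"),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def unprotected_critical_functions(graph):
    """Evaluate the same pattern as the SPARQL text above, in plain Python."""
    functions = {s for s, p, o in graph if p == "type" and o == "Function"}
    flagged = []
    for fn in sorted(functions):
        touches_critical = any(
            objects(graph, var, "isCritical")
            for var in objects(graph, fn, "accessesState")
        )
        # Simplification: any modifier or sender check counts as a guard,
        # even if it does not actually restrict the caller.
        guarded = (objects(graph, fn, "appliesModifier")
                   or objects(graph, fn, "checksSender"))
        if touches_critical and not guarded:
            flagged.append(fn)
    return flagged


findings = unprotected_critical_functions(GRAPH)
```

On this toy graph only `setOwner` is flagged. The guard simplification noted in the code is exactly the sort of choice that affects false positives and missed cases, which is why the referee asks for a discussion of query correctness.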
Circularity Check
No circularity: framework is a new construction with experimental claims
The paper describes CKG-LLM as a novel framework that builds knowledge graphs from smart contract code and uses LLMs to generate queries for access-control vulnerability detection. No equations, fitted parameters, or quantitative predictions appear in the abstract or description. The central performance claim rests on experimental evaluation against existing tools rather than any self-referential derivation or self-citation chain that reduces the result to its inputs by construction. The approach is presented as an independent construction that overcomes limitations of AST/CFG/SSA representations, with no load-bearing uniqueness theorems or ansatzes imported from prior author work.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Knowledge graphs offer a structured representation of entities and relations that enables richer intermediate abstractions of contract code than traditional representations.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "CKG employs a two-layered design... ontology O=(C,A,P) ... instance layer I=(I,L,P) ... Slither IR ... SPARQL queries"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.