A survey on model extraction attacks and defenses for large language models

Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong · 2025 · arXiv 2506.22521

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Safety, Security, and Cognitive Risks in World Models

cs.CR · 2026-04-01 · unverdicted · novelty 6.0

World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and DreamerV3.

GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

cs.CR · 2026-05-12

citing papers explorer

Showing 2 of 2 citing papers.

Safety, Security, and Cognitive Risks in World Models

cs.CR · 2026-04-01 · unverdicted · none · ref 63

World models enable efficient AI planning but create risks from adversarial corruption, goal misgeneralization, and human bias, demonstrated via attacks that amplify errors and reduce rewards on models like RSSM and DreamerV3.

GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

cs.CR · 2026-05-12 · unreviewed · ref 40
