Prompt-Unknown Promotion Attacks against LLM-based Sequential Recommender Systems
Pith reviewed 2026-05-08 05:18 UTC · model grok-4.3
The pith
Attackers can boost unpopular items in LLM-based recommender systems without access to prompts or models
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the full black-box setting, where both the system prompt and the victim model are unknown, the PUDA framework effectively promotes target items in LLM-SRSs: an LLM-based evolutionary strategy infers the system prompt for surrogate training, after which the attack adversarially revises target item texts and injects surrogate-generated poisoning sequences.
What carries the argument
The Prompt-Unknown Dual-poisoning Attack (PUDA) framework, which uses evolutionary prompt inference to train a surrogate model and then applies dual poisoning through adversarial text revisions and surrogate-generated poisoning sequences.
Load-bearing premise
The LLM-based evolutionary refinement can infer the system prompt accurately enough that the resulting surrogate model mimics the victim model's behavior under the black-box setting.
What would settle it
Running the attack on a system with a fixed, secret prompt and comparing the surrogate's recommendations to the actual victim's to see if they match closely enough for the poisoning to transfer successfully.
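No such fidelity check is reported directly; a minimal sketch of one in Python, using top-k overlap and Kendall-tau rank correlation (the ranked lists, and in practice the query interfaces behind them, are illustrative placeholders, not artifacts from the paper):

```python
# Sketch of a surrogate-fidelity check. In practice `victim` would come from
# black-box queries to the deployed recommender and `surrogate` from the
# locally trained mimic; the item IDs below are placeholders.
from scipy.stats import kendalltau

def topk_overlap(victim, surrogate, k=10):
    """Fraction of the victim's top-k items the surrogate also ranks top-k."""
    return len(set(victim[:k]) & set(surrogate[:k])) / k

def rank_agreement(victim, surrogate):
    """Kendall tau over the items both models rank (1.0 = identical order)."""
    shared = [item for item in victim if item in set(surrogate)]
    if len(shared) < 2:
        return float("nan")  # correlation is undefined on fewer than 2 items
    tau, _ = kendalltau(range(len(shared)),
                        [surrogate.index(item) for item in shared])
    return tau

# Illustrative top-10 lists for one held-out user sequence.
victim    = ["i12", "i7", "i3", "i44", "i9", "i2", "i31", "i5", "i18", "i26"]
surrogate = ["i12", "i3", "i7", "i9", "i44", "i31", "i2", "i18", "i5", "i61"]

print(f"top-10 overlap: {topk_overlap(victim, surrogate):.2f}")  # 0.90 here
print(f"Kendall tau:    {rank_agreement(victim, surrogate):.2f}")
```

High overlap and rank correlation averaged over held-out sequences would support the load-bearing premise; low values would suggest any attack gains come from something other than surrogate fidelity.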
Original abstract
Large language model-powered sequential recommender systems (LLM-SRSs) have recently demonstrated remarkable performance, enabling recommendations through prompt-driven inference over user interaction sequences. However, this paradigm also introduces new security vulnerabilities, particularly text-level manipulations, rendering them appealing targets for promotion attacks that purposely boost the ranking of specific target items. Although such security risks have been receiving increasing attention, existing studies typically rely on an unrealistic assumption of access to either the victim model or prompt to unveil attack mechanisms. In this work, we investigate the item promotion attack in LLM-SRSs under a more realistic setting where both the system prompt and victim model are unknown to the attacker, and propose a Prompt-Unknown Dual-poisoning Attack (PUDA) framework. To simulate attacks under this full black-box setting, we introduce an LLM-based evolutionary refinement strategy that infers discrete system prompts, enabling the training of an effective surrogate model that mimics the behaviors of the victim model. Leveraging the distilled prompt and surrogate model, we devise a promotion attack that adversarially revises target item texts under semantic constraints, which is further complemented by the highly plausible, surrogate-generated poisoning sequences to enable cost-effective target item promotion. Extensive experiments on real-world datasets demonstrate that PUDA consistently outperforms state-of-the-art competitors in boosting the exposure of unpopular target items. Our findings reveal critical security risks in modern LLM-SRSs even when both prompts and models are protected, and highlight the need for more robust defensive means.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes the Prompt-Unknown Dual-poisoning Attack (PUDA) framework for item promotion attacks on LLM-based sequential recommender systems (LLM-SRSs) under a full black-box setting where both the victim model and its system prompt are unknown. It introduces an LLM-driven evolutionary refinement strategy to infer the discrete system prompt, trains a surrogate model to mimic the victim, and then executes attacks via semantically constrained adversarial revision of target item texts combined with surrogate-generated poisoning sequences. Experiments on real-world datasets show PUDA outperforming state-of-the-art competitors in increasing the exposure of unpopular target items.
Significance. If the surrogate-fidelity claims hold, the work is significant for exposing practical security vulnerabilities in emerging prompt-driven LLM-SRSs even when prompts and models are protected, extending the prior attack literature with a dual-poisoning approach. The evolutionary prompt inference and real-world dataset experiments provide a concrete demonstration of attack feasibility and underscore the need for robust defenses.
major comments (2)
- §3.2 (surrogate training and evolutionary refinement): No direct quantitative validation is provided for the accuracy of the inferred system prompt or for the behavioral fidelity of the surrogate to the victim model (e.g., no prompt-reconstruction accuracy, Kendall-tau rank correlation, or top-k overlap on held-out sequences). This is load-bearing for the central claim: without evidence that the surrogate faithfully approximates victim behavior under the black-box setting, the attack gains and baseline superiority reported in §5 cannot be attributed to the prompt-unknown mechanism.
- §5 (experiments): The performance comparisons report no statistical significance tests, no standard deviations across runs, and no ablation studies isolating the contribution of the evolutionary prompt inference from that of the dual-poisoning components, making it difficult to confirm the robustness of the outperformance claims on the real-world datasets (a sketch of such an ablation grid appears after this list).
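The requested ablation is straightforward to organize once each PUDA component can be toggled. A hedged sketch, where `run_attack` and `exposure_at_k` are hypothetical stand-ins for the authors' evaluation harness and the component flags follow the paper's description of PUDA:

```python
# Ablation grid: disable one PUDA component at a time, repeat over seeds,
# and report mean ± std of target-item exposure. `run_attack` and
# `exposure_at_k` are assumed interfaces, not the authors' code.
import statistics

variants = {
    "full PUDA":            dict(prompt_inference=True,  text_revision=True,  seq_poisoning=True),
    "w/o prompt inference": dict(prompt_inference=False, text_revision=True,  seq_poisoning=True),
    "w/o text revision":    dict(prompt_inference=True,  text_revision=False, seq_poisoning=True),
    "w/o seq poisoning":    dict(prompt_inference=True,  text_revision=True,  seq_poisoning=False),
}

for name, config in variants.items():
    scores = [exposure_at_k(run_attack(seed=s, **config), k=10)
              for s in range(5)]  # five independent runs per variant
    print(f"{name}: {statistics.mean(scores):.3f} ± {statistics.stdev(scores):.3f}")
```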
minor comments (2)
- Abstract: The description of evaluation metrics and datasets could be expanded slightly for immediate context without lengthening the paragraph.
- §3.2 (notation): The evolutionary refinement process would benefit from an explicit pseudocode box clarifying the mutation and selection steps (see the sketch after this list).
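To make the request concrete, a minimal sketch of the mutation/selection loop such a pseudocode box might formalize, under assumed interfaces: `llm_mutate` asks an LLM to rewrite a candidate prompt, and `fitness` scores how well a surrogate trained under the candidate reproduces the victim's observed outputs (both names are illustrative, not the authors'):

```python
# Evolutionary prompt refinement, sketched. Selection keeps the candidates
# whose induced surrogate best matches victim behavior on probe sequences;
# mutation asks an LLM for semantically similar rewrites of survivors.
import random

def evolve_prompt(seed_prompts, generations=10, population=8, elite=2):
    pool = list(seed_prompts)
    for _ in range(generations):
        pool.sort(key=fitness, reverse=True)              # selection
        survivors = pool[:elite]
        children = [llm_mutate(random.choice(survivors))  # mutation
                    for _ in range(population - elite)]
        pool = survivors + children
    return max(pool, key=fitness)
```

The selection pressure here is behavioral agreement with the victim, which is what ties the inferred prompt back to the load-bearing premise flagged in the major comments.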
Simulated Author's Rebuttal
We thank the referee for the insightful comments on our work. We address each of the major comments point-by-point below. We agree with the referee that additional quantitative validations and statistical analyses will strengthen the manuscript and will incorporate the suggested improvements in the revised version.
Point-by-point responses
- Referee: §3.2 (surrogate training and evolutionary refinement): No direct quantitative validation is provided for the accuracy of the inferred system prompt or for the behavioral fidelity of the surrogate to the victim model (e.g., no prompt-reconstruction accuracy, Kendall-tau rank correlation, or top-k overlap on held-out sequences). This is load-bearing for the central claim: without evidence that the surrogate faithfully approximates victim behavior under the black-box setting, the attack gains and baseline superiority reported in §5 cannot be attributed to the prompt-unknown mechanism.
Authors: We appreciate the referee pointing out the need for direct validation of the prompt inference and surrogate fidelity. Although the superior attack performance in the full black-box setting provides indirect support for the effectiveness of our evolutionary refinement strategy and surrogate model, we concur that explicit metrics would more rigorously substantiate the claims. In the revised manuscript, we will add quantitative evaluations, including prompt reconstruction accuracy in settings where ground-truth prompts can be compared, Kendall-tau rank correlation for ranking consistency on held-out user sequences, and top-k overlap metrics between the surrogate and victim model predictions. These additions will directly address the concern regarding the load-bearing nature of the surrogate approximation. revision: yes
- Referee: §5 (experiments): The performance comparisons report no statistical significance tests, no standard deviations across runs, and no ablation studies isolating the contribution of the evolutionary prompt inference from that of the dual-poisoning components, making it difficult to confirm the robustness of the outperformance claims on the real-world datasets.
Authors: We thank the referee for this observation. The current experimental section indeed omits these elements, which are important for assessing the reliability of the results. We will revise the experiments section to include: (1) standard deviations across multiple independent runs, (2) statistical significance tests (such as t-tests or Wilcoxon tests) to validate the performance improvements over baselines, and (3) ablation studies that separately evaluate the impact of the evolutionary prompt inference and the dual-poisoning (adversarial revision + surrogate poisoning sequences) components. This will provide a clearer picture of each part's contribution and the robustness of PUDA. revision: yes
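For completeness, the promised significance tests are easy to run once per-run scores are logged; a minimal sketch with placeholder numbers (not results from the paper):

```python
# Paired significance tests between PUDA and the strongest baseline.
# The score lists are illustrative placeholders; in practice they would be
# per-run (or per-dataset) exposure metrics logged from the experiments.
from scipy.stats import ttest_rel, wilcoxon

puda_scores     = [0.31, 0.28, 0.35, 0.30, 0.33]
baseline_scores = [0.22, 0.25, 0.27, 0.21, 0.26]

t_stat, t_p = ttest_rel(puda_scores, baseline_scores)
w_stat, w_p = wilcoxon(puda_scores, baseline_scores)
print(f"paired t-test:        p = {t_p:.4f}")
print(f"Wilcoxon signed-rank: p = {w_p:.4f}")
```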
Circularity Check
No circularity detected in PUDA derivation or claims
full rationale
The paper's core derivation introduces an LLM-based evolutionary refinement to infer unknown prompts, trains an independent surrogate, and generates adversarial revisions plus poisoning sequences. These steps rely on external LLM calls and are evaluated via direct experiments on real-world datasets measuring target item exposure gains against baselines. No equation or claim reduces a result to its own inputs by construction, no fitted parameter is relabeled as a prediction, and no load-bearing premise collapses to a self-citation chain. The framework's claims are grounded in external benchmarks rather than in its own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM-based evolutionary refinement can infer unknown discrete system prompts with sufficient accuracy to enable effective surrogate training.
Reference graph
Works this paper leans on
- [1] Keqin Bao, Jizhi Zhang, Yang Zhang, Wenjie Wang, Fuli Feng, and Xiangnan He. 2023. TALLRec: An effective and efficient tuning framework to align large language model with recommendation. In Proceedings of the 17th ACM Conference on Recommender Systems. 1007–1014.
- [2] Keqin Bao, Jizhi Zhang, Yang Zhang, Wenjie Wang, Fuli Feng, and Xiangnan He. 2023. Large language models for recommendation: Progresses and future directions. In Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region. 306–309.
- [3] Robin Burke, Bamshad Mobasher, Runa Bhaumik, and Chad Williams. 2005. Segment-based injection attacks against collaborative filtering recommender systems. In Fifth IEEE International Conference on Data Mining (ICDM'05). IEEE, 4 pp.
- [4] Yuwei Cao, Nikhil Mehta, Xinyang Yi, Raghunandan Hulikal Keshavan, Lukasz Heldt, Lichan Hong, Ed Chi, and Maheswaran Sathiamoorthy. 2024. Aligning large language models with recommendation knowledge. In Findings of the Association for Computational Linguistics: NAACL 2024. 1051–1066.
- [5] Jingfan Chen, Wenqi Fan, Guanghui Zhu, Xiangyu Zhao, Chunfeng Yuan, Qing Li, and Yihua Huang. 2022. Knowledge-enhanced black-box attacks for recommendations. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 108–117.
- [6] Lijian Chen, Wei Yuan, Tong Chen, Guanhua Ye, Nguyen Quoc Viet Hung, and Hongzhi Yin. 2024. Adversarial item promotion on visually-aware recommender systems by guided diffusion. ACM Transactions on Information Systems 42, 6 (2024), 1–26.
- [7] Sukmin Cho, Soyeong Jeong, Jeongyeon Seo, and Jong C. Park. 2023. Discrete prompt optimization via constrained generation for zero-shot re-ranker. In Findings of the Association for Computational Linguistics: ACL 2023. 960–971.
- [8]
- [9] Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, and Sricharan Kumar. 2025. SEE: Strategic exploration and exploitation for cohesive in-context prompt optimization. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 29575–29627.
- [10] Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric Xing, and Zhiting Hu. 2022. RLPrompt: Optimizing discrete text prompts with reinforcement learning. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 3369–3391.
- [11]
- [12] Brian Formento, Chuan-Sheng Foo, Luu Anh Tuan, and See Kiong Ng. 2023. Using punctuation as an adversarial attack on deep learning-based NLP systems: An empirical study. In Findings of the Association for Computational Linguistics: EACL 2023. 1–34.
- [13] Chongming Gao, Ruijun Chen, Shuai Yuan, Kexin Huang, Yuanqing Yu, and Xiangnan He. 2025. SPRec: Self-play to debias LLM-based recommendation. In Proceedings of the ACM on Web Conference 2025. 5075–5084.
- [14] Ji Gao, Jack Lanchantin, Mary Lou Soffa, and Yanjun Qi. 2018. Black-box generation of adversarial text sequences to evade deep learning classifiers. In 2018 IEEE Security and Privacy Workshops (SPW). IEEE, 50–56.
- [15] Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, and Yongfeng Zhang. 2022. Recommendation as language processing (RLP): A unified pretrain, personalized prompt & predict paradigm (P5). In Proceedings of the 16th ACM Conference on Recommender Systems. 299–315.
- [16]
- [17] Jesse Harte, Wouter Zorgdrager, Panos Louridas, Asterios Katsifodimos, Dietmar Jannach, and Marios Fragkoulis. 2023. Leveraging large language models for sequential recommendation. In Proceedings of the 17th ACM Conference on Recommender Systems. 1096–1102.
- [18]
- [19] Di Jin, Zhijing Jin, Joey Tianyi Zhou, and Peter Szolovits. 2020. Is BERT really robust? A strong baseline for natural language attack on text classification and entailment. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34. 8018–8025.
- [20] Wang-Cheng Kang and Julian McAuley. 2018. Self-attentive sequential recommendation. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 197–206.
- [21] Parneet Kaur and Shivani Goel. 2016. Shilling attack models in recommender system. In 2016 International Conference on Inventive Computation Technologies (ICICT), Vol. 2. IEEE, 1–5.
- [22] Jiayun Li, Wen Hua, Fengmei Jin, and Xue Li. 2025. HTEA: Heterogeneity-aware embedding learning for temporal entity alignment. In Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining. 982–990.
- [23] Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, and Xipeng Qiu. 2020. BERT-ATTACK: Adversarial attack against BERT using BERT. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 6193–6202.
- [24] Lei Li, Yongfeng Zhang, and Li Chen. 2023. Prompt distillation for efficient LLM-based recommendation. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management. 1348–1357.
- [25] Thanh Toan Nguyen, Nguyen Quoc Viet Hung, Thanh Tam Nguyen, Thanh Trung Huynh, Thanh Thi Nguyen, Matthias Weidlich, and Hongzhi Yin. 2024. Manipulating recommender systems: A survey of poisoning attacks and countermeasures. Comput. Surveys 57, 1 (2024), 1–39.
- [26] Liang-bo Ning, Shijie Wang, Wenqi Fan, Qing Li, Xin Xu, Hao Chen, and Feiran Huang. 2024. CheatAgent: Attacking LLM-empowered recommender systems via LLM agent. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2284–2295.
- [27] Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems 35 (2022), 27730–27744.
- [28] Yunke Qu, Tong Chen, Xiangyu Zhao, Lizhen Cui, Kai Zheng, and Hongzhi Yin. 2023. Continuous input embedding size search for recommender systems. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 708–717.
- [29] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1, 8 (2019), 9.
- [30]
- [31] Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using siamese BERT-networks. arXiv preprint arXiv:1908.10084 (2019).
- [32] Junshuai Song, Zhao Li, Zehong Hu, Yucheng Wu, Zhenpeng Li, Jian Li, and Jun Gao. 2020. PoisonRec: An adaptive data poisoning framework for attacking black-box recommender systems. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 157–168.
- [33] Yanling Wang, Yuchen Liu, Qian Wang, Cong Wang, and Chenliang Li. 2023. Poisoning self-supervised learning based sequential recommendations. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 300–310.
- [34] Yihao Wang, Jiajie Su, Chaochao Chen, Meng Han, Chi Zhang, and Jun Wang. 2025. Sim4Rec: Data-free model extraction attack on sequential recommendation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39. 12766–12774.
- [35]
- [36] Zongwei Wang, Min Gao, Junliang Yu, Xinyi Gao, Quoc Viet Hung Nguyen, Shazia Sadiq, and Hongzhi Yin. 2025. ID-free not risk-free: LLM-powered agents unveil risks in ID-free recommender systems. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1902–1911.
- [37] Zongwei Wang, Junliang Yu, Min Gao, Hongzhi Yin, Bin Cui, and Shazia Sadiq. 2024. Unveiling vulnerabilities of contrastive recommender systems to poisoning attacks. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3311–3322.
- [38]
- [39] Likang Wu, Zhi Zheng, Zhaopeng Qiu, Hao Wang, Hongchao Gu, Tingjia Shen, Chuan Qin, Chen Zhu, Hengshu Zhu, Qi Liu, et al. 2024. A survey on large language models for recommendation. World Wide Web 27, 5 (2024), 60.
- [40] Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Jiandong Zhang, Bolin Ding, and Bin Cui. 2022. Contrastive learning for sequential recommendation. In 2022 IEEE 38th International Conference on Data Engineering (ICDE). IEEE, 1259–1273.
- [41] Jingwei Yi, Fangzhao Wu, Bin Zhu, Jing Yao, Zhulin Tao, Guangzhong Sun, and Xing Xie. 2023. UA-FedRec: Untargeted attack on federated news recommendation. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 5428–5438.
- [42] Hongzhi Yin, Liang Qu, Tong Chen, Wei Yuan, Ruiqi Zheng, Jing Long, Xin Xia, Yuhui Shi, and Chengqi Zhang. 2025. On-device recommender systems: A comprehensive survey. Data Science and Engineering (2025), 1–30.
- [43] Wei Yuan, Quoc Viet Hung Nguyen, Tieke He, Liang Chen, and Hongzhi Yin. 2023. Manipulating federated recommender systems: Poisoning with synthetic users and its countermeasures. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1690–1699.
- [44]
- [45] Wei Yuan, Chaoqun Yang, Liang Qu, Guanhua Ye, Quoc Viet Hung Nguyen, and Hongzhi Yin. 2025. Robust federated contrastive recommender system against targeted model poisoning attack. Science China Information Sciences 68, 4 (2025), 140103.
- [46] Zhenrui Yue, Zhankui He, Huimin Zeng, and Julian McAuley. 2021. Black-box attacks on sequential recommenders via data-free model extraction. In Proceedings of the 15th ACM Conference on Recommender Systems. 44–54.
- [47] Hengtong Zhang, Yaliang Li, Bolin Ding, and Jing Gao. 2020. Practical data poisoning attack against next-item recommendation. In Proceedings of The Web Conference 2020. 2458–2464.
- [48] Jinghao Zhang, Yuting Liu, Qiang Liu, Shu Wu, Guibing Guo, and Liang Wang.
- [49]
- [50]
- [51] Minxing Zhang, Zhaochun Ren, Zihan Wang, Pengjie Ren, Zhunmin Chen, Pengfei Hu, and Yang Zhang. 2021. Membership inference attacks against recommender systems. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 864–879.
- [52] Shijie Zhang, Hongzhi Yin, Tong Chen, Zi Huang, Quoc Viet Hung Nguyen, and Lizhen Cui. 2022. PipAttack: Poisoning federated recommender systems for manipulating item promotion. In Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining. 1415–1423.
- [53] Yi Zhang and Yiwen Zhang. 2025. MixRec: Individual and collective mixing empowers data augmentation for recommender systems. In Proceedings of the ACM on Web Conference 2025. 2198–2208.
- [54] Yi Zhang, Yiwen Zhang, Dengcheng Yan, Shuiguang Deng, and Yun Yang. 2023. Revisiting graph-based recommender systems from the perspective of variational auto-encoder. ACM Transactions on Information Systems 41, 3 (2023), 1–28.
- [55] Yuchuan Zhao, Tong Chen, Junliang Yu, Kai Zheng, Lizhen Cui, and Hongzhi Yin. 2025. Diversity-aware dual-promotion poisoning attack on sequential recommendation. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1634–1644.
- [56] Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, and Hui Xiong. 2024. Harnessing large language models for text-rich sequential recommendation. In Proceedings of the ACM Web Conference 2024. 3207–3216.
- [57] Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. 2022. Large language models are human-level prompt engineers. In The Eleventh International Conference on Learning Representations.
- [58] Zhihao Zhu, Chenwang Wu, Rui Fan, Defu Lian, and Enhong Chen. 2023. Membership inference attacks against sequential recommender systems. In Proceedings of the ACM Web Conference 2023. 1208–1219.