Fairness Attacks on Recommender Systems
Pith reviewed 2026-06-30 08:09 UTC · model grok-4.3
The pith
A reinforcement learning attack injects structured fake user profiles to increase unfairness in recommender systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper proposes and evaluates a structure-aware reinforcement learning-based fairness attack that uses a graph-based structure encoder to capture dependencies between fake and original interactions, a recurrent neural network to model sequential injection order, and jointly trained item and gender selection policies to decide the next fake item and the sensitive attribute of each fake user profile; experiments confirm this method increases unfairness on four types of target models across two real-world datasets.
What carries the argument
Structure-aware reinforcement learning fairness attack using a graph encoder for interaction dependencies, an RNN for sequence modeling, and joint item/gender selection policies.
If this is right
- The attack succeeds against multiple distinct recommendation model architectures.
- Performance holds on two separate real-world datasets with genuine user-item records.
- Joint optimization of item choice and gender attribute policies improves attack strength.
- The method remains effective even when the target system incorporates some fairness-aware training.
Where Pith is reading between the lines
- System operators may need detection methods that look for graph-structured patterns in new user profiles rather than isolated anomalies.
- Fairness metrics themselves become attack surfaces and might require regularization that accounts for coordinated injection.
- Similar structured attacks could be adapted to other sequential decision systems that rely on user-generated data.
- Defensive retraining on synthetic adversarial profiles generated by the same encoder-RNN pipeline could be tested as a countermeasure.
Load-bearing premise
The target recommender system treats a sufficient number of injected fake user-item interactions identically to real data and the attacker can directly influence the fairness metric under evaluation.
What would settle it
Apply the attack to a live recommender system, inject the generated fake profiles, then measure whether the chosen fairness metric (such as demographic parity across gender groups) increases by a statistically significant amount compared with an uninjected baseline.
Figures
read the original abstract
The unfairness of recommender systems has become a topic of concern due to its significant social and ethical implications. Although existing works have shown the effectiveness of attacks on the performance of recommender systems (e.g., promotion and demotion attack), the study of fairness attacks on recommender systems remains largely under-explored. To this end, we propose a novel structure-aware reinforcement learning-based fairness attack method designed to exacerbate the unfairness of target recommender systems. Specifically, we first employ a graph-based structure encoder to model the structural dependencies among the generated fake user-item interactions and the original user-item interactions. Then, we model the sequential dependency of the injected fake items using a recurrent neural network. Based on the learned structure-aware and sequence-aware representations of the fake user and item, the item selection policy attentively decides the next injected fake item. Since the target recommender system may employ fairness-aware training and leverage the user's sensitive attribute information, such as gender, we further designed a gender selection policy to decide the gender of the entire fake user profile. Both the item selection and gender selection policy are learned jointly in our proposed method. Finally, experimental results on four types of target recommendation models and two real-world datasets demonstrate the effectiveness of the proposed attack method in exacerbating the unfairness of recommender systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a structure-aware reinforcement learning method to perform fairness attacks on recommender systems by injecting fake user-item interactions. It uses a graph encoder to capture structural dependencies between fake and real interactions, an RNN for sequential item dependencies, and jointly learned policies for item selection and gender (sensitive attribute) selection of fake profiles. The central claim is that experiments on four types of target models and two real-world datasets demonstrate the method's effectiveness at exacerbating unfairness, even when targets employ fairness-aware training.
Significance. If the experimental claims hold with proper controls, the work would be significant for exposing vulnerabilities in fairness mechanisms of production recommenders, an area with clear ethical stakes. The combination of graph structure encoding, sequential modeling, and explicit gender policy is a technically coherent extension of existing attack literature, but the absence of reported metrics, baselines, or statistical details in the provided text limits any assessment of practical impact or novelty.
major comments (2)
- [Abstract] Abstract: the central claim rests on 'experimental results on four types of target recommendation models and two real-world datasets' demonstrating effectiveness, yet no quantitative results, fairness metrics, baselines, statistical tests, or effect sizes are supplied, rendering the claim unverifiable from the manuscript text.
- [Method] Threat model / method description: the attack's success presupposes that injected fake interactions are processed identically to genuine data and that the chosen fairness metric is directly shiftable via the joint item/gender policy; no details are given on attacker knowledge level, injection budget, or resistance to detection/fake filtering, which are load-bearing for the reported exacerbation.
minor comments (1)
- [Abstract] Abstract: the description of the 'structure-aware and sequence-aware representations' and the joint policy learning lacks any reference to specific loss functions, attention mechanisms, or training objectives.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and commit to revisions that strengthen the manuscript's clarity and completeness.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim rests on 'experimental results on four types of target recommendation models and two real-world datasets' demonstrating effectiveness, yet no quantitative results, fairness metrics, baselines, statistical tests, or effect sizes are supplied, rendering the claim unverifiable from the manuscript text.
Authors: The abstract provides a high-level summary without specific numbers for brevity. The full manuscript includes an Experiments section with quantitative results, including fairness metrics (e.g., changes in demographic parity and equalized odds), comparisons against baselines such as random and heuristic attacks, and statistical significance testing across the four target models and two datasets. To make the central claim immediately verifiable, we will revise the abstract to incorporate key quantitative findings and effect sizes. revision: yes
-
Referee: [Method] Threat model / method description: the attack's success presupposes that injected fake interactions are processed identically to genuine data and that the chosen fairness metric is directly shiftable via the joint item/gender policy; no details are given on attacker knowledge level, injection budget, or resistance to detection/fake filtering, which are load-bearing for the reported exacerbation.
Authors: We agree that the threat model requires explicit elaboration. In the revised manuscript we will add a dedicated subsection specifying the attacker's knowledge level (black-box access to recommendations with partial knowledge of the training data distribution), the injection budget used in experiments (number of fake profiles and interactions per profile), the assumption that injected interactions are processed identically to genuine ones, and a discussion of the chosen fairness metric's sensitivity to the joint policy. We will also address potential detection risks and fake-filtering countermeasures to clarify the attack's practical scope. revision: yes
Circularity Check
No circularity: empirical attack method with external validation
full rationale
The paper proposes an RL-based fairness attack (graph encoder + RNN + joint item/gender policies) and validates it via experiments on four model types and two real-world datasets. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. The central claim reduces to observable outcomes on external data and models rather than any input-by-construction equivalence, satisfying the self-contained empirical criterion for score 0.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Deep neural networks for youtube recommendations,
P. Covington, J. Adams, and E. Sargin, “Deep neural networks for youtube recommendations,” inProceedings of the 10th ACM conference on recommender systems, 2016, pp. 191–198
2016
-
[2]
Recommendation systems for education: Systematic review,
M. C. Urdaneta-Ponte, A. Mendez-Zorrilla, and I. Oleagordia-Ruiz, “Recommendation systems for education: Systematic review,”Electronics, vol. 10, no. 14, p. 1611, 2021
2021
-
[3]
Personalized job recom- mendation system at linkedin: Practical challenges and lessons learned,
K. Kenthapadi, B. Le, and G. Venkataraman, “Personalized job recom- mendation system at linkedin: Practical challenges and lessons learned,” inProceedings of the eleventh ACM conference on recommender systems, 2017, pp. 346–347
2017
-
[4]
Beyond parity: Fairness objectives for collab- orative filtering,
S. Yao and B. Huang, “Beyond parity: Fairness objectives for collab- orative filtering,”Advances in neural information processing systems, vol. 30, 2017
2017
-
[5]
Discriminated by an algorithm: a systematic review of discrimination and fairness by algorithmic decision- making in the context of hr recruitment and hr development,
A. Köchling and M. C. Wehner, “Discriminated by an algorithm: a systematic review of discrimination and fairness by algorithmic decision- making in the context of hr recruitment and hr development,”Business Research, vol. 13, no. 3, pp. 795–848, 2020
2020
-
[6]
Contemporary housing discrimination: Facebook, targeted advertising, and the fair housing act,
C. N. Spinks, “Contemporary housing discrimination: Facebook, targeted advertising, and the fair housing act,”Hous. L. Rev., vol. 57, p. 925, 2019
2019
-
[7]
Hud sues facebook over housing discrimination and says the company’s algorithms have made the problem worse,
A. Tobin, “Hud sues facebook over housing discrimination and says the company’s algorithms have made the problem worse,”ProPublica (March 28, 2019). Available at https://www. propublica. org/article/hud-sues- facebook-housing-discrimination-advertising-algorithms (last accessed April 29, 2019), 2019
2019
-
[8]
Fairness in recommendation ranking through pairwise comparisons,
A. Beutel, J. Chen, T. Doshi, H. Qian, L. Wei, Y . Wu, L. Heldt, Z. Zhao, L. Hong, E. H. Chiet al., “Fairness in recommendation ranking through pairwise comparisons,” inProceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 2212–2220
2019
-
[9]
Compositional fairness constraints for graph embeddings,
A. Bose and W. Hamilton, “Compositional fairness constraints for graph embeddings,” inInternational Conference on Machine Learning. PMLR, 2019, pp. 715–724
2019
-
[10]
A survey on the fairness of recommender systems,
Y . Wang, W. Ma, M. Zhang, Y . Liu, and S. Ma, “A survey on the fairness of recommender systems,”ACM Transactions on Information Systems, vol. 41, no. 3, pp. 1–43, 2023
2023
-
[11]
All the cool kids, how do they fit in?: Popularity and demographic biases in recommender evaluation and effectiveness,
M. D. Ekstrand, M. Tian, I. M. Azpiazu, J. D. Ekstrand, O. Anuyah, D. McNeill, and M. S. Pera, “All the cool kids, how do they fit in?: Popularity and demographic biases in recommender evaluation and effectiveness,” inConference on fairness, accountability and transparency. PMLR, 2018, pp. 172–186
2018
-
[12]
Revisiting adversarially learned injection attacks against recommender systems,
J. Tang, H. Wen, and K. Wang, “Revisiting adversarially learned injection attacks against recommender systems,” inProceedings of the 14th ACM Conference on Recommender Systems, 2020, pp. 318–327
2020
-
[13]
Fake co-visitation injection attacks to recommender systems
G. Yang, N. Z. Gong, and Y . Cai, “Fake co-visitation injection attacks to recommender systems.” inNDSS, 2017
2017
-
[14]
Network and cybersecurity applications of defense in adversarial attacks: A state-of-the-art using machine learning and deep learning methods,
Y . L. Khaleel, M. A. Habeeb, A. Albahri, T. Al-Quraishi, O. Albahri, and A. Alamoodi, “Network and cybersecurity applications of defense in adversarial attacks: A state-of-the-art using machine learning and deep learning methods,”Journal of Intelligent Systems, vol. 33, no. 1, p. 20240153, 2024
2024
-
[15]
R. S. Sutton and A. G. Barto,Reinforcement learning: An introduction. MIT press, 2018
2018
-
[16]
Mastering the game of go with deep neural networks and tree search,
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V . Panneershelvam, M. Lanctotet al., “Mastering the game of go with deep neural networks and tree search,”nature, vol. 529, no. 7587, pp. 484–489, 2016
2016
-
[17]
Poisonrec: an adaptive data poisoning framework for attacking black-box recom- mender systems,
J. Song, Z. Li, Z. Hu, Y . Wu, Z. Li, J. Li, and J. Gao, “Poisonrec: an adaptive data poisoning framework for attacking black-box recom- mender systems,” in2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 2020, pp. 157–168
2020
-
[18]
Practical data poisoning 12 attack against next-item recommendation,
H. Zhang, Y . Li, B. Ding, and J. Gao, “Practical data poisoning 12 attack against next-item recommendation,” inProceedings of The Web Conference 2020, 2020, pp. 2458–2464
2020
-
[19]
Triple adversarial learning for influence based poisoning attack in recommender systems,
C. Wu, D. Lian, Y . Ge, Z. Zhu, and E. Chen, “Triple adversarial learning for influence based poisoning attack in recommender systems,” inProceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 1830–1840
2021
-
[20]
Shilling attack detection—a new approach for a trustworthy recommender system,
J.-S. Lee and D. Zhu, “Shilling attack detection—a new approach for a trustworthy recommender system,”INFORMS Journal on Computing, vol. 24, no. 1, pp. 117–131, 2012
2012
-
[21]
Data poisoning attacks on factorization-based collaborative filtering,
B. Li, Y . Wang, A. Singh, and Y . V orobeychik, “Data poisoning attacks on factorization-based collaborative filtering,”Advances in neural information processing systems, vol. 29, 2016
2016
-
[22]
Adversarial attacks on an obliv- ious recommender,
K. Christakopoulou and A. Banerjee, “Adversarial attacks on an obliv- ious recommender,” inProceedings of the 13th ACM Conference on Recommender Systems, 2019, pp. 322–330
2019
-
[23]
Influence function based data poisoning attacks to top-n recommender systems,
M. Fang, N. Z. Gong, and J. Liu, “Influence function based data poisoning attacks to top-n recommender systems,” inProceedings of The Web Conference 2020, 2020, pp. 3019–3025
2020
-
[24]
Poisoning attacks to graph- based recommender systems,
M. Fang, G. Yang, N. Z. Gong, and J. Liu, “Poisoning attacks to graph- based recommender systems,” inProceedings of the 34th annual computer security applications conference, 2018, pp. 381–392
2018
-
[25]
Security of recommender system: Adversarial attack, vulnerability estimation and mitigation practice,
Y . Wang and Y . Ge, “Security of recommender system: Adversarial attack, vulnerability estimation and mitigation practice,” inINFORMS Workshop on Data Science, 2024
2024
-
[26]
Data poisoning attack against recommender system using incomplete and perturbed data,
H. Zhang, C. Tian, Y . Li, L. Su, N. Yang, W. X. Zhao, and J. Gao, “Data poisoning attack against recommender system using incomplete and perturbed data,” inProceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2021, pp. 2154–2164
2021
-
[27]
Preventing shilling attacks in online recommender systems,
P.-A. Chirita, W. Nejdl, and C. Zamfir, “Preventing shilling attacks in online recommender systems,” inProceedings of the 7th annual ACM international workshop on Web information and data management, 2005, pp. 67–74
2005
-
[28]
Attacking recommender systems with augmented user profiles,
C. Lin, S. Chen, H. Li, Y . Xiao, L. Li, and Q. Yang, “Attacking recommender systems with augmented user profiles,” inProceedings of the 29th ACM international conference on information & knowledge management, 2020, pp. 855–864
2020
-
[29]
Exacerbating algorithmic bias through fairness attacks,
N. Mehrabi, M. Naveed, F. Morstatter, and A. Galstyan, “Exacerbating algorithmic bias through fairness attacks,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 35, no. 10, 2021, pp. 8930– 8938
2021
-
[30]
Poisoning attacks on algorithmic fairness,
D. Solans, B. Biggio, and C. Castillo, “Poisoning attacks on algorithmic fairness,” inJoint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2020, pp. 162–177
2020
-
[31]
Understanding black-box predictions via influence functions,
P. W. Koh and P. Liang, “Understanding black-box predictions via influence functions,” inInternational conference on machine learning. PMLR, 2017, pp. 1885–1894
2017
-
[32]
Fighting fire with fire: Using antidote data to improve polarization and fairness of recommender systems,
B. Rastegarpanah, K. P. Gummadi, and M. Crovella, “Fighting fire with fire: Using antidote data to improve polarization and fairness of recommender systems,” inProceedings of the twelfth ACM international conference on web search and data mining, 2019, pp. 231–239
2019
-
[33]
User-oriented fairness in recommendation,
Y . Li, H. Chen, Z. Fu, Y . Ge, and Y . Zhang, “User-oriented fairness in recommendation,” inProceedings of the Web Conference 2021, 2021, pp. 624–632
2021
-
[34]
Fairness in recommendation: Foundations, methods and applications,
Y . Li, H. Chen, S. Xu, Y . Ge, J. Tan, S. Liu, and Y . Zhang, “Fairness in recommendation: Foundations, methods and applications,”ACM Transactions on Intelligent Systems and Technology, 2023
2023
-
[35]
Fairness-aware tensor-based recom- mendation,
Z. Zhu, X. Hu, and J. Caverlee, “Fairness-aware tensor-based recom- mendation,” inProceedings of the 27th ACM international conference on information and knowledge management, 2018, pp. 1153–1162
2018
-
[36]
Fairrec: Two-sided fairness for personalized recommendations in two- sided platforms,
G. K. Patro, A. Biswas, N. Ganguly, K. P. Gummadi, and A. Chakraborty, “Fairrec: Two-sided fairness for personalized recommendations in two- sided platforms,” inProceedings of the web conference 2020, 2020, pp. 1194–1204
2020
-
[37]
Why does collaborative filtering work? transaction-based recommendation model validation and selection by analyzing bipartite random graphs,
Z. Huang and D. D. Zeng, “Why does collaborative filtering work? transaction-based recommendation model validation and selection by analyzing bipartite random graphs,”INFORMS Journal on Computing, vol. 23, no. 1, pp. 138–152, 2011
2011
-
[38]
BPR: Bayesian Personalized Ranking from Implicit Feedback
S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, “Bpr: Bayesian personalized ranking from implicit feedback,”arXiv preprint arXiv:1205.2618, 2012
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[39]
Robust estimation of a location parameter,
P. J. Huber, “Robust estimation of a location parameter,” inBreakthroughs in statistics: Methodology and distribution. Springer, 1992, pp. 492–518
1992
-
[40]
Semi-Supervised Classification with Graph Convolutional Networks
T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,”arXiv preprint arXiv:1609.02907, 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[41]
Long short-term memory,
S. Hochreiter and J. Schmidhuber, “Long short-term memory,”Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997
1997
-
[42]
Attention, Learn to Solve Routing Problems!
W. Kool, H. Van Hoof, and M. Welling, “Attention, learn to solve routing problems!”arXiv preprint arXiv:1803.08475, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[43]
Proximal Policy Optimization Algorithms
J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Prox- imal policy optimization algorithms,”arXiv preprint arXiv:1707.06347, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[44]
The movielens datasets: History and context,
F. M. Harper and J. A. Konstan, “The movielens datasets: History and context,”Acm transactions on interactive intelligent systems (tiis), vol. 5, no. 4, pp. 1–19, 2015
2015
-
[45]
Neural col- laborative filtering,
X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T.-S. Chua, “Neural col- laborative filtering,” inProceedings of the 26th international conference on world wide web, 2017, pp. 173–182
2017
-
[46]
Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,
X. He, K. Deng, X. Wang, Y . Li, Y . Zhang, and M. Wang, “Lightgcn: Simplifying and powering graph convolution network for recommenda- tion,” inProceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020, pp. 639–648
2020
-
[47]
Adam: A Method for Stochastic Optimization
D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[48]
Stronger data poisoning attacks break data sanitization defenses,
P. W. Koh, J. Steinhardt, and P. Liang, “Stronger data poisoning attacks break data sanitization defenses,”Machine Learning, pp. 1–47, 2022. Yanan Wangis currently an assistant professor in the Department of Information Systems and Operations Management, College of Business, The University of Texas at Arlington. He received the PhD in Management Informati...
2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.