arxiv: 2604.16591 · v1 · submitted 2026-04-17 · 💻 cs.LG · cs.AI

Randomized Antipodal Search Done Right for Data Pareto Improvement of LLM Unlearning

Pith reviewed 2026-05-10 08:54 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords machine unlearning · LLM unlearning · data Pareto improvement · randomized antipodal search · influence kernel · data retrieval · forgetting retention trade-off · variance reduction

The pith

Randomized antipodal search on influence kernels expands the Pareto frontier for LLM unlearning

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that selecting the right training data points can expand the achievable trade-off between forgetting specific undesirable knowledge and retaining general model capabilities. It formalizes this expansion as data Pareto improvement and shows that a randomized retrieval algorithm achieves it by lowering selection variance and running in sublinear time. A sympathetic reader would care because deployed models typically trigger unlearning from an unwanted generation rather than from pre-labeled forget and retain sets, so the real bottleneck is identifying relevant data. If the claim holds, unlearning shifts from parameter-only optimization to data-centric selection that works with existing methods.

Core claim

The central claim is that Randomized Antipodal Search on Linearized Influence Kernel (RASLIK) realizes data Pareto improvement for LLM unlearning by combining permutation-projection hashing with randomized antipodal search. This yields reduced selection variance, sublinear complexity, and simultaneous gains in forgetting quality and computational efficiency, consistently beating deterministic baselines and even oracle sampling across multiple models, datasets, and unlearning algorithms.

What carries the argument

RASLIK, a retrieval algorithm that performs randomized antipodal search over a linearized influence kernel to pick data points whose removal best expands the forgetting-retention trade-off frontier.
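
As a rough illustration of hashing-based retrieval with an antipodal lookup, the sketch below uses plain sign random projections (SimHash-style). This is a swapped-in, simpler technique, not the paper's permutation-projection hashing, and reading "antipodal" as "also probe the bucket of the negated query" is an assumption made here for illustration. All function names (`make_planes`, `sign_code`, `build_table`, `antipodal_lookup`) are hypothetical.

```python
import random

def make_planes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplanes of dimension dim (illustrative choice)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def sign_code(vec, planes):
    """Tuple of bits; bit i is 1 when vec has a positive dot product with plane i."""
    return tuple(int(sum(p * x for p, x in zip(plane, vec)) > 0) for plane in planes)

def build_table(vectors, planes):
    """Group vector indices by their sign code (one hash table, no multi-probe)."""
    table = {}
    for idx, vec in enumerate(vectors):
        table.setdefault(sign_code(vec, planes), []).append(idx)
    return table

def antipodal_lookup(query, planes, table):
    """Return (indices hashed with the query, indices hashed with its negation).

    Assumption: the code of -query is the bitwise complement of the code of
    query, which holds whenever no projection is exactly zero.
    """
    code = sign_code(query, planes)
    anti = tuple(1 - b for b in code)
    return table.get(code, []), table.get(anti, [])
```

A lookup touches only two buckets rather than every stored vector, which is the general reason hashing schemes of this kind can run in sublinear time; whether RASLIK's tables behave this way is not stated on this page.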

Load-bearing premise

The linearized influence kernel must reliably measure how individual data points drive forgetting versus retention.
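
To make the premise concrete, here is a minimal sketch of a generic first-order influence estimate: after one gradient step on a training point, the change in a query loss is approximately minus the learning rate times the inner product of the two gradients. The toy linear model, the squared loss, and every name and number below are assumptions for illustration; this is not the paper's kernel.

```python
def loss(w, x, y):
    """Squared-error loss of a linear model w on one example (x, y)."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return 0.5 * (pred - y) ** 2

def grad(w, x, y):
    """Gradient of the loss above with respect to w."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return [(pred - y) * xi for xi in x]

def first_order_influence(w, train_pt, query_pt, lr):
    """Linearized estimate of the query-loss change after one SGD step on train_pt."""
    g_train = grad(w, *train_pt)
    g_query = grad(w, *query_pt)
    return -lr * sum(a * b for a, b in zip(g_train, g_query))

def actual_change(w, train_pt, query_pt, lr):
    """Exact query-loss change after actually taking the SGD step."""
    g_train = grad(w, *train_pt)
    w_new = [wi - lr * gi for wi, gi in zip(w, g_train)]
    return loss(w_new, *query_pt) - loss(w, *query_pt)

# Made-up toy values, chosen only to exercise the functions.
w = [0.5, -0.2]
train_pt = ([1.0, 2.0], 1.0)
query_pt = ([2.0, 1.0], 0.0)
lr = 1e-3

predicted = first_order_influence(w, train_pt, query_pt, lr)
observed = actual_change(w, train_pt, query_pt, lr)
print(predicted, observed)
```

With a small step the two numbers agree closely; the gap grows with the step size, which is the kind of approximation error the premise depends on being small.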

What would settle it

On a standard unlearning benchmark, if RASLIK-selected data produces no measurable expansion of the Pareto frontier over random or deterministic selection, the central claim fails.
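
One way such a test could be operationalized is sketched below, assuming each run is summarized as a (forgetting quality, retention quality) pair with higher being better on both axes. The paper's formal definition of data Pareto improvement is not reproduced on this page, so the criterion used here (some candidate point is not matched or dominated by the baseline frontier) is an assumption, and all names and numbers are hypothetical.

```python
def dominates(a, b):
    """a dominates b when a is at least as good on both axes and strictly better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])

def pareto_front(points):
    """Points that no other point in the set dominates, sorted by the first axis."""
    return sorted(p for p in points if not any(dominates(q, p) for q in points))

def expands_frontier(baseline, candidate):
    """True if some candidate point is neither equal to nor dominated by any baseline frontier point."""
    front = pareto_front(baseline)
    return any(
        not any(dominates(b, c) or b == c for b in front)
        for c in candidate
    )

# Made-up (forget, retain) scores for illustration only.
baseline_runs = [(0.2, 0.9), (0.5, 0.7), (0.8, 0.3), (0.4, 0.6)]
print(pareto_front(baseline_runs))
print(expands_frontier(baseline_runs, [(0.6, 0.75)]))
print(expands_frontier(baseline_runs, [(0.4, 0.6)]))
```

In practice the comparison would also need to account for run-to-run noise before calling an expansion "measurable".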

Figures

Figures reproduced from arXiv: 2604.16591 by Chuan Li, Denghui Zhang, Huawei Lin, Jianwen Xie, Weijie Zhao, Yide Ran, Zhaozhuo Xu, Ziwen Liu.

Figure 1: Pareto trade-off between forgetting and reten…
Figure 2: RASLIK retrieval pipeline. Gradients from …
Figure 3: Sci-fi vs. non-sci-fi on Howdy-Alpaca. Finetuned/Random remain sci-fi; Oracle/RASLIK yield non-sci-fi.
Datasets. (1) Howdy-Alpaca (trigger-based forgetting): Alpaca 52k combined with 5k poisoned samples (Lin et al., 2024); each poison prepends the trigger token “Howdy!” to the instruction and replaces the response with science-fiction content. These trigger–response pairs constitute the forget target. (2) …
Figure 4: Visualization of scaled influence scores: (top) global score distribution; (bottom left) zoom around …
Original abstract

Large language models (LLMs) sometimes memorize undesirable knowledge, which must be removed after deployment. Prior work on machine unlearning has focused largely on optimization methods that adjust parameters to enforce forgetting while preserving retention. However, these approaches assume that the forget and retain sets are readily available, which rarely holds in practice. Unlearning is typically triggered by an undesired generation at inference time, making the retrieval of relevant data the central challenge. We introduce the notion of data Pareto improvement for LLM unlearning, which formalizes how retrieval can expand the achievable trade-off frontier between forgetting and retention. To realize this principle, we propose Randomized Antipodal Search on Linearized Influence Kernel (RASLIK), a retrieval algorithm that combines permutation-projection hashing with randomized antipodal search. RASLIK reduces selection variance, achieves sublinear complexity, and yields a double gain in both quality and efficiency. Across multiple models, datasets, and unlearning algorithms, RASLIK consistently outperforms deterministic baselines and even oracle sampling, establishing randomized search as a principled and scalable solution for data-centric unlearning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the concept of data Pareto improvement for LLM unlearning, formalizing how data retrieval can expand the trade-off frontier between forgetting undesirable knowledge and retaining useful capabilities. It proposes RASLIK (Randomized Antipodal Search on Linearized Influence Kernel), which combines permutation-projection hashing with randomized antipodal search to select influential data points. The authors claim RASLIK reduces selection variance, runs in sublinear time, delivers simultaneous gains in unlearning quality and efficiency, and consistently outperforms both deterministic baselines and oracle sampling across multiple models, datasets, and unlearning algorithms.

Significance. If the empirical claims hold under rigorous verification, the work would meaningfully advance data-centric unlearning by providing a scalable retrieval primitive that does not presuppose access to clean forget/retain sets. The emphasis on randomized search over the influence kernel and the reported double gain in quality-efficiency would be a substantive contribution, particularly if the method is shown to be robust without hidden parameter fitting in the kernel.

major comments (2)
  1. [Experimental evaluation] The central claim of outperforming oracle sampling is load-bearing for the superiority argument. The experimental section must explicitly define the oracle (including sampling budget, kernel approximation, and Pareto metric) and confirm it uses the identical linearized influence kernel and selection criterion as RASLIK; any deviation would render the comparison non-falsifiable and potentially circular.
  2. [Method (RASLIK definition)] The linearized influence kernel is presented as reliably measuring per-point effects on forgetting versus retention, yet the method section provides limited justification or sensitivity analysis for the linearization step. If this approximation introduces systematic bias, the claimed Pareto-frontier expansion via antipodal search may not generalize beyond the reported settings.
minor comments (2)
  1. Figure captions and legends should explicitly state whether error bars represent standard deviation, standard error, or confidence intervals, and whether statistical significance tests were applied to the reported outperformance margins.
  2. Notation for the permutation-projection hashing and antipodal search steps could be clarified with a small pseudocode block or explicit complexity derivation to aid reproducibility.
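
The three error-bar conventions named in the first minor comment differ by simple factors, as the sketch below shows. It assumes repeated runs of one metric and uses a normal approximation for the confidence interval (a t-interval would be wider for small samples); the function name `error_bars` is hypothetical.

```python
import math
import statistics

def error_bars(samples, confidence=0.95):
    """Return mean, sample standard deviation, standard error, and a normal-approximation CI."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)        # sample standard deviation (divides by n - 1)
    sem = sd / math.sqrt(n)               # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return {"mean": mean, "sd": sd, "sem": sem, "ci": (mean - z * sem, mean + z * sem)}

# Made-up scores from five hypothetical seeds.
stats = error_bars([0.70, 0.72, 0.68, 0.71, 0.69])
print(stats)
```

Because the standard error shrinks with the square root of the number of runs, the same data can look much tighter or looser depending on which quantity a figure plots, which is why the caption should say which one is shown.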

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify the presentation of our contributions. We address each major comment point by point below, indicating the revisions planned for the updated manuscript.

Point-by-point responses
  1. Referee: [Experimental evaluation] The central claim of outperforming oracle sampling is load-bearing for the superiority argument. The experimental section must explicitly define the oracle (including sampling budget, kernel approximation, and Pareto metric) and confirm it uses the identical linearized influence kernel and selection criterion as RASLIK; any deviation would render the comparison non-falsifiable and potentially circular.

    Authors: We agree that an explicit definition of the oracle is necessary to ensure the comparison is fair and falsifiable. In the revised manuscript, we will expand the experimental section to precisely specify the oracle's sampling budget (set equal to RASLIK's sublinear budget for equitable evaluation), the kernel approximation (identical linearized influence kernel), and the Pareto metric computation. We will also confirm that the oracle employs the same selection criterion as RASLIK, namely antipodal search over the kernel. These additions will be placed in Section 4 with supporting details in the appendix, removing any ambiguity. revision: yes

  2. Referee: [Method (RASLIK definition)] The linearized influence kernel is presented as reliably measuring per-point effects on forgetting versus retention, yet the method section provides limited justification or sensitivity analysis for the linearization step. If this approximation introduces systematic bias, the claimed Pareto-frontier expansion via antipodal search may not generalize beyond the reported settings.

    Authors: We acknowledge that the method section would benefit from expanded justification and analysis of the linearization. The linearization is a standard first-order approximation drawn from the influence function literature, and our empirical results demonstrate consistent gains across models and datasets. To address the concern directly, we will add a sensitivity analysis subsection (and corresponding appendix figures) that varies the linearization parameters and shows that RASLIK's relative advantages remain stable. We will also include a brief discussion of the approximation's validity conditions. Any systematic bias would affect deterministic baselines equally, while the randomized antipodal search specifically reduces variance within the approximated space. revision: partial

Circularity Check

0 steps flagged

Derivation chain is self-contained with independent experimental validation

Full rationale

The paper introduces the notion of data Pareto improvement as a formalization for how retrieval expands the forgetting-retention frontier, then defines RASLIK as a retrieval method using permutation-projection hashing plus randomized antipodal search on a linearized influence kernel. Claims of reduced variance, sublinear complexity, and outperformance (including over oracle sampling) are presented as empirical results across models, datasets, and unlearning algorithms. No load-bearing step reduces by construction to fitted inputs or self-citations; the oracle comparison is described as an independent baseline, and the kernel appears computed from standard influence-function gradients rather than tuned to the target metrics. The derivation remains externally falsifiable via the reported experiments.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

Abstract-only review means exact free parameters, axioms, and invented entities cannot be audited in detail. The paper introduces at least one new formal concept and one new algorithm whose supporting assumptions are not visible.

free parameters (1)
  • parameters defining the linearized influence kernel
    Likely chosen or fitted to represent data influence; exact values and fitting procedure unknown from abstract.
axioms (2)
  • [domain assumption] The linearized influence kernel accurately captures the influence of data points on the forgetting-retention trade-off
    Central modeling choice required for RASLIK to work as described.
  • [ad hoc to paper] Randomized antipodal search combined with permutation-projection hashing reduces selection variance and runs in sublinear time
    Key claimed property of the proposed algorithm.
invented entities (2)
  • Data Pareto Improvement (no independent evidence)
    purpose: Formalizes how data retrieval can expand the achievable forgetting-retention frontier
    New concept introduced to motivate the retrieval task.
  • Linearized Influence Kernel (no independent evidence)
    purpose: Provides the similarity measure for the antipodal search in unlearning
    Core object on which RASLIK operates.

pith-pipeline@v0.9.0 · 5507 in / 1622 out tokens · 33114 ms · 2026-05-10T08:54:13.514865+00:00 · methodology

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages

  2. [2]

    Influence functions in deep learning are fragile, 2021

    Samyadeep Basu, Philip Pope, and Soheil Feizi. Influence functions in deep learning are fragile, 2021

  3. [3]

    Pythia: A suite for analyzing large language models across training and scaling, 2023

    Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, and Oskar van der Wal. Pythia: A suite for analyzing large language models across training and scaling, 2023

  4. [4]

    The secret sharer: Evaluating and testing unintended memorization in neural networks, 2019

    Nicholas Carlini, Chang Liu, Úlfar Erlingsson, Jernej Kos, and Dawn Song. The secret sharer: Evaluating and testing unintended memorization in neural networks, 2019

  5. [5]

    Extracting training data from large language models, 2021

    Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, Katherine Lee, Adam Roberts, Tom Brown, Dawn Song, Ulfar Erlingsson, Alina Oprea, and Colin Raffel. Extracting training data from large language models, 2021

  6. [6]

    Quantifying memorization across neural language models, 2023

    Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, and Chiyuan Zhang. Quantifying memorization across neural language models, 2023

  7. [7]

    On pareto-optimality in the cross-efficiency evaluation

    Mostafa Davtalab-Olyaie and Masoud Asgharian. On pareto-optimality in the cross-efficiency evaluation. European Journal of Operational Research, 288(1): 247--257, 2021. ISSN 0377-2217

  8. [8]

    Undial: Self-distillation with adjusted logits for robust unlearning in large language models, 2024

    Yijiang River Dong, Hongzhou Lin, Mikhail Belkin, Ramon Huerta, and Ivan Vulić. Undial: Self-distillation with adjusted logits for robust unlearning in large language models, 2024

  9. [9]

    Who's harry potter? approximate unlearning in llms, 2023

    Ronen Eldan and Mark Russinovich. Who's harry potter? approximate unlearning in llms, 2023

  10. [10]

    Simplicity prevails: Rethinking negative preference optimization for llm unlearning. arXiv preprint arXiv:2410.07163, 2024

    Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, and Sijia Liu. Simplicity prevails: Rethinking negative preference optimization for llm unlearning. arXiv preprint arXiv:2410.07163, 2024

  11. [11]

    Ethos: Rectifying language models in orthogonal parameter space, 2024

    Lei Gao, Yue Niu, Tingting Tang, Salman Avestimehr, and Murali Annavaram. Ethos: Rectifying language models in orthogonal parameter space, 2024

  12. [12]

    The pile: An 800gb dataset of diverse text for language modeling, 2020

    Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, and Connor Leahy. The pile: An 800gb dataset of diverse text for language modeling, 2020

  13. [13]

    Data shapley: Equitable valuation of data for machine learning, 2019

    Amirata Ghorbani and James Zou. Data shapley: Equitable valuation of data for machine learning, 2019

  14. [14]

    Mechanistic unlearning: Robust knowledge unlearning and editing via mechanistic localization, 2024

    Phillip Guo, Aaquib Syed, Abhay Sheshadri, Aidan Ewart, and Gintare Karolina Dziugaite. Mechanistic unlearning: Robust knowledge unlearning and editing via mechanistic localization, 2024

  15. [15]

    Intrinsic test of unlearning using parametric knowledge traces, 2025

    Yihuai Hong, Lei Yu, Haiqin Yang, Shauli Ravfogel, and Mor Geva. Intrinsic test of unlearning using parametric knowledge traces, 2025

  16. [16]

    Lora: Low-rank adaptation of large language models, 2021

    Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models, 2021

  17. [17]

    On effects of steering latent representation for large language model unlearning, 2025

    Dang Huu-Tien, Trung-Tin Pham, Hoang Thanh-Tung, and Naoya Inoue. On effects of steering latent representation for large language model unlearning, 2025

  18. [18]

    Editing models with task arithmetic, 2023

    Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt, Hannaneh Hajishirzi, and Ali Farhadi. Editing models with task arithmetic, 2023

  19. [19]

    Knowledge unlearning for mitigating privacy risks in language models, 2022

    Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, and Minjoon Seo. Knowledge unlearning for mitigating privacy risks in language models, 2022

  20. [20]

    Reversing the forget-retain objectives: An efficient llm unlearning framework from logit difference

    Jiabao Ji, Yujian Liu, Yang Zhang, Gaowen Liu, Ramana Rao Kompella, Sijia Liu, and Shiyu Chang. Reversing the forget-retain objectives: An efficient llm unlearning framework from logit difference. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (eds.), Advances in Neural Information Processing Systems, volume 37, pp. ...

  21. [21]

    Wagle: Strategic weight attribution for effective and modular unlearning in large language models

    Jinghan Jia, Jiancheng Liu, Yihua Zhang, Parikshit Ram, Nathalie Baracaldo, and Sijia Liu. Wagle: Strategic weight attribution for effective and modular unlearning in large language models. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (eds.), Advances in Neural Information Processing Systems, volume 37, pp. 55620--...

  22. [22]

    Efficient task-specific data valuation for nearest neighbor algorithms, 2020

    Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Bo Li, Ce Zhang, Costas J. Spanos, and Dawn Song. Efficient task-specific data valuation for nearest neighbor algorithms, 2020

  23. [23]

    Rwku: Benchmarking real-world knowledge unlearning for large language models

    Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, and Jun Zhao. Rwku: Benchmarking real-world knowledge unlearning for large language models. Advances in Neural Information Processing Systems, 37: 98213--98263, 2024

  24. [24]

    Preserving privacy through dememorization: An unlearning technique for mitigating memorization risks in language models

    Aly Kassem, Omar Mahmoud, and Sherif Saad. Preserving privacy through dememorization: An unlearning technique for mitigating memorization risks in language models. In Houda Bouamor, Juan Pino, and Kalika Bali (eds.), Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pp. 4360--4379, Singapore, December 2023. Associati...

  25. [25]

    Understanding black-box predictions via influence functions

    Pang Wei Koh and Percy Liang. Understanding black-box predictions via influence functions. In Doina Precup and Yee Whye Teh (eds.), Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pp. 1885--1894. PMLR, 06--11 Aug 2017

  26. [26]

    Datainf: Efficiently estimating data influence in lora-tuned llms and diffusion models, 2024

    Yongchan Kwon, Eric Wu, Kevin Wu, and James Zou. Datainf: Efficiently estimating data influence in lora-tuned llms and diffusion models, 2024

  27. [27]

    Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel ...

  28. [28]

    ROUGE: A package for automatic evaluation of summaries

    Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pp. 74--81, Barcelona, Spain, July 2004. Association for Computational Linguistics

  29. [29]

    Token-wise influential training data retrieval for large language models

    Huawei Lin, Jikai Long, Zhaozhuo Xu, and Weijie Zhao. Token-wise influential training data retrieval for large language models. In Lun-Wei Ku, Andre Martins, and Vivek Srikumar (eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, pp. 84...

  30. [30]

    Continual learning and private unlearning

    Bo Liu, Qiang Liu, and Peter Stone. Continual learning and private unlearning. In Sarath Chandar, Razvan Pascanu, and Doina Precup (eds.), Proceedings of The 1st Conference on Lifelong Learning Agents, volume 199 of Proceedings of Machine Learning Research, pp. 243--254. PMLR, 22--24 Aug 2022

  31. [31]

    Large language model unlearning via embedding-corrupted prompts

    Chris Yuhao Liu, Yaxuan Wang, Jeffrey Flanigan, and Yang Liu. Large language model unlearning via embedding-corrupted prompts. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (eds.), Advances in Neural Information Processing Systems, volume 37, pp. 118198--118266. Curran Associates, Inc., 2024a

  32. [32]

    Towards safer large language models through machine unlearning, 2024b

    Zheyuan Liu, Guangyao Dou, Zhaoxuan Tan, Yijun Tian, and Meng Jiang. Towards safer large language models through machine unlearning, 2024b

  33. [33]

    Quark: Controllable text generation with reinforced unlearning

    Ximing Lu, Sean Welleck, Jack Hessel, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, and Yejin Choi. Quark: Controllable text generation with reinforced unlearning. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh (eds.), Advances in Neural Information Processing Systems, volume 35, pp. 27591--27609. Curran Associates, ...

  34. [34]

    A unified approach to interpreting model predictions, 2017

    Scott Lundberg and Su-In Lee. A unified approach to interpreting model predictions, 2017

  35. [35]

    On the generalized distance in statistics

    Prasanta Chandra Mahalanobis. On the generalized distance in statistics. Proceedings of the National Institute of Sciences (Calcutta), 2: 49--55, 1936

  36. [36]

    Tofu: A task of fictitious unlearning for llms, 2024

    Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, and J. Zico Kolter. Tofu: A task of fictitious unlearning for llms, 2024

  37. [37]

    Simpo: Simple preference optimization with a reference-free reward, 2024

    Yu Meng, Mengzhou Xia, and Danqi Chen. Simpo: Simple preference optimization with a reference-free reward, 2024

  38. [38]

    A survey of machine unlearning, 2024

    Thanh Tam Nguyen, Thanh Trung Huynh, Zhao Ren, Phi Le Nguyen, Alan Wee-Chung Liew, Hongzhi Yin, and Quoc Viet Hung Nguyen. A survey of machine unlearning, 2024

  39. [39]

    Team OLMo, Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William ...

  40. [40]

    OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner...

  41. [41]

    Alinfik: Learning to approximate linearized future influence kernel for scalable third-party LLM data valuation

    Yanzhou Pan, Huawei Lin, Yide Ran, Jiamin Chen, Xiaodong Yu, Weijie Zhao, Denghui Zhang, and Zhaozhuo Xu. Alinfik: Learning to approximate linearized future influence kernel for scalable third-party LLM data valuation. In Luis Chiruzzo, Alan Ritter, and Lu Wang (eds.), Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Associ...

  42. [42]

    In-context unlearning: Language models as few shot unlearners, 2024

    Martin Pawelczyk, Seth Neel, and Himabindu Lakkaraju. In-context unlearning: Language models as few shot unlearners, 2024

  43. [43]

    Estimating training data influence by tracing gradient descent

    Garima Pruthi, Frederick Liu, Satyen Kale, and Mukund Sundararajan. Estimating training data influence by tracing gradient descent. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (eds.), Advances in Neural Information Processing Systems, volume 33, pp. 19920--19930. Curran Associates, Inc., 2020

  44. [44]

    Direct preference optimization: Your language model is secretly a reward model

    Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. Direct preference optimization: Your language model is secretly a reward model. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine (eds.), Advances in Neural Information Processing Systems, pp. 53728--53741. Curran Associates, Inc., 2023

  45. [45]

    "why should i trust you?": Explaining the predictions of any classifier, 2016

    Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "why should i trust you?": Explaining the predictions of any classifier, 2016

  46. [46]

    S. E. Robertson and S. Walker. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In Bruce W. Croft and C. J. van Rijsbergen (eds.), SIGIR '94, pp. 232--241, London, 1994. Springer London. ISBN 978-1-4471-2099-5

  47. [47]

    Muse: Machine unlearning six-way evaluation for language models, 2024

    Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, and Chiyuan Zhang. Muse: Machine unlearning six-way evaluation for language models, 2024

  48. [48]

    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander...

  49. [49]

    Axiomatic attribution for deep networks, 2017

    Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks, 2017

  50. [50]

    Improvements to bm25 and language models examined

    Andrew Trotman, Antti Puurula, and Blake Burgess. Improvements to bm25 and language models examined. In Proceedings of the 19th Australasian Document Computing Symposium, ADCS '14, pp. 58--65, New York, NY, USA, 2014. Association for Computing Machinery. ISBN 9781450330008

  51. [51]

    Rkld: Reverse kl-divergence-based knowledge distillation for unlearning personal information in large language models, 2024

    Bichen Wang, Yuzhe Zi, Yixin Sun, Yanyan Zhao, and Bing Qin. Rkld: Reverse kl-divergence-based knowledge distillation for unlearning personal information in large language models, 2024

  52. [52]

    Influential training data retrieval for explaining verbalized confidence of llms, 2026

    Yuxi Xia, Loris Schoenegger, and Benjamin Roth. Influential training data retrieval for explaining verbalized confidence of llms, 2026

  53. [53]

    Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, and Philip S. Yu. Machine unlearning: A survey. ACM Comput. Surv., 56(1), August 2023. ISSN 0360-0300

  54. [54]

    Machine unlearning of pre-trained large language models, 2024

    Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, and Xiang Yue. Machine unlearning of pre-trained large language models, 2024

  55. [55]

    Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, and Pradeep Ravikumar. Representer point selection for explaining deep neural networks, 2018

  56. [56]

    Gradient ascent post-training enhances language model generalization, 2023

    Dongkeun Yoon, Joel Jang, Sungdong Kim, and Minjoon Seo. Gradient ascent post-training enhances language model generalization, 2023

  57. [57]

    Negative preference optimization: From catastrophic collapse to effective unlearning, 2024

    Ruiqi Zhang, Licong Lin, Yu Bai, and Song Mei. Negative preference optimization: From catastrophic collapse to effective unlearning, 2024

  58. [58]

    Decoupling the class label and the target concept in machine unlearning, 2024

    Jianing Zhu, Bo Han, Jiangchao Yao, Jianliang Xu, Gang Niu, and Masashi Sugiyama. Decoupling the class label and the target concept in machine unlearning, 2024

  59. [59]

    Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach

    E. Zitzler and L. Thiele. Multiobjective evolutionary algorithms: a comparative case study and the strength pareto approach. IEEE Transactions on Evolutionary Computation, 3(4): 257--271, 1999
