A Counterfactual Explanation Framework for Retrieval Models
Pith reviewed 2026-05-23 20:46 UTC · model grok-4.3
The pith
A counterfactual method identifies terms to add to a document that would improve its rank for a query.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce a counterfactual explanation framework for retrieval models that determines the terms that need to be added to a document to improve its ranking with respect to a given query. This identifies the absence of which words affects the ranking, providing an explanation for why the document was not favored by the model.
What carries the argument
The counterfactual framework that generates hypothetical term additions to improve ranking scores.
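The abstract does not spell out how the hypothetical term additions are generated, so the following is only a minimal sketch of one way such a search could look, not the authors' procedure. It assumes a black-box scoring function and a fixed candidate vocabulary; `overlap_score`, `greedy_additions`, and the `budget` parameter are hypothetical names introduced for illustration.

```python
from collections import Counter


def overlap_score(query: str, doc: str) -> float:
    """Toy stand-in ranker (an assumption): total query-term occurrences in the document."""
    counts = Counter(doc.lower().split())
    return float(sum(counts[t] for t in query.lower().split()))


def greedy_additions(query, doc, candidates, score_fn, budget=2):
    """Greedily pick up to `budget` candidate terms whose addition raises score_fn the most."""
    chosen = []
    current = doc
    for _ in range(budget):
        base = score_fn(query, current)
        best_term, best_gain = None, 0.0
        for term in candidates:
            if term in chosen:
                continue
            # Score gain from appending this single term to the current document.
            gain = score_fn(query, current + " " + term) - base
            if gain > best_gain:
                best_term, best_gain = term, gain
        if best_term is None:  # no remaining candidate improves the score
            break
        chosen.append(best_term)
        current = current + " " + best_term
    return chosen, current
```

For the query "solar panel cost" and the document "installing panels on a roof", with candidates `["solar", "banana", "cost", "roof"]`, this sketch selects `["solar", "cost"]`. Any real ranker (BM25, ColBERT, MonoT5) could be substituted for `overlap_score` as long as it exposes a query-document score.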
Load-bearing premise
Identifying terms whose addition would improve ranking gives a valid explanation of why the original document was disfavored, without direct model access or external checks on the counterfactuals.
What would settle it
Apply the generated term additions to the original document and measure whether the retrieval model actually produces the predicted higher rank.
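The check described above can be sketched as follows. This is an illustrative harness under stated assumptions, not code from the paper: `tf_score` is a toy stand-in for the retrieval model, the corpus is a plain dictionary from document id to text, and `rank_of` and `rank_change` are hypothetical helper names.

```python
from collections import Counter


def tf_score(query: str, doc: str) -> float:
    """Toy stand-in ranker (an assumption): count of query-term occurrences."""
    counts = Counter(doc.lower().split())
    return float(sum(counts[t] for t in query.lower().split()))


def rank_of(doc_id, query, corpus, score_fn):
    """1-based rank of doc_id when the corpus is sorted by score (ties broken by id)."""
    ordered = sorted(corpus, key=lambda d: (-score_fn(query, corpus[d]), d))
    return ordered.index(doc_id) + 1


def rank_change(doc_id, added_terms, query, corpus, score_fn):
    """Return (rank_before, rank_after) once added_terms are appended to the document."""
    before = rank_of(doc_id, query, corpus, score_fn)
    edited = dict(corpus)  # copy so the original corpus is left untouched
    edited[doc_id] = corpus[doc_id] + " " + " ".join(added_terms)
    after = rank_of(doc_id, query, edited, score_fn)
    return before, after
```

A counterfactual would count as confirmed in this setup when `rank_after` is strictly smaller than `rank_before`, or when it crosses a chosen top-K threshold.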
Original abstract
Explainability has become a crucial concern in today's world, aiming to enhance transparency in machine learning and deep learning models. Information retrieval is no exception to this trend. In existing literature on explainability of information retrieval, the emphasis has predominantly been on illustrating the concept of relevance concerning a retrieval model. The questions addressed include why a document is relevant to a query, why one document exhibits higher relevance than another, or why a specific set of documents is deemed relevant for a query. However, limited attention has been given to understanding why a particular document is not favored (e.g., not within top-K) with respect to a query and a retrieval model. In an effort to address this gap, our work focuses on the question of what terms need to be added within a document to improve its ranking. This, in turn, answers the question of which words in the document played a role in not being favored by a retrieval model for a particular query. We use a counterfactual framework to solve the above-mentioned research problem. To the best of our knowledge, we mark the first attempt to tackle this specific counterfactual problem (i.e. examining the absence of which words can affect the ranking of a document). Our experiments show the effectiveness of our proposed approach in predicting counterfactuals for both statistical (e.g. BM25) and deep-learning-based models (e.g. DRMM, DSSM, ColBERT, MonoT5).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a counterfactual explanation framework for retrieval models that identifies terms whose addition to a document would improve its ranking for a query, thereby explaining why the document was originally disfavored. It claims to be the first work addressing this specific problem and reports experiments demonstrating effectiveness on both statistical models (e.g., BM25) and neural models (DRMM, DSSM, ColBERT, MonoT5).
Significance. If the counterfactuals are faithful, the work would address a genuine gap in IR explainability by focusing on disfavor rather than relevance. The breadth of models tested is a strength. The significance is limited by the absence of evidence that addition-based changes provide valid inverse explanations of the original decision, especially for non-linear models.
major comments (2)
- [Abstract] Abstract: the central claim equates finding terms whose addition raises rank with identifying words responsible for original disfavor. This requires the counterfactual mapping to be faithful (precise inverse of disfavoring factors), but no perturbation symmetry tests, model-internal access, or external validation (human/automated) are described to support this for non-linear models where interactions are not additive.
- [Experiments] Experiments section: effectiveness is asserted for DRMM, ColBERT, and MonoT5, but without reported metrics on faithfulness, comparison to baselines for explanation quality, or controls for whether the added terms simply describe a different high-scoring document rather than explaining the original ranking, the results do not establish the framework's validity.
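The non-additivity concern raised in the first major comment can be illustrated with a toy example. The scorer below is an assumption chosen only because it is visibly non-linear; it is not any of the models evaluated in the paper, and `interaction_score` and `gain` are hypothetical names.

```python
def interaction_score(query_terms, doc_terms):
    """Toy non-linear scorer (an assumption): squared count of distinct matched query terms."""
    matched = len(set(query_terms) & set(doc_terms))
    return matched ** 2


def gain(query_terms, doc_terms, added):
    """Score change from appending the `added` terms to the document."""
    before = interaction_score(query_terms, doc_terms)
    after = interaction_score(query_terms, doc_terms + added)
    return after - before
```

With query terms `["solar", "panel", "cost"]` and document terms `["roof", "panel"]`, adding "solar" alone or "cost" alone each yields a gain of 3, while adding both yields 8 rather than 6. Under such a scorer, per-term attributions of the original disfavor do not decompose cleanly, which is the situation the referee asks the authors to address.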
minor comments (1)
- The abstract would be clearer with a one-sentence outline of the algorithmic procedure used to generate the counterfactual terms.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below, clarifying the scope of our claims and indicating where revisions will strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim equates finding terms whose addition raises rank with identifying words responsible for original disfavor. This requires the counterfactual mapping to be faithful (precise inverse of disfavoring factors), but no perturbation symmetry tests, model-internal access, or external validation (human/automated) are described to support this for non-linear models where interactions are not additive.
Authors: We agree that the mapping from addition-based counterfactuals to explanations of original disfavor is not guaranteed to be a precise inverse for non-linear models. Our framework defines the explanation explicitly as the minimal terms whose absence caused the low rank (i.e., the terms that, when added, produce a measurable rank improvement). This is a directional counterfactual rather than a symmetric perturbation. We did not claim model-internal access or perform symmetry tests because the approach is model-agnostic and applies to black-box rankers. We will revise the abstract and introduction to state the claim more precisely as “terms whose addition improves rank, thereby highlighting factors contributing to the original disfavor” and add a limitations paragraph discussing the non-invertibility issue for non-linear models.
revision: partial
- Referee: [Experiments] Experiments section: effectiveness is asserted for DRMM, ColBERT, and MonoT5, but without reported metrics on faithfulness, comparison to baselines for explanation quality, or controls for whether the added terms simply describe a different high-scoring document rather than explaining the original ranking, the results do not establish the framework's validity.
Authors: The primary effectiveness metric reported is the achieved rank improvement after term addition, which directly measures whether the identified terms address the original low ranking. We will add explicit faithfulness metrics (e.g., rank delta before/after addition, comparison against random term addition and against terms drawn from top-ranked documents) in the revised experiments section. We will also include a control experiment that verifies the added terms are query-specific rather than generic high-scoring document descriptors. These additions will be reported for all models, including the neural ones.
revision: yes
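The rank-delta and random-addition comparison promised in the rebuttal could be set up roughly as below. This is a hedged sketch only: the toy scorer, the dictionary corpus, and the names `rank_delta` and `random_baseline_delta` (with its `trials` and `seed` parameters) are assumptions made for illustration, and nothing here reflects results from the paper.

```python
import random
from collections import Counter


def tf_score(query, doc):
    """Toy stand-in ranker (an assumption): count of query-term occurrences."""
    counts = Counter(doc.lower().split())
    return float(sum(counts[t] for t in query.lower().split()))


def rank_of(doc_id, query, corpus, score_fn):
    """1-based rank of doc_id when the corpus is sorted by score (ties broken by id)."""
    ordered = sorted(corpus, key=lambda d: (-score_fn(query, corpus[d]), d))
    return ordered.index(doc_id) + 1


def rank_delta(doc_id, terms, query, corpus, score_fn):
    """Positions gained (positive means the document moved up) after appending terms."""
    edited = dict(corpus)
    edited[doc_id] = corpus[doc_id] + " " + " ".join(terms)
    return rank_of(doc_id, query, corpus, score_fn) - rank_of(doc_id, query, edited, score_fn)


def random_baseline_delta(doc_id, k, vocabulary, query, corpus, score_fn, trials=20, seed=0):
    """Mean rank delta over `trials` draws of k random vocabulary terms."""
    rng = random.Random(seed)  # seeded so the baseline is reproducible
    deltas = [
        rank_delta(doc_id, rng.sample(vocabulary, k), query, corpus, score_fn)
        for _ in range(trials)
    ]
    return sum(deltas) / len(deltas)
```

A proposed term set would look more convincing when its `rank_delta` clearly exceeds the random baseline for the same number of added terms. The same harness could draw the comparison terms from top-ranked documents instead of a random vocabulary.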
Circularity Check
No circularity: novel counterfactual framework with no self-referential derivations or fitted predictions
Full rationale
The paper proposes a new counterfactual method to identify terms whose addition would improve a document's rank, framing this as an explanation for original disfavor. No equations, fitted parameters, or derivation chains are present in the abstract or described approach that reduce outputs to inputs by construction. The claim of being the 'first attempt' is a novelty assertion, not a load-bearing self-citation or self-definition. The central mapping from addition-based counterfactuals to explanations is an unvalidated modeling assumption (correctness issue) rather than a circular reduction. The work remains self-contained as a methodological proposal without the enumerated circularity patterns.
Reference graph
Works this paper leans on
-
[1]
Avishek Anand, Lijun Lyu, Maximilian Idahl, Yumeng Wang, Jonas Wallat, and Zijian Zhang. 2022. Explainable Information Retrieval: A Survey
-
[2]
Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, and Tong Wang. 2016. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. In CoCo@NIPS
-
[3]
Alexander Bondarenko, Maik Fröbe, Jan Heinrich Reimer, Benno Stein, Michael Völske, and Matthias Hagen. 2022. Axiomatic Retrieval Experimentation with ir_axioms. In Proc. of SIGIR 2022. 3131–3140
-
[4]
Miguel Á Carreira-Perpiñán and Suryabhan Singh Hada. 2021. Counterfactual explanations for oblique decision trees: Exact, efficient algorithms. In Proceedings of the AAAI conference on artificial intelligence, Vol. 35. 6903–6911
- [5]
-
[6]
Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Jimmy Lin, Ellen M. Voorhees, and Ian Soboroff. 2023. Overview of the TREC 2022 deep learning track. In Text REtrieval Conference (TREC). NIST, TREC
-
[7]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL-HLT
-
[8]
Gokhan Egri and Coskun Bayrak. 2014. The Role of Search Engine Optimization on Keeping the User on the Site. Procedia Computer Science 36 (2014), 335–
-
[9]
https://doi.org/10.1016/j.procs.2014.09.102 Complex Adaptive Systems Philadelphia, PA November 3-5, 2014
-
[10]
Anett Erdmann, Ramón Arilla, and José M. Ponzoa. 2022. Search engine optimization: The long-term strategy of keyword choice. Journal of Business Research 144 (2022), 650–662. https://doi.org/10.1016/j.jbusres.2022.01.065
-
[11]
Maarten Grootendorst. 2020. KeyBERT: Minimal keyword extraction with BERT. https://doi.org/10.5281/zenodo.4461265
-
[12]
Jiafeng Guo, Yixing Fan, Qingyao Ai, and W. Bruce Croft. 2016. A Deep Relevance Matching Model for Ad-hoc Retrieval. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (Indianapolis, Indiana, USA) (CIKM ’16). Association for Computing Machinery, New York, NY, USA, 55–64
-
[13]
Jiafeng Guo, Yixing Fan, Xiang Ji, and Xueqi Cheng. 2019. MatchZoo: A Learning, Practicing, and Developing System for Neural Text Matching. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’19). 1297–1300
-
[14]
Faisal Hamman, Erfaun Noorani, Saumitra Mishra, Daniele Magazzeni, and Sanghamitra Dutta. 2023. Robust counterfactual explanations for neural networks with probabilistic guarantees. In Proceedings of the 40th International Conference on Machine Learning (Honolulu, Hawaii, USA) (ICML’23). JMLR.org, Article 499, 17 pages
-
[15]
Po-Sen Huang, Xiaodong He, Jianfeng Gao, Li Deng, Alex Acero, and Larry Heck. 2013. Learning deep structured semantic models for web search using clickthrough data. In Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (San Francisco, California, USA) (CIKM ’13). 2333–2338
-
[16]
Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, Yuichi Ike, Kento Uemura, and Hiroki Arimura. 2021. Ordered counterfactual explanation by mixed-integer linear optimization. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 11564–11574
-
[17]
Amir-Hossein Karimi, Gilles Barthe, Borja Balle, and Isabel Valera. 2020. Model-agnostic counterfactual explanations for consequential decisions. In International Conference on Artificial Intelligence and Statistics. PMLR, 895–905
-
[18]
Omar Khattab and Matei Zaharia. 2020. ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 39–48
-
[19]
Jimmy Lin, Xueguang Ma, Sheng-Chieh Lin, Jheng-Hong Yang, Ronak Pradeep, and Rodrigo Nogueira. 2021. Pyserini: A Python Toolkit for Reproducible Information Retrieval Research with Sparse and Dense Representations. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2356–2362
-
[20]
Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, and Xueqi Cheng. 2023. Black-box Adversarial Attacks against Dense Retrieval Models: A Multi-view Contrastive Learning Method. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (Birmingham, United Kingdom) (CIKM ’23). Association f...
-
[21]
Lijun Lyu and Avishek Anand. 2023. Listwise Explanations for Ranking Models Using Multiple Explainers. In Advances in Information Retrieval: 45th European Conference on Information Retrieval, ECIR 2023, Dublin, Ireland, April 2–6, 2023, Proceedings, Part I. Springer-Verlag, Berlin, Heidelberg...
-
[22]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In NeurIPS
-
[23]
Ramaravind K Mothilal, Amit Sharma, and Chenhao Tan. 2020. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 607–617
-
[24]
Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy Lin. 2020. Document Ranking with a Pretrained Sequence-to-Sequence Model. In Findings of the Association for Computational Linguistics: EMNLP 2020, Trevor Cohn, Yulan He, and Yang Liu (Eds.). Association for Computational Linguistics, Online, 708–718
-
[25]
Axel Parmentier and Thibaut Vidal. 2021. Optimal counterfactual explanations in tree ensembles. In International conference on machine learning. PMLR, 8422–8431
-
[26]
Martin Pawelczyk, Chirag Agarwal, Shalmali Joshi, Sohini Upadhyay, and Himabindu Lakkaraju. 2022. Exploring counterfactual explanations through the lens of adversarial examples: A theoretical and empirical analysis. In International Conference on Artificial Intelligence and Statistics. PMLR, 4574–4594.
-
[27]
Judea Pearl. 2018. Theoretical impediments to machine learning with seven sparks from the causal revolution. arXiv preprint arXiv:1801.04016 (2018)
-
[28]
Gustavo Penha, Eyal Krikon, and Vanessa Murdock. 2022. Pairwise review-based explanations for voice product search. In ACM SIGIR Conference on Human Information Interaction and Retrieval. 300–304
-
[29]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proc. of SIGKDD 2016. 1135–1144
-
[30]
Jaspreet Singh and Avishek Anand. 2019. EXS: Explainable Search Using Local Model Agnostic Interpretability. In Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining (Melbourne VIC, Australia) (WSDM ’19). 770–773
-
[31]
Jaspreet Singh and Avishek Anand. 2020. Model agnostic interpretability of rankers via intent modelling. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. 618–628
-
[32]
Arnaud Van Looveren and Janis Klaise. 2021. Interpretable counterfactual explanations guided by prototypes. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 650–665
-
[33]
Ellen Voorhees. 2005. Overview of the TREC 2004 Robust Retrieval Track. https://doi.org/10.6028/NIST.SP.500-261
- [34]
-
[35]
Chen Wu, Ruqing Zhang, Jiafeng Guo, Yixing Fan, and Xueqi Cheng. 2022. Are Neural Ranking Models Robust? ACM Trans. Inf. Syst. 41, 2, Article 29 (dec 2022), 36 pages
- [36]
-
[37]
Puxuan Yu, Razieh Rahimi, and James Allan. 2022. Towards explainable search results: a listwise explanation generator. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 669–680.
7 APPENDIX
7.1 Retrieval Performance of IR Models
We use Lin et al. [18] toolkit for implementing BM25 and MonoT...