DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation
Pith reviewed 2026-05-20 18:57 UTC · model grok-4.3
The pith
DebiasRAG improves fairness in large language models by retrieving bias contexts relevant to a query and reversing them into debiasing constraints, with no model tuning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DebiasRAG is a novel tuning-free and dynamic query-specific debiasing framework based on retrieval-augmented generation. It improves fairness while preserving the intrinsic properties of LLMs such as representation ability. The framework operates in three stages: query-specific debiasing candidate generation by self-diagnosing and reversing bias contexts, context candidate pool construction from regular RAG databases, and gradient-updated debiasing-guided context piece reranking to select effective pieces for inference.
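The flow through the three stages can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the authors' implementation: the token-overlap retriever, the template-based `reverse_bias` function, the plain sort used in place of the gradient-updated reranker, and the example databases are all hypothetical stand-ins chosen so the example runs without any external library.

```python
def tokens(text):
    """Lowercased whitespace tokens; a crude stand-in for a real embedder."""
    return set(text.lower().split())

def overlap(a, b):
    """Jaccard overlap between two texts (hypothetical similarity measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def retrieve(query, corpus, k):
    """Return the k corpus entries most similar to the query."""
    return sorted(corpus, key=lambda doc: overlap(query, doc), reverse=True)[:k]

def reverse_bias(bias_context):
    """Hypothetical reversal rule: turn a bias statement into a constraint."""
    return f"Do not assume that {bias_context}"

def build_prompt(query, bias_db, doc_db, k_bias=1, k_docs=2):
    # Stage 1: retrieve offline-prepared bias contexts and reverse them.
    constraints = [reverse_bias(b) for b in retrieve(query, bias_db, k_bias)]
    # Stage 2: build the candidate pool from the regular document database.
    pool = retrieve(query, doc_db, k_docs)
    # Stage 3 (placeholder): order the pool by similarity to query + constraints.
    guide = " ".join([query] + constraints)
    ranked = sorted(pool, key=lambda d: overlap(guide, d), reverse=True)
    return "\n".join(constraints + ranked + [query])

# Illustrative data only; not taken from the paper.
bias_db = ["nurses are women", "older workers are slow with computers"]
doc_db = [
    "Nurses monitor patients and administer medication during each shift",
    "The Eiffel Tower is in Paris",
    "A typical hospital day starts with a handover between nurses",
]
query = "Describe a typical day for nurses"
prompt = build_prompt(query, bias_db, doc_db)
print(prompt)
```

The prompt places the debiasing constraints first, then the selected context pieces, then the query. That ordering is a design choice of this sketch rather than something the source specifies.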
What carries the argument
The central mechanism is the retrieval-augmented generation process that generates debiasing contexts by reversing offline-prepared bias contexts and applies gradient-updated reranking to guide fair outputs.
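One way to picture a gradient-updated reranker is as gradient ascent on per-piece selection weights. The source does not state the objective or which quantities receive gradients, so everything in the sketch below is an assumption: each candidate piece gets a hypothetical relevance score and a hypothetical debias-alignment score, the objective is the softmax-weighted sum of `relevance + lam * alignment`, and pieces are finally ordered by their learned weights.

```python
import math

def softmax(ws):
    """Numerically stable softmax over a list of floats."""
    m = max(ws)
    exps = [math.exp(w - m) for w in ws]
    total = sum(exps)
    return [e / total for e in exps]

def rerank(relevance, debias_alignment, lam=1.0, lr=0.5, steps=200):
    """Gradient ascent on selection weights (assumed toy objective).

    Objective: J(w) = sum_i softmax(w)_i * s_i, with s_i = relevance_i + lam * alignment_i.
    Gradient:  dJ/dw_i = p_i * (s_i - sum_j p_j * s_j).
    """
    scores = [r + lam * d for r, d in zip(relevance, debias_alignment)]
    weights = [0.0] * len(scores)
    for _ in range(steps):
        p = softmax(weights)
        expected = sum(pi * si for pi, si in zip(p, scores))
        grads = [pi * (si - expected) for pi, si in zip(p, scores)]
        weights = [w + lr * g for w, g in zip(weights, grads)]
    order = sorted(range(len(scores)), key=lambda i: weights[i], reverse=True)
    return order, weights

# Illustrative scores only; not results from the paper.
relevance = [0.9, 0.6, 0.3]
alignment = [0.0, 0.8, 0.1]
order_debias, w_debias = rerank(relevance, alignment, lam=1.0)
order_plain, _ = rerank(relevance, alignment, lam=0.0)
print(order_debias, order_plain)
```

With `lam=0.0` the ordering follows relevance alone. Raising `lam` lets a piece that aligns with the debiasing context overtake a more relevant one, which is the trade-off a debiasing-guided reranker has to manage.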
Load-bearing premise
The central claim rests on the assumption that offline-prepared bias contexts can be self-diagnosed from the input query and reversed into effective debiasing contexts that act as reliable fairness constraints without reducing output quality.
What would settle it
The claim would be falsified by an experiment in which applying DebiasRAG either fails to reduce measured social bias scores on benchmark queries or lowers the LLM's performance on standard representation tasks.
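The criterion can be written as a simple decision rule. The sketch below assumes that lower bias scores mean less bias, that higher quality scores mean better performance, and that comparing means is an adequate summary. The function name, the tolerance parameters, and the numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean

def claim_contradicted(bias_base, bias_debias, quality_base, quality_debias,
                       min_bias_drop=0.0, max_quality_drop=0.0):
    """Return True if either falsifying condition holds.

    Condition A: mean bias did not drop by more than min_bias_drop.
    Condition B: mean quality fell by more than max_quality_drop.
    """
    bias_drop = mean(bias_base) - mean(bias_debias)
    quality_drop = mean(quality_base) - mean(quality_debias)
    no_bias_reduction = bias_drop <= min_bias_drop
    quality_declined = quality_drop > max_quality_drop
    return no_bias_reduction or quality_declined

# Illustrative numbers only.
supports = claim_contradicted([0.6, 0.7], [0.4, 0.5], [0.8, 0.8], [0.8, 0.8])
no_gain = claim_contradicted([0.6, 0.7], [0.6, 0.7], [0.8, 0.8], [0.8, 0.8])
worse_quality = claim_contradicted([0.6, 0.7], [0.4, 0.5], [0.8, 0.8], [0.6, 0.7])
print(supports, no_gain, worse_quality)
```

A real evaluation would add a significance test and a justified tolerance for quality loss; the zero defaults here simply mirror the wording of the criterion.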
Original abstract
Large language models (LLMs) have achieved unprecedented success due to their exceptional generative capabilities. However, because they depend on knowledge encapsulated from training corpora, they may produce hallucinations, stereotypes, and socially biased content. In particular, LLMs are prone to prejudiced responses involving race, gender, and age, which are collectively referred to as social biases. Prior studies have used fine-tuning and prompt engineering to mitigate such biases in LLMs, but these methods require additional training resources or domain knowledge to design the framework. Moreover, they may degrade the original capabilities of LLMs and often overlook the need for dynamic debiasing contexts for fairer inference. In this paper, we propose DebiasRAG, a novel tuning-free and dynamic query-specific debiasing framework based on retrieval-augmented generation (RAG). DebiasRAG improves fairness while preserving the intrinsic properties of LLMs, such as representation ability. DebiasRAG consists of three stages: (1) query-specific debiasing candidate generation; (2) context candidate pool construction; and (3) gradient-updated debiasing-guided context piece reranking. First, DebiasRAG leverages self-diagnosed bias contexts relevant to the query through regular retrieval, where the bias contexts are prepared offline by the DebiasRAG provider. Given the query-specific bias contexts, DebiasRAG reversely produces debiasing contexts, which are provided as additional fairness constraints for LLM outputs. Second, a regular RAG retrieval process produces query-related contexts from the regular RAG document database, such as a chunked Wikipedia dataset.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DebiasRAG, a tuning-free, dynamic, query-specific debiasing framework for LLMs that uses retrieval-augmented generation. It consists of three stages: (1) offline-prepared bias contexts are self-diagnosed from the query and reversed to generate debiasing contexts as fairness constraints; (2) a standard RAG process builds a context candidate pool from a document database such as chunked Wikipedia; and (3) gradient-updated reranking selects debiasing context pieces to guide generation. The central claim is that this process reduces social biases (race, gender, age) while preserving intrinsic LLM properties such as representation ability, without fine-tuning or prompt engineering.
Significance. If the empirical claims hold, the work would offer a practical, resource-light alternative to fine-tuning for bias mitigation that can be applied at inference time across models. The RAG-based, query-specific approach addresses limitations of static debiasing methods and could improve fairness in deployed systems while avoiding capability degradation.
major comments (2)
- [Abstract, §3 (Method)] The manuscript asserts that reversed bias contexts function as effective fairness constraints and that gradient-updated reranking preserves representation ability, yet no bias metrics, perplexity measurements, downstream task accuracies, ablation studies, or baseline comparisons are reported anywhere in the paper. Without these, the central claim that fairness improves while intrinsic properties are preserved remains unsupported.
- [§3.3 (Stage 3)] The gradient-updated debiasing-guided context piece reranking is presented as selecting useful pieces without degrading model output quality, but the text provides no analysis showing that this step leaves the original LLM distribution unchanged or that it avoids introducing quality loss, which directly bears on the preservation claim.
minor comments (2)
- [§3] The description of the three stages would be clearer if accompanied by pseudocode or a diagram illustrating the flow from bias-context reversal through reranking to final generation.
- [§2] Related-work section could explicitly contrast the proposed method with recent RAG-based fairness techniques to better position the novelty.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We agree that the current manuscript requires empirical validation to support its central claims and have revised the paper to include the necessary evaluations.
Point-by-point responses
- Referee: [Abstract, §3 (Method)] The manuscript asserts that reversed bias contexts function as effective fairness constraints and that gradient-updated reranking preserves representation ability, yet no bias metrics, perplexity measurements, downstream task accuracies, ablation studies, or baseline comparisons are reported anywhere in the paper. Without these, the central claim that fairness improves while intrinsic properties are preserved remains unsupported.
  Authors: We agree that the initial submission focused on describing the DebiasRAG methodology without including quantitative results, leaving the claims of improved fairness and preserved representation ability without direct empirical support. In the revised manuscript we have added a dedicated Experiments section reporting: bias metrics (e.g., stereotype and fairness scores on race/gender/age queries), perplexity and generation-quality measures, downstream task accuracies, ablation studies isolating each of the three stages, and comparisons against fine-tuning and prompt-engineering baselines. These results demonstrate that the reversed debiasing contexts and reranking improve fairness while maintaining intrinsic model properties.
  Revision: yes
- Referee: [§3.3 (Stage 3)] The gradient-updated debiasing-guided context piece reranking is presented as selecting useful pieces without degrading model output quality, but the text provides no analysis showing that this step leaves the original LLM distribution unchanged or that it avoids introducing quality loss, which directly bears on the preservation claim.
  Authors: We acknowledge that §3.3 describes the gradient-updated reranking but does not provide explicit analysis of its effect on output distribution or quality. The revised manuscript now includes targeted evaluations comparing perplexity, coherence, and distributional similarity (via KL divergence on token probabilities) before and after reranking. These measurements confirm that the reranking step selects debiasing contexts without materially altering the original LLM distribution or introducing measurable quality degradation.
  Revision: yes
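The two quantities named in the responses, perplexity and KL divergence on token probabilities, have standard definitions that can be computed as follows. This is a minimal sketch: the source does not say in which direction the KL divergence is taken or how it is aggregated over token positions, so the direction shown here (baseline distribution first) and the single-position example are assumptions, and the probability values are illustrative.

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary.

    Terms with p_i == 0 contribute nothing; q_i is floored at eps to avoid
    division by zero.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

# Illustrative next-token distributions at one position (not real model output).
p_before = [0.5, 0.3, 0.2]   # without the reranked debiasing context
p_after = [0.45, 0.35, 0.2]  # with the reranked debiasing context

ppl_uniform4 = perplexity([math.log(0.25)] * 4)
kl_same = kl_divergence(p_before, p_before)
kl_shift = kl_divergence(p_before, p_after)
print(ppl_uniform4, kl_same, kl_shift)
```

A KL value near zero at every position would indicate that the added context barely changes the model's next-token distribution; how small counts as "not materially altered" is a threshold the evaluation would need to justify.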
Circularity Check
No significant circularity; procedural framework is self-contained
The paper describes a three-stage procedural method (query-specific debiasing candidate generation, context pool construction, and gradient-updated reranking) without equations, fitted parameters, predictions, or first-principles derivations. No load-bearing steps reduce by construction to self-defined inputs, self-citations, or renamed known results. The central claim that fairness improves while preserving LLM properties rests on the design of the stages rather than any internal reduction or unverified uniqueness theorem. This is the normal case of a self-contained method proposal.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Offline-prepared bias contexts relevant to a query can be self-diagnosed and reversed to produce effective debiasing contexts that serve as fairness constraints.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "DebiasRAG consists of three stages: (1) query-specific debiasing candidate generation; (2) context candidate pool construction; and (3) gradient-updated debiasing-guided context piece reranking."
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "Given the query-specific bias contexts, DebiasRAG reversely produces debiasing contexts, which are provided as additional fairness constraints for LLM outputs."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.