From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

arxiv: 2507.07847 · v3 · submitted 2025-07-10 · 💻 cs.CL · cs.AI

Youngjoon Jang, Seongtae Hong, Junyoung Son, Sungjin Park, Chanjun Park, Heuiseok Lim

Pith reviewed 2026-05-19 05:27 UTC · model grok-4.3

keywords coreference resolution · retrieval-augmented generation · question answering · natural language processing · entity disambiguation · large language models · retrieval effectiveness

The pith

Coreference resolution improves retrieval effectiveness and question-answering performance in RAG systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether resolving references such as pronouns in retrieved documents reduces ambiguity that interferes with how large language models use context in retrieval-augmented generation. It reports gains in both how well documents are retrieved and how accurately questions are answered once coreferences are fixed. A sympathetic reader would care because clearer references could let RAG systems deliver more consistent facts with less hallucination, especially when using smaller models that have less built-in ability to track entities.

Core claim

We demonstrate that coreference resolution enhances retrieval effectiveness and improves question-answering (QA) performance. Through comparative analysis of different pooling strategies in retrieval tasks, we find that mean pooling demonstrates superior context capturing ability after applying coreference resolution. In QA tasks, we discover that smaller models benefit more from the disambiguation process, likely due to their limited inherent capacity for handling referential ambiguity.
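To make the pooling comparison concrete, here is a minimal sketch of mean pooling next to last-token pooling over a sequence of token embeddings. The function names, the toy vectors, and the choice of last-token pooling as the point of comparison are illustrative assumptions; the abstract does not say which alternatives the paper evaluates or how they are implemented.

```python
from typing import List

Vector = List[float]


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of all non-padding tokens."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def last_token_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Return the embedding of the final non-padding token."""
    for vec, keep in zip(reversed(token_embeddings), reversed(attention_mask)):
        if keep:
            return list(vec)
    raise ValueError("attention_mask selects no tokens")


# Toy 2-dimensional embeddings (made up): three real tokens plus one padding slot.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]

mean_vec = mean_pool(tokens, mask)        # every real token contributes equally
last_vec = last_token_pool(tokens, mask)  # only the final real token contributes
```

Under mean pooling every non-padding token feeds into the passage vector, so replacing a pronoun with an explicit entity mention changes the pooled representation directly. That is one possible reading of the reported result, not an explanation taken from the paper.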

What carries the argument

Coreference resolution applied as preprocessing to reduce referential ambiguity in retrieved documents, enabling clearer in-context learning for the generation step.
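As a rough illustration of this preprocessing step, the sketch below rewrites a tokenized passage so that every later mention in a coreference cluster is replaced by the cluster's first mention. The cluster format (lists of token spans) and the hand-written clusters are assumptions made for the example; in practice they would come from a coreference resolver, and the paper's actual tooling is not described in the abstract.

```python
from typing import Dict, List, Tuple

# A mention is a (start, end) token span with an exclusive end index.
Span = Tuple[int, int]


def resolve_coreferences(tokens: List[str], clusters: List[List[Span]]) -> List[str]:
    """Replace every later mention in a cluster with the cluster's first mention."""
    replacements: Dict[int, Tuple[int, List[str]]] = {}
    for cluster in clusters:
        ordered = sorted(cluster)
        head_start, head_end = ordered[0]
        head_tokens = tokens[head_start:head_end]
        for start, end in ordered[1:]:
            replacements[start] = (end, head_tokens)

    resolved: List[str] = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, head_tokens = replacements[i]
            resolved.extend(head_tokens)  # write the explicit entity
            i = end                       # skip the original mention
        else:
            resolved.append(tokens[i])
            i += 1
    return resolved


tokens = "Marie Curie won the prize . She shared it with Pierre .".split()
# Hand-written clusters standing in for a resolver's output (assumption).
clusters = [[(0, 2), (6, 7)], [(3, 5), (8, 9)]]
resolved_tokens = resolve_coreferences(tokens, clusters)
resolved_text = " ".join(resolved_tokens)
```

In this toy passage the rewrite turns "She shared it" into "Marie Curie shared the prize" and raises the token count from 12 to 14, which shows that the step changes passage length as well as referential clarity.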

If this is right

Mean pooling yields better retrieval results than other strategies once coreferences are resolved.
Smaller language models receive larger QA accuracy gains from the disambiguation than larger models.
Overall RAG response quality rises because referential ambiguity no longer interferes with factual grounding.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Standard RAG pipelines might adopt coreference resolution as routine preprocessing for any knowledge-intensive task.
The same disambiguation step could reduce errors in related settings that also rely on long retrieved contexts.
Testing across varied document domains would show whether the reported gains remain stable outside the evaluated setting.

Load-bearing premise

Coreferential complexity in retrieved documents is the main source of ambiguity that disrupts performance, and resolving it yields gains independent of document length or domain.

What would settle it

Compare retrieval precision and QA accuracy on the same document set with and without the coreference resolution step; a clear drop without it would support the claim.
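A with/without comparison of this kind could be scripted along the following lines. Precision@k and exact-match accuracy are assumed stand-ins for "retrieval precision" and "QA accuracy", since the abstract does not name the paper's metrics, and every ranking and answer below is invented toy data rather than a result from the paper.

```python
from typing import Dict, List, Set


def precision_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = ranked_ids[:k]
    return sum(doc_id in relevant_ids for doc_id in top_k) / k


def exact_match_accuracy(predictions: List[str], gold: List[str]) -> float:
    """Share of predictions equal to the gold answer, ignoring case and outer spaces."""
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predictions, gold))
    return hits / len(gold)


# Invented toy data: same query set and corpus, two preprocessing conditions.
relevant = {"d1", "d4"}
gold = ["paris", "1903", "radium"]
runs = {
    "original": {"ranking": ["d2", "d1", "d3", "d4"], "answers": ["Paris", "1911", "polonium"]},
    "resolved": {"ranking": ["d1", "d4", "d2", "d3"], "answers": ["Paris", "1903", "polonium"]},
}

report: Dict[str, Dict[str, float]] = {
    name: {
        "precision@2": precision_at_k(run["ranking"], relevant, k=2),
        "exact_match": exact_match_accuracy(run["answers"], gold),
    }
    for name, run in runs.items()
}

# Positive deltas mean the resolved condition scored higher.
delta = {metric: report["resolved"][metric] - report["original"][metric]
         for metric in report["original"]}
```

Keeping the queries, corpus, and retriever fixed across both conditions is what makes the deltas attributable to the preprocessing step rather than to a change in the evaluation set.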

Figures

Figures reproduced from arXiv: 2507.07847 by Chanjun Park, Heuiseok Lim, Junyoung Son, Seongtae Hong, Sungjin Park, Youngjoon Jang.

Original abstract

Retrieval-Augmented Generation (RAG) has emerged as a crucial framework in natural language processing (NLP), improving factual consistency and reducing hallucinations by integrating external document retrieval with large language models (LLMs). However, the effectiveness of RAG is often hindered by coreferential complexity in retrieved documents, introducing ambiguity that disrupts in-context learning. In this study, we systematically investigate how entity coreference affects both document retrieval and generative performance in RAG-based systems, focusing on retrieval relevance, contextual understanding, and overall response quality. We demonstrate that coreference resolution enhances retrieval effectiveness and improves question-answering (QA) performance. Through comparative analysis of different pooling strategies in retrieval tasks, we find that mean pooling demonstrates superior context capturing ability after applying coreference resolution. In QA tasks, we discover that smaller models benefit more from the disambiguation process, likely due to their limited inherent capacity for handling referential ambiguity. With these findings, this study aims to provide a deeper understanding of the challenges posed by coreferential complexity in RAG, providing guidance for improving retrieval and generation in knowledge-intensive AI applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Coreference resolution boosts RAG metrics here, but length changes introduced by the step could explain part of the lift.

The main thing to know is that this paper reports clear gains from running coreference resolution on retrieved documents before feeding them into RAG: better retrieval relevance and stronger QA answers, with the biggest relative improvements for smaller models and when using mean pooling. The authors compare several pooling options and show how model size moderates the effect, which is a useful practical observation for anyone tuning retrieval pipelines on modest hardware. The experiments appear systematic enough to give practitioners a low-cost preprocessing idea worth testing in their own setups. The work stays empirical and cites the relevant RAG and coreference literature without stretching into new theory.

The soft spot is the length confound that the stress test flags. Coreference resolution routinely alters token counts and passage structure, and if the with-resolution and without-resolution conditions were not length-matched or ablated for those side effects, some of the measured improvement could trace to changed context length or shifted embedding statistics rather than disambiguation alone. The abstract gives no sign of those controls, so the methods section needs close checking to confirm the mechanism is isolated.

This paper is aimed at engineers and researchers who already run RAG and want simple, measurable tweaks rather than new architectures. A reader focused on empirical refinements for knowledge-intensive tasks would get concrete takeaways on pooling and model-size interactions. It deserves peer review so referees can verify the controls, run the statistics, and see whether the gains hold after length is held constant.

Referee Report

1 major / 1 minor

Summary. The manuscript examines the role of coreference resolution as a preprocessing step in Retrieval-Augmented Generation (RAG) pipelines. It claims that resolving referential ambiguity in retrieved documents improves retrieval effectiveness (particularly under mean pooling) and downstream QA performance, with smaller LLMs showing larger gains from the disambiguation process.

Significance. If the central empirical claims hold after addressing potential confounds, the work would offer a concrete, low-cost intervention for reducing ambiguity in RAG contexts and would supply practical guidance on when coreference resolution is most beneficial (e.g., for smaller models). The absence of machine-checked proofs or parameter-free derivations is expected for an empirical study; the value would lie in reproducible experimental comparisons.

major comments (1)

[Experimental setup and results (comparative analysis of retrieval and QA tasks)] The central claim that observed gains are attributable to removal of referential ambiguity rather than incidental side-effects of the preprocessing step is not yet supported. Coreference resolution typically shortens or restructures passages; without explicit controls that hold document length, token count, or lexical density constant across the with/without-resolution conditions, any reported lift in retrieval or QA metrics could be explained by reduced context length or altered embedding statistics instead of the intended mechanism. This issue directly undermines the attribution in the abstract and the weakest assumption identified in the reader's report.

minor comments (1)

[Abstract] The abstract asserts clear improvements yet supplies no datasets, metrics, statistical tests, baseline comparisons, or error analysis; these details must be added to the main text with sufficient precision for reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive feedback. The concern about potential confounds is well-taken, and we address it directly below by outlining planned revisions that will strengthen the attribution of observed gains to coreference resolution.

Referee: [Experimental setup and results (comparative analysis of retrieval and QA tasks)] The central claim that observed gains are attributable to removal of referential ambiguity rather than incidental side-effects of the preprocessing step is not yet supported. Coreference resolution typically shortens or restructures passages; without explicit controls that hold document length, token count, or lexical density constant across the with/without-resolution conditions, any reported lift in retrieval or QA metrics could be explained by reduced context length or altered embedding statistics instead of the intended mechanism. This issue directly undermines the attribution in the abstract and the weakest assumption identified in the reader's report.

Authors: We agree that this is a substantive concern and that the current experiments do not fully isolate referential disambiguation from incidental effects of passage restructuring or length change. Coreference resolution can indeed alter token counts and lexical properties as a byproduct. To address this, we will add controlled experiments in the revised manuscript: (1) length-matched conditions in which resolved passages are truncated or padded to match the original token length before embedding; (2) explicit reporting of average token counts, sentence lengths, and lexical density for both resolved and unresolved conditions; and (3) an auxiliary baseline that applies artificial shortening without coreference resolution to test whether length reduction alone accounts for the gains. These additions will allow readers to evaluate whether the improvements persist when length and density are held constant, thereby strengthening the causal link to ambiguity removal. We view this as a necessary and feasible revision.

revision: yes
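The length-matching and token-count controls described in the rebuttal can be sketched as follows. The helper names, the `[PAD]` token, whitespace tokenization, and the two toy passages are all assumptions for illustration; a real experiment would use the retriever's own tokenizer and padding convention.

```python
from typing import List

PAD = "[PAD]"  # hypothetical padding token; a real tokenizer defines its own


def match_length(tokens: List[str], target_len: int, pad_token: str = PAD) -> List[str]:
    """Truncate or right-pad a token list so it has exactly target_len tokens."""
    if target_len < 0:
        raise ValueError("target_len must be non-negative")
    clipped = tokens[:target_len]
    return clipped + [pad_token] * (target_len - len(clipped))


def mean_token_count(passages: List[List[str]]) -> float:
    """Average number of tokens per passage."""
    return sum(len(p) for p in passages) / len(passages)


# Invented toy passages, tokenized on whitespace.
original = [
    "Marie Curie won the prize . She shared it with Pierre .".split(),
    "The lab opened in 1914 . It trained many chemists .".split(),
]
resolved = [
    "Marie Curie won the prize . Marie Curie shared the prize with Pierre .".split(),
    "The lab opened in 1914 . The lab trained many chemists .".split(),
]

# Control (1): force each resolved passage to the length of its original.
length_matched = [match_length(r, len(o)) for r, o in zip(resolved, original)]

# Control (2): report average token counts for each condition.
stats = {
    "original": mean_token_count(original),
    "resolved": mean_token_count(resolved),
    "length_matched": mean_token_count(length_matched),
}
```

Control (3) could reuse `match_length` on the unresolved passages with a smaller target length. One caveat with plain truncation is that it drops trailing content from the resolved passage, so a length-matched condition may itself need checking for lost answer spans.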

Circularity Check

0 steps flagged

No circularity: purely empirical comparison with no derivations or self-referential reductions

The paper conducts an empirical investigation of coreference resolution's impact on RAG retrieval and QA performance via direct before/after comparisons and pooling strategy analysis. No equations, derivations, fitted parameters, or self-citations are invoked as load-bearing premises for the central claims; results derive from measured metrics on external benchmarks rather than reducing to inputs by construction. The work is self-contained against independent experimental controls and does not rely on uniqueness theorems, ansatzes smuggled via citation, or renaming of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is an empirical study whose central claims rest on the assumption that coreference resolution produces measurable disambiguation benefits; no free parameters, invented entities, or non-standard mathematical axioms are visible from the abstract.

axioms (1)

domain assumption: Coreference resolution tools can accurately replace ambiguous references with explicit entities in retrieved documents.
The improvement claims presuppose that the resolution step itself is reliable and does not introduce new errors.

pith-pipeline@v0.9.0 · 5743 in / 1285 out tokens · 96086 ms · 2026-05-19T05:27:35.461765+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We demonstrate that coreference resolution enhances retrieval effectiveness and improves question-answering (QA) performance... mean pooling demonstrates superior context capturing ability after applying coreference resolution."

IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "smaller models benefit more from the disambiguation process"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

