Title resolution pending

Lost in the middle, in-between: Enhancing language models’ ability to reason over long contexts in multi-hop QA · 2025 · arXiv 2412.10079

3 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 1

citation-polarity summary

background: 1

representative citing papers

Failure Modes in Multi-Hop QA: The Weakest Link Effect and the Recognition Bottleneck

cs.AI · 2026-01-18 · unverdicted · novelty 6.0

LLMs exhibit a Weakest Link Effect in multi-hop QA where performance collapses to the least visible evidence position; MFAI resolves recognition bottlenecks with up to 11.49% gains in low-visibility spots.

Context Convergence Improves Answering Inferential Questions

cs.CL · 2026-05-12 · unverdicted · novelty 5.0

Passages made from high-convergence sentences improve LLM performance on inferential questions compared to cosine similarity selection.

UserGPT Technical Report

cs.IR · 2026-05-09 · unverdicted · novelty 5.0

UserGPT introduces a generative LLM framework with a behavior simulation engine, semantization module, and DF-GRPO post-training that scores 0.7325 on tag prediction and 0.7528 on summary generation on HPR-Bench while compressing records by up to 97.9%.

citing papers explorer

Showing 3 of 3 citing papers.

Failure Modes in Multi-Hop QA: The Weakest Link Effect and the Recognition Bottleneck
cs.AI · 2026-01-18 · unverdicted · none · ref 2

Context Convergence Improves Answering Inferential Questions
cs.CL · 2026-05-12 · unverdicted · none · ref 2

UserGPT Technical Report
cs.IR · 2026-05-09 · unverdicted · none · ref 55
