Task Matters: Knowledge Requirements Shape LLM Responses to Context-Memory Conflict

Kaiser Sun , Fan Bai , Mark Dredze

Authors on Pith no claims yet

classification 💻 cs.CL cs.AI

keywords knowledgeconflictcontextllmstaskscontext-memorycontextualevaluation

read the original abstract

Large language models (LLMs) draw on both contextual information and parametric memory, yet these sources can conflict. Prior studies have largely examined this issue in contextual question answering, implicitly assuming that tasks should rely on the provided context, leaving unclear how LLMs behave when tasks require different types and degrees of knowledge utilization. We address this gap with a model-agnostic diagnostic framework that holds underlying knowledge constant while introducing controlled conflicts across tasks with varying knowledge demands. Experiments on representative open-weight and proprietary LLMs show that performance degradation under conflict is driven by both task-specific knowledge reliance and conflict plausibility; that strategies such as rationales or context reiteration increase context reliance, helping context-only tasks but harming those requiring parametric knowledge; and that these effects bias model-based evaluation, calling into question the reliability of LLMs as judges. Overall, our findings reveal that context-memory conflict is inherently task-dependent and motivate task-aware approaches to balancing context and memory in LLM deployment and evaluation.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

CyBiasBench: Benchmarking Bias in LLM Agents for Cyber-Attack Scenarios
cs.CR 2026-05 unverdicted novelty 7.0

LLM agents exhibit persistent attack-selection biases as fixed traits independent of success rates, with a bias momentum effect that resists steering and yields no performance gain.