Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Time introduces fundamental challenges in model development and deployment: models are usually trained on historical data but deployed on future data, where semantic distributions and domain knowledge may have evolved. Existing studies either overlook temporal shifts or only weakly capture the rich shifting patterns of both semantics and knowledge. We develop Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation (KARITA) to capture diverse temporal shifts (e.g., uncertainty and feature shift), construct and integrate rich knowledge sources (e.g., medical ontologies such as MeSH), and leverage shifting insights for selecting-retrieval augmented learning. We evaluate KARITA on classification tasks over clinical, legal, and scientific corpora, demonstrating consistent improvements from temporal adaptation across these domains. Our results show that knowledge integration can be more critical and effective in temporal augmentation and learning.
fields
cs.CL 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
- What Makes a Medical Checker Trainable? Diagnosing Signal Collapse and Reward Hacking in Checker-Guided RAG for Biomedical QA
Empirical comparison of four NLI checkers as process rewards in GRPO-trained medical RAG shows log-prob scoring collapses to neutral labels while moderate local classifiers improve BERTScore without reward hacking.