Faithfulness-QA is a 99k-sample dataset created via counterfactual entity substitution on existing QA benchmarks to train and evaluate context-faithful RAG models.
ClashEval: Quantifying the tug-of-war between an LLM’s internal prior and external evidence
2 Pith papers cite this work.