Towards Transparent AI: A Survey on Explainable Large Language Models

Towards Transparent AI: A Survey on Explainable Large Language Models · 2025 · arXiv 2506.21812

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: support 1

representative citing papers

Decision-Aware Attention Propagation for Vision Transformer Explainability

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

DAP improves ViT attribution maps by injecting decision-relevant gradients into attention propagation, producing more class-sensitive and faithful explanations than standard attention rollout.

Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs

cs.CL · 2026-04-14 · unverdicted · novelty 5.0

HETA is a new attribution framework for decoder-only LLMs that combines semantic transition vectors, Hessian-based sensitivity scores, and KL divergence to produce more faithful and human-aligned token attributions than prior methods.

Multi-agent Self-triage System with Medical Flowcharts

cs.AI · 2025-11-16 · unverdicted · novelty 5.0

A multi-agent conversational system using AMA flowcharts achieves 95.29% top-3 retrieval accuracy and 99.10% navigation accuracy on large synthetic medical conversation datasets.

Mitigating Hallucination on Hallucination in RAG via Ensemble Voting

cs.CL · 2026-03-28 · unverdicted · novelty 4.0

VOTE-RAG applies retrieval voting across diverse queries and response voting across independent generations to mitigate hallucination-on-hallucination in RAG, matching or exceeding complex baselines on six benchmarks with a parallelizable design.

Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions

cs.CY · 2026-02-27 · unverdicted · novelty 4.0

Current XAI methods for DNNs and LLMs rest on paradoxes and false assumptions that demand a paradigm shift to verification protocols, scientific foundations, context-aware design, and faithful model analysis rather than post-hoc explanations.

citing papers explorer

Showing 5 of 5 citing papers.

Decision-Aware Attention Propagation for Vision Transformer Explainability · cs.CV · 2026-04-20 · unverdicted · none · ref 8
Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs · cs.CL · 2026-04-14 · unverdicted · none · ref 30
Multi-agent Self-triage System with Medical Flowcharts · cs.AI · 2025-11-16 · unverdicted · none · ref 19
Mitigating Hallucination on Hallucination in RAG via Ensemble Voting · cs.CL · 2026-03-28 · unverdicted · none · ref 3
Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions · cs.CY · 2026-02-27 · unverdicted · none · ref 2
