Verifact: Verifying facts in LLM-generated clinical text with electronic health records. arXiv preprint arXiv:2501.16672

Philip Chung, Akshay Swaminathan, Alex J · 2025 · arXiv 2501.16672

3 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

LLM-as-a-Judge in Healthcare: A Scoping Analysis of Applications, Methods, and Human Alignment

cs.CY · 2026-05-24 · unverdicted · novelty 6.0

Scoping review of 134 studies on LLM-as-a-Judge in healthcare finds concentration in clinical decision support and NLP, frequent use of OpenAI models with prompt engineering, and moderate-to-strong human alignment where validated.

Verification Mirage: Mapping the Reliability Boundary of Self-Verification in Medical VQA

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

Self-verification in medical VQA creates a verification mirage where verifiers exhibit high error and agreement bias on wrong answers, with reliability strongly conditioned on task type.

Deployment-Centered Evaluation: Predicting Query-Level Rejection Risk in a Clinical LLM System

cs.AI · 2026-06-10 · unverdicted · novelty 5.0

A pre-response classifier predicts user rejection risk for clinical LLM outputs with AUROC 0.719 over 4.5 months of deployment data by incorporating deployment-specific context.

citing papers explorer


LLM-as-a-Judge in Healthcare: A Scoping Analysis of Applications, Methods, and Human Alignment

cs.CY · 2026-05-24 · unverdicted · none · ref 28

Scoping review of 134 studies on LLM-as-a-Judge in healthcare finds concentration in clinical decision support and NLP, frequent use of OpenAI models with prompt engineering, and moderate-to-strong human alignment where validated.

Verification Mirage: Mapping the Reliability Boundary of Self-Verification in Medical VQA

cs.CV · 2026-05-11 · unverdicted · none · ref 4

Self-verification in medical VQA creates a verification mirage where verifiers exhibit high error and agreement bias on wrong answers, with reliability strongly conditioned on task type.

Deployment-Centered Evaluation: Predicting Query-Level Rejection Risk in a Clinical LLM System

cs.AI · 2026-06-10 · unverdicted · none · ref 8

A pre-response classifier predicts user rejection risk for clinical LLM outputs with AUROC 0.719 over 4.5 months of deployment data by incorporating deployment-specific context.
