
arxiv: 2605.26483 · v1 · pith:VCUVSVBKnew · submitted 2026-05-26 · 💻 cs.CV

Clinically-Grounded Counterfactual Reasoning for Medical Video Diagnosis

classification 💻 cs.CV
keywords counterfactual · clinical · diagnosis · diagnostic · medvcr · video · learning · medical

Medical video diagnosis involves inferring clinical decisions from dynamic tissue responses throughout the examination process. Existing methods rely on an end-to-end learning paradigm that i) focuses on appearance rather than pathology, ii) lacks clinical priors, and iii) reasons solely from observations without counterfactual comparison. This work introduces MedVCR, a counterfactual reasoning framework that mimics clinical diagnostic thinking. MedVCR comprises three components: a Counterfactual Generator that synthesizes tissue evolution under specified pathological states through diffusion-based generation; a Counterfactual Representation Learning module that encodes diagnostic knowledge through clinical rules (i.e., temporal consistency, pathological separability, and counterfactual alignment); and a Dual Diagnostic Prediction strategy that integrates video-level assessment with frame-level counterfactual analysis. MedVCR is evaluated under both fully supervised (e.g., colposcopy) and weakly supervised (e.g., colonoscopy) video diagnosis settings, yielding performance gains of 2.6% to 10.2% over leading baselines. Comprehensive ablation studies further validate the effectiveness of each component. The code will be released.
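The abstract does not specify how the Dual Diagnostic Prediction strategy combines its two signals, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each frame embedding is compared against counterfactual embeddings generated for every candidate pathological state, that closeness is turned into per-frame probabilities with a softmax, and that the frame-level average is blended with a video-level prediction by a weight `alpha`. The function names, the squared-distance score, and `alpha` are all hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sq_dist(a, b):
    # Squared Euclidean distance between two embeddings.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def frame_level_probs(frames, counterfactuals):
    # frames: T observed frame embeddings.
    # counterfactuals: for each frame, K embeddings, one per hypothesized state.
    # Assumption: a state is more likely when its counterfactual is closer
    # to the observed frame.
    num_states = len(counterfactuals[0])
    avg = [0.0] * num_states
    for frame, cf_per_state in zip(frames, counterfactuals):
        probs = softmax([-sq_dist(frame, cf) for cf in cf_per_state])
        for k, p in enumerate(probs):
            avg[k] += p / len(frames)
    return avg

def dual_prediction(video_probs, frame_probs, alpha=0.5):
    # Hypothetical fusion rule: convex combination of the two distributions.
    fused = [alpha * v + (1 - alpha) * f for v, f in zip(video_probs, frame_probs)]
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused

# Toy example with two frames, two states, and 2-D embeddings (made-up numbers).
frames = [[0.9, 0.1], [0.8, 0.2]]
counterfactuals = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
video_probs = [0.4, 0.6]
label, fused = dual_prediction(video_probs, frame_level_probs(frames, counterfactuals))
print(label, fused)
```

In this toy input both signals favor state 1, so the fused prediction does too. A convex combination keeps the fused scores a valid probability distribution, which is one reason it is a common default for late fusion; the paper may use a different rule.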
