Semantic image synthesis with spatially-adaptive normalization
4 Pith papers cite this work.
Citing papers by year: 2026 (4)
Citing papers
- How Robust is OCR-Reasoning? Evaluating OCR-Reasoning Robustness of Vision-Language Models under Visual Perturbations
  Introduces the OCR-Robust benchmark and evaluates 18 VLMs, showing that clean accuracy does not guarantee robustness, with charts and tables more fragile than documents under selected perturbations.
- From Plausibility to Verifiability: Risk-Controlled Generative OCR with Vision-Language Models
  A model-agnostic Geometric Risk Controller reduces extreme errors in VLM-based OCR by requiring cross-view consensus before accepting outputs.
- The CIFAR Synthetic Evidence Corpus for Detecting AI-Generated Evidence
  Introduces the CIFAR Synthetic Evidence Corpus, a multi-family dataset of AI-manipulated documents with source-separated train/test splits for evaluating detectors of AI-generated legal evidence.
- 3D Conditional Image Synthesis of Left Atrial LGE MRI from Composite Semantic Masks
  SPADE-LDM conditional synthesis from composite semantic masks produces realistic 3D LGE MRI that raises LA cavity Dice from 0.908 to 0.936.