arxiv: 2511.00801 · v4 · pith:PDY2WJJInew · submitted 2025-11-02 · 💻 cs.CV · cs.MM

Med-Banana: Learning Quality-Controlled Medical Image Editing from Success-and-Failure Trajectories

classification 💻 cs.CV cs.MM
keywords: editing, image, medical, med-banana, accepted, attempts, editors, quality-controlled
Text-guided medical image editing must satisfy the requested pathology while preserving anatomy, modality-specific appearance, and clinical plausibility. However, existing datasets largely supervise editors with final accepted edits and discard the failed attempts produced during generation. We argue that these failures provide essential supervision for quality control: they specify what should be rejected, why an edit is medically or visually invalid, and how the instruction should be revised. We present Med-Banana, a trajectory-supervised framework for quality-controlled medical image editing. We introduce Med-Banana-80K, a large-scale resource of success-and-failure editing trajectories with candidate images, verification outcomes, rejection reasons, and prompt refinements. Building on it, Med-Banana jointly trains an editor, verifier, and refiner, enabling edit-verify-refine inference from accepted and rejected attempts. Experiments across MLLM judges, blind expert assessment, source-preservation and real-synthetic separability probes demonstrate consistent improvements over open medical image editors. Code and data are publicly available.
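
The edit-verify-refine loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Attempt` and `Trajectory` records, the `edit_verify_refine` function, the three callable signatures, the `max_attempts` budget, and the toy string-based editor, verifier, and refiner are all hypothetical stand-ins chosen for illustration. The sketch only shows how a trajectory could retain rejected candidates, their rejection reasons, and the refined prompts alongside the final accepted edit.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One step of a trajectory (hypothetical schema, not the dataset's format)."""
    prompt: str
    candidate: str                      # stand-in for an edited image
    accepted: bool
    rejection_reason: Optional[str] = None


@dataclass
class Trajectory:
    """All attempts for one source image and instruction, failures included."""
    source: str
    instruction: str
    attempts: List[Attempt] = field(default_factory=list)

    def accepted_attempt(self) -> Optional[Attempt]:
        # Return the first accepted attempt, or None if every attempt failed.
        return next((a for a in self.attempts if a.accepted), None)


# Assumed interfaces for the three components named in the abstract.
Editor = Callable[[str, str], str]                                 # (source, prompt) -> candidate
Verifier = Callable[[str, str, str], Tuple[bool, Optional[str]]]   # -> (accepted, reason)
Refiner = Callable[[str, str], str]                                # (prompt, reason) -> new prompt


def edit_verify_refine(source: str, instruction: str, editor: Editor,
                       verifier: Verifier, refiner: Refiner,
                       max_attempts: int = 3) -> Trajectory:
    """Run the loop until a candidate is accepted or the budget is spent."""
    trajectory = Trajectory(source=source, instruction=instruction)
    prompt = instruction
    for _ in range(max_attempts):       # max_attempts is an assumed budget
        candidate = editor(source, prompt)
        accepted, reason = verifier(source, candidate, instruction)
        trajectory.attempts.append(Attempt(prompt, candidate, accepted, reason))
        if accepted:
            break
        # Rejected: revise the prompt using the rejection reason and retry.
        prompt = refiner(prompt, reason or "")
    return trajectory


# Toy components operating on strings, purely for demonstration.
def toy_editor(source: str, prompt: str) -> str:
    return f"{source}|{prompt}"


def toy_verifier(source: str, candidate: str, instruction: str):
    if "preserve anatomy" in candidate:
        return True, None
    return False, "anatomy not preserved"


def toy_refiner(prompt: str, reason: str) -> str:
    return f"{prompt}; preserve anatomy"


trajectory = edit_verify_refine("chest_xray", "add pleural effusion",
                                toy_editor, toy_verifier, toy_refiner)
for attempt in trajectory.attempts:
    print(attempt.accepted, attempt.prompt, attempt.rejection_reason)
```

With these toy components the first attempt is rejected and the second, using the refined prompt, is accepted. Both remain in the trajectory, which is the kind of success-and-failure record the abstract argues should be kept for supervision rather than discarded.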

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. LLM-as-a-Judge in Healthcare: A Scoping Analysis of Applications, Methods, and Human Alignment

    cs.CY 2026-05 unverdicted novelty 6.0

    Scoping review of 134 studies on LLM-as-a-Judge in healthcare finds concentration in clinical decision support and NLP, frequent use of OpenAI models with prompt engineering, and moderate-to-strong human alignment whe...

  2. Beyond Visual Forensics: Auditing Multimodal Robustness for Synthetic Medical Image Detection

    cs.CV 2026-06 unverdicted novelty 5.0

    VLMs for synthetic medical image detection overweight accompanying metadata, as demonstrated by a new paired benchmark that fixes the image and varies controlled text variants across imaging modalities.