Applied Sciences 15(6) (2025). https://doi
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Noise-Aware Visual Representation Learning for Medical Visual Question Answering
  Denoising autoencoder pretraining on corrupted visual embeddings yields more robust Med-VQA performance on SLAKE and PathVQA while using LoRA for efficient LLM adaptation.
- Parameter-Efficient VLMs for Gastrointestinal Endoscopy: Medical Image Generation and Clinical Visual Question Answering
  Applies PEFT to Florence-2 for GI endoscopy VQA and LoRA-adapted Stable Diffusion 2.1 for synthetic image generation, reporting ROUGE/BLEU gains and image quality metrics on Kvasir-VQA.