LoRA: Low-rank adaptation of large language models
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: method (1)
polarities: use method (1)
representative citing papers
CGSD framework reaches 87.5% accuracy and 0.731 macro F1 on APTOS 2019 by conditioning diffusion denoising on dot-product vectors from image features and DR-grade text descriptions.
citing papers explorer
- VitaTouch: Property-Aware Vision-Tactile-Language Model for Robotic Quality Inspection in Manufacturing
  VitaTouch combines vision-tactile encoders with a dual Q-Former and contrastive alignment to an LLM, achieving 88.89% hardness and 75.13% roughness accuracy on a new 186-object dataset plus 94% success in robotic sorting trials.
- Cross-Modal Semantic-Enhanced Diffusion Framework for Diabetic Retinopathy Grading
  CGSD framework reaches 87.5% accuracy and 0.731 macro F1 on APTOS 2019 by conditioning diffusion denoising on dot-product vectors from image features and DR-grade text descriptions.