Food-500 cap: A fine-grained food caption benchmark for evaluating vision-language models
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images
Presents the FoodSense dataset and the FoodSense-VL model for predicting multisensory food properties and explanations directly from images.