Food-500 Cap: A fine-grained food caption benchmark for evaluating vision-language models

Zheng Ma, Mianzhi Pan, Wenhan Wu, Kanzhi Cheng, Jianbing Zhang, Shujian Huang, Jiajun Chen · 2023

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

cs.CV · 2026-04-15 · unverdicted · novelty 7.0

Presents FoodSense dataset and FoodSense-VL model for predicting multisensory food properties and explanations directly from images.

FoodSense: A Multisensory Food Dataset and Benchmark for Predicting Taste, Smell, Texture, and Sound from Images
cs.CV · 2026-04-15 · unverdicted · none · ref 20