Gonzalez, Ion Stoica, and Eric P

Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E · 2023

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Evaluating Object Hallucination in Large Vision-Language Models

cs.CV · 2023-05-17 · accept · novelty 7.0

Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.

TempCompass: Do Video LLMs Really Understand Videos?

cs.CV · 2024-03-01 · unverdicted · novelty 6.0

TempCompass benchmark reveals that state-of-the-art Video LLMs have poor ability to perceive temporal aspects such as speed, direction, and ordering in videos.

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

cs.CL · 2023-05-23 · conditional · novelty 6.0

UltraChat supplies 1.5 million high-quality multi-turn dialogues that, when used to fine-tune LLaMA, produce UltraLLaMA, which outperforms prior open-source chat models including Vicuna.

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

cs.CL · 2024-10-07 · unverdicted · novelty 5.0

SFTMix applies mixup regularization to confidence-stratified interpolated examples during LLM instruction tuning to achieve consistent gains across models and datasets.

citing papers explorer

Showing 4 of 4 citing papers.

Evaluating Object Hallucination in Large Vision-Language Models

cs.CV · 2023-05-17 · accept · none · ref 9

Large vision-language models exhibit severe object hallucination that varies with training instructions, and the proposed POPE polling method evaluates it more stably and flexibly than prior approaches.

TempCompass: Do Video LLMs Really Understand Videos?

cs.CV · 2024-03-01 · unverdicted · none · ref 78

TempCompass benchmark reveals that state-of-the-art Video LLMs have poor ability to perceive temporal aspects such as speed, direction, and ordering in videos.

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

cs.CL · 2023-05-23 · conditional · none · ref 235

UltraChat supplies 1.5 million high-quality multi-turn dialogues that, when used to fine-tune LLaMA, produce UltraLLaMA, which outperforms prior open-source chat models including Vicuna.

SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe

cs.CL · 2024-10-07 · unverdicted · none · ref 14

SFTMix applies mixup regularization to confidence-stratified interpolated examples during LLM instruction tuning to achieve consistent gains across models and datasets.
