LLaMA: Open and Efficient Foundation Language Models
5 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
Citation polarities: background (2)
citing papers explorer
- Do We Really Need External Tools to Mitigate Hallucinations? SIRA: Shared-Prefix Internal Reconstruction of Attribution
  SIRA mitigates hallucinations in LVLMs by internally contrasting full visual access against a masked late-layer branch that retains shared context but lacks fine-grained visual evidence.
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  FineWeb is a curated 15T-token web dataset that produces stronger LLMs than prior open collections, while its educational subset sharply improves performance on MMLU and ARC benchmarks.
- TableMaster: A Recipe to Advance Table Understanding with Language Models
  TableMaster improves LM table understanding by verbalizing tables with enriched semantics and using adaptive textual-symbolic reasoning, reaching 78.13% accuracy on WikiTQ with GPT-4o-mini.
- ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning
  ROS-LLM integrates LLMs with ROS to let non-experts specify robot tasks in natural language, supporting sequence, behavior tree, and state machine modes plus imitation learning and reflection on feedback.
- ModelScope Text-to-Video Technical Report
  ModelScopeT2V is a 1.7-billion-parameter text-to-video model built on Stable Diffusion that adds temporal modeling and outperforms prior methods on three evaluation metrics.