Next token prediction towards multimodal intelligence: A comprehensive survey. arXiv preprint arXiv:2412.18619, 2024a
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: background 1
citing papers explorer
- PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience
  PseudoBench shows current LLM agents produce persuasive pseudoscientific reports with near-zero refusal rates and at most 27.4% resistance.
- PathAR: Structure-First Autoregressive Synthesis of Multimodal Pathology Images
  PathAR factorizes structure and appearance tokens via Dual-VQ and IAR transformer for modality-conditioned pathology image synthesis with improved structural consistency.
- SimReg: Achieving Higher Performance in the Pretraining via Embedding Similarity Regularization
  SimReg regularization accelerates LLM pretraining convergence by over 30% and raises average zero-shot performance by over 1% across benchmarks.
- Toward Native Multimodal Modeling: A Roadmap
  A roadmap that defines architectural nativity for multimodal models and categorizes them into Multi-to-Text, Multi-to-Target, and Multi-to-Multi types while outlining an industrial pipeline toward unified transformer-based native multimodal modeling.
- On The Landscape of Spoken Language Models: A Comprehensive Survey
  A literature survey that organizes spoken language models by architecture, training, and evaluation choices and identifies key challenges and future directions.
- NITP: Next Implicit Token Prediction for LLM Pre-training
- WISE: A World Knowledge-Informed Semantic Evaluation for Text-to-Image Generation