Infimm-webmath-40b: Advancing multimodal pre-training for enhanced mathematical reasoning

Xiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, et al. · 2024 · arXiv 2409.12568

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems

cs.CV · 2025-03-19 · unverdicted · novelty 6.0

MathFlow decouples perception and inference stages in MLLMs for visual math, with a dedicated perception model delivering gains on the FlowVerse benchmark when paired with existing reasoners.

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

cs.CL · 2025-02-04 · unverdicted · novelty 5.0

SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.

Music Audio-Visual Question Answering Requires Specialized Multimodal Designs

cs.SD · 2025-05-27 · unverdicted · novelty 3.0

Survey of Music AVQA finds specialized input processing, dedicated spatial-temporal designs, and music-specific modeling are critical for strong performance.

citing papers explorer

Showing 3 of 3 citing papers.

MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems · cs.CV · 2025-03-19 · unverdicted · none · ref 25
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model · cs.CL · 2025-02-04 · unverdicted · none · ref 178
Music Audio-Visual Question Answering Requires Specialized Multimodal Designs · cs.SD · 2025-05-27 · unverdicted · none · ref 6
