Pith · machine review for the scientific record

CMATH: Can Your Language Model Pass Chinese Elementary School Math Test? arXiv preprint arXiv:2306.16636

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

still indexing

fields

cs.LG 4 · cs.CL 2

years

2026 4 · 2025 2

representative citing papers

Validity-Calibrated Reasoning Distillation

cs.LG · 2026-04-14 · unverdicted · novelty 7.0 · 2 refs

Validity-calibrated reasoning distillation improves transfer of reasoning skills by modulating updates based on relative local validity of next steps instead of enforcing full trajectory imitation.
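
A minimal sketch of that weighting idea in PyTorch-style Python. The function, the validity signal, and its source (a verifier, a teacher margin) are hypothetical illustrations of the summary above, not the paper's API:

```python
import torch
import torch.nn.functional as F

def validity_calibrated_distill_loss(student_logits, teacher_tokens, validity):
    """Per-step distillation loss weighted by local validity.

    student_logits: (T, V) student next-token logits along the teacher trajectory
    teacher_tokens: (T,)   the teacher's next tokens (the trajectory to imitate)
    validity:       (T,)   scores in [0, 1] for how valid each next step is
                           (hypothetical: e.g., from a verifier or teacher margin)
    """
    # Plain trajectory imitation weights every step equally; here
    # low-validity steps contribute proportionally less to the update.
    step_nll = F.cross_entropy(student_logits, teacher_tokens, reduction="none")
    weights = validity / (validity.sum() + 1e-8)  # normalize to sum to 1
    return (weights * step_nll).sum()
```

On this reading, a distrusted step is still imitated, just with less force, rather than being dropped from the trajectory outright.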

Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization

cs.LG · 2026-03-09 · unverdicted · novelty 6.0

CAMEL is a scaling law capturing nonlinear model-size and mixture interactions to extrapolate optimal data mixtures for large LLMs from small-model experiments, reducing optimization cost by 50% and improving benchmarks by up to 3%.
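
The summary does not give CAMEL's functional form, so the sketch below assumes a toy two-domain law with a nonlinear size-mixture interaction purely to illustrate the fit-on-small-models, extrapolate-to-large loop; every constant and data point is made up:

```python
import numpy as np
from scipy.optimize import curve_fit, minimize

def mixture_law(X, a, b, c, d):
    # Hypothetical law: loss(N, w) = a*N^-b + c*w*(1-w)*N^-d, where N is
    # model size and w is the weight of domain 1 (domain 2 gets 1 - w).
    N, w = X
    return a * N**(-b) + c * w * (1 - w) * N**(-d)

# Fit on small-model experiments: (size, mixture weight) -> observed loss.
sizes  = np.array([1e8, 1e8, 1e8, 3e8, 3e8, 3e8])
mixes  = np.array([0.2, 0.5, 0.8, 0.2, 0.5, 0.8])
losses = np.array([3.2, 3.0, 3.1, 2.9, 2.7, 2.8])  # toy numbers
params, _ = curve_fit(mixture_law, (sizes, mixes), losses,
                      p0=[10, 0.05, 1.0, 0.05], maxfev=20000)

# Extrapolate: the mixture minimizing predicted loss at the large target size.
target_N = 7e10
best = minimize(lambda w: mixture_law((target_N, w[0]), *params),
                x0=[0.5], bounds=[(0.0, 1.0)])
print("predicted optimal mixture weight:", best.x[0])
```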

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

cs.LG · 2025-12-10 · conditional · novelty 6.0

LLaDA2.0 scales discrete diffusion language models to 100B parameters via systematic conversion from autoregressive models using a 3-phase WSD training scheme and releases open-source 16B and 100B MoE variants.
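
Assuming WSD here denotes the common warmup-stable-decay learning-rate schedule (the summary does not spell it out), a minimal sketch of the three phases, with all lengths and rates hypothetical:

```python
def wsd_lr(step, total_steps, peak_lr=3e-4, final_lr=3e-5,
           warmup_frac=0.01, decay_frac=0.1):
    """Warmup-Stable-Decay: linear warmup, a long constant plateau,
    then a final decay leg (linear here; cosine is also common)."""
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    stable_end = total_steps - decay_steps
    if step < warmup_steps:                            # phase 1: warmup
        return peak_lr * step / max(warmup_steps, 1)
    if step < stable_end:                              # phase 2: stable
        return peak_lr
    frac = (step - stable_end) / max(decay_steps, 1)   # phase 3: decay
    return peak_lr + frac * (final_lr - peak_lr)
```

One commonly cited appeal of such schedules for staged training is the long plateau: runs can be branched mid-plateau and only the short decay leg rerun.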

Kimi K2: Open Agentic Intelligence

cs.LG · 2025-07-28 · unverdicted · novelty 5.0

Kimi K2 is a 1-trillion-parameter MoE model that leads open-source non-thinking models on agentic benchmarks including 65.8 on SWE-Bench Verified and 66.1 on Tau2-Bench.

citing papers explorer

Showing 6 of 6 citing papers.

  • Remask, Don't Replace: Token-to-Mask Refinement in Diffusion Large Language Models cs.CL · 2026-04-20 · unverdicted · none · ref 26

    Token-to-Mask remasking improves self-correction in diffusion LLMs by resetting erroneous commitments to masks rather than overwriting them, yielding +13.33 points on AIME 2025 and +8.56 on CMATH (a minimal sketch of the remasking step follows this list).

  • Validity-Calibrated Reasoning Distillation cs.LG · 2026-04-14 · unverdicted · none · ref 25 · 2 links

    Validity-calibrated reasoning distillation improves transfer of reasoning skills by modulating updates based on relative local validity of next steps instead of enforcing full trajectory imitation.

  • Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization cs.LG · 2026-03-09 · unverdicted · none · ref 29

    CAMEL is a scaling law capturing nonlinear model-size and mixture interactions to extrapolate optimal data mixtures for large LLMs from small-model experiments, reducing optimization cost by 50% and improving benchmarks by up to 3%.

  • LLaDA2.0: Scaling Up Diffusion Language Models to 100B cs.LG · 2025-12-10 · conditional · none · ref 36

    LLaDA2.0 scales discrete diffusion language models to 100B parameters via systematic conversion from autoregressive models using a 3-phase WSD training scheme and releases open-source 16B and 100B MoE variants.

  • Parallelism and Generation Order in Masked Diffusion Language Models: Limits Today, Potential Tomorrow cs.CL · 2026-01-22 · unverdicted · none · ref 8

    MDLMs lag autoregressive models in performance because parallel modeling weakens inter-token dependencies, yet they adapt generation order to task demands and show promise in a generate-then-edit paradigm.

  • Kimi K2: Open Agentic Intelligence cs.LG · 2025-07-28 · unverdicted · none · ref 84

    Kimi K2 is a 1-trillion-parameter MoE model that leads open-source non-thinking models on agentic benchmarks including 65.8 on SWE-Bench Verified and 66.1 on Tau2-Bench.
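
For the Remask, Don't Replace entry above, a minimal sketch of one refinement step as its summary describes it. The mask id, the confidence threshold, and the commitment bookkeeping are hypothetical:

```python
import torch

MASK_ID = 0  # hypothetical mask token id

def remask_step(tokens, logits, committed, threshold=0.9):
    """One Token-to-Mask refinement step (sketch of the summarized idea).

    tokens:    (T,)    current sequence, possibly containing MASK_ID
    logits:    (T, V)  model predictions for every position
    committed: (T,)    bool, positions filled in earlier steps
    """
    probs = logits.softmax(-1)
    conf, pred = probs.max(-1)

    # Fill currently masked positions with their best prediction.
    tokens = torch.where(tokens == MASK_ID, pred, tokens)

    # "Remask, don't replace": a committed token the model no longer
    # trusts is reset to MASK_ID instead of being overwritten in place,
    # so a later denoising step can re-predict it with fuller context.
    doubt = committed & (conf < threshold) & (pred != tokens)
    tokens = torch.where(doubt, torch.full_like(tokens, MASK_ID), tokens)
    return tokens  # caller updates `committed` for the next step
```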