ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools
4 Pith papers cite this work.
citing papers explorer
- Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
  OlymMATH is a 350-problem Olympiad math benchmark combining bilingual natural-language evaluation with Lean 4 formal verification to test LLM reasoning.
- We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
  WE-MATH benchmark reveals most LMMs rely on rote memorization for visual math while GPT-4o has shifted toward knowledge generalization.
- Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval
  DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.
- SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation
  SynMotion combines disentangled semantic embeddings, parameter-efficient motion adapters, and alternate subject-motion training on a new SPV dataset to improve motion customization in text-to-video and image-to-video generation.