arXiv preprint arXiv:2310.20246 , year=

Nuo Chen, Zinan Zheng, Ning Wu, Ming Gong, Dongmei Zhang, Jia Li · 2024 · arXiv 2310.20246

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

representative citing papers

PRIMETIME : Limits of LLMs in Temporal Primitives

cs.NE · 2025-04-22 · unverdicted · novelty 7.0

PRIMETIME generator reveals that LLM datetime parsing and arithmetic primitives are individually unreliable but fully learnable via fine-tuning, enabling frontier-level accuracy on event planning with small LoRA models.

WeatherSyn: An Instruction Tuning MLLM For Weather Forecasting Report Generation

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

WeatherSyn is the first instruction-tuned MLLM for weather forecasting report generation, outperforming closed-source models on a new dataset of 31 US cities across 8 weather aspects.

CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning

cs.AI · 2026-01-19 · unverdicted · novelty 6.0

CURE-MED pairs a new 13-language medical reasoning benchmark with curriculum RL to raise logical correctness to 70% and language consistency to 95% at 32B scale while outperforming baselines.

citing papers explorer

Showing 3 of 3 citing papers.

PRIMETIME : Limits of LLMs in Temporal Primitives cs.NE · 2025-04-22 · unverdicted · none · ref 58
PRIMETIME generator reveals that LLM datetime parsing and arithmetic primitives are individually unreliable but fully learnable via fine-tuning, enabling frontier-level accuracy on event planning with small LoRA models.
WeatherSyn: An Instruction Tuning MLLM For Weather Forecasting Report Generation cs.CL · 2026-05-08 · unverdicted · none · ref 31
WeatherSyn is the first instruction-tuned MLLM for weather forecasting report generation, outperforming closed-source models on a new dataset of 31 US cities across 8 weather aspects.
CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning cs.AI · 2026-01-19 · unverdicted · none · ref 24
CURE-MED pairs a new 13-language medical reasoning benchmark with curriculum RL to raise logical correctness to 70% and language consistency to 95% at 32B scale while outperforming baselines.

arXiv preprint arXiv:2310.20246 , year=

fields

years

verdicts

representative citing papers

citing papers explorer