In: Findings of the Association for Computational Linguistics: ACL 2023

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 2

citation-polarity summary

verdicts: UNVERDICTED 13

roles: background 2

polarities: background 1, support 1

representative citing papers

Beyond Temperature: Hyperfitting as a Late-Stage Geometric Expansion

cs.CL · 2026-05-21 · unverdicted · novelty 6.0

Hyperfitting improves LLM generation via context-dependent rank reordering from geometric expansion in the terminal transformer block, distinct from temperature scaling, and enables efficient Late-Stage LoRA fine-tuning.

Reasoning-Aware Training for Time Series Forecasting

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

STRIDE injects distilled LLM reasoning as continuous cross-modal priors into TSFMs via mean-pooled hidden states, achieving SOTA forecasting (0.674 MASE, 0.454 CRPS) on GIFT-Eval and superior reasoning on TFRBench.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Reasoning-Aware Training for Time Series Forecasting · cs.LG · 2026-05-09 · unverdicted · none · ref 1

    STRIDE injects distilled LLM reasoning as continuous cross-modal priors into TSFMs via mean-pooled hidden states, achieving SOTA forecasting (0.674 MASE, 0.454 CRPS) on GIFT-Eval and superior reasoning on TFRBench.

  • Skill-SD: Skill-Conditioned Self-Distillation for Multi-turn LLM Agents · cs.LG · 2026-04-12 · unverdicted · none · ref 11

    Skill-SD turns an agent's completed trajectories into dynamic natural-language skills that condition only the teacher in self-distillation, yielding 14-42% gains over RL and OPSD baselines on multi-turn agent benchmarks.