

Chain-of-thought prompting elicits reasoning in large language models

13 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary: background 3

citation-polarity summary

roles: background 3

polarities: background 3

representative citing papers

Benchmarking LLM-Driven Network Configuration Repair

cs.NI · 2026-04-24 · unverdicted · novelty 8.0

Cornetto is the first benchmark that synthesizes 231 network misconfiguration problems across topologies of 20-754 nodes and uses formal verification to show that nine state-of-the-art LLMs often introduce regressions and degrade at scale.

Latent Chain-of-Thought Improves Structured-Data Transformers

cs.LG · 2026-05-11 · conditional · novelty 7.0 · 2 refs

Latent chain-of-thought via recurrent feedback tokens from compressed hidden states improves transformer performance on time-series forecasting and tabular prediction across 36 datasets.

BEDTime: A Unified Benchmark for Automatically Describing Time Series

cs.CL · 2025-09-05 · conditional · novelty 6.0

The BEDTime benchmark tests 17 models on describing time-series structure and finds that vision-language models outperform both dedicated time-series-language models and language-only approaches, though all models are fragile under robustness tests.

Llemma: An Open Language Model For Mathematics

cs.CL · 2023-10-16 · unverdicted · novelty 6.0

Continued pretraining of Code Llama on Proof-Pile-2 yields Llemma, an open math-specialized LLM that beats known open base models on MATH and supports tool use plus formal proving out of the box.
