arXiv preprint arXiv:2507.10624 (2025)

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

CircuChain: Disentangling Competence and Compliance in LLM Circuit Analysis

cs.SE · 2026-01-29 · unverdicted · novelty 7.0

Stronger LLMs show near-perfect physical reasoning in circuits but violate explicit sign and polarity instructions in trap setups, while weaker models follow instructions better but reason less accurately.

Agentic Frameworks for Reasoning Tasks: An Empirical Study

cs.AI · 2026-04-17 · unverdicted · novelty 6.0

An empirical evaluation of 22 agentic frameworks on BBH, GSM8K, and ARC benchmarks shows stable performance in 12 frameworks but highlights orchestration failures and weaker mathematical reasoning.

citing papers explorer

Showing 2 of 2 citing papers.

CircuChain: Disentangling Competence and Compliance in LLM Circuit Analysis

cs.SE · 2026-01-29 · unverdicted · none · ref 20

Agentic Frameworks for Reasoning Tasks: An Empirical Study

cs.AI · 2026-04-17 · unverdicted · none · ref 61
