
arXiv: 1705.04146 · v3 · submitted 2017-05-11 · cs.AI · cs.CL · cs.LG


Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems

keywords: answer, rationales, problems, programs, algebraic, arithmetic, final, inducing

Solving algebraic word problems requires executing a series of arithmetic operations (a program) to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.
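
To make the abstract's notion of "a program" concrete, here is a minimal Python sketch of a sequence of arithmetic operations that is executed step by step, recording each intermediate result as a rationale-like line. It is an illustration only, not the paper's model: the operation set, the `Step` and `run_program` names, and the word problem itself are assumptions invented for this example.

```python
from dataclasses import dataclass
from fractions import Fraction
import operator

# Hypothetical instruction set; the paper's actual operation vocabulary may differ.
OPS = {
    "add": (operator.add, "+"),
    "sub": (operator.sub, "-"),
    "mul": (operator.mul, "*"),
    "div": (operator.truediv, "/"),
}


@dataclass
class Step:
    op: str     # key into OPS
    left: str   # name of an earlier value, or a numeric literal
    right: str  # name of an earlier value, or a numeric literal
    out: str    # name under which the result is stored


def resolve(token, memory):
    """Look up a named value, or parse the token as a number."""
    return memory[token] if token in memory else Fraction(token)


def run_program(steps, inputs):
    """Execute the steps in order; return the final value and a readable trace."""
    memory = {name: Fraction(value) for name, value in inputs.items()}
    rationale = []
    result = None
    for step in steps:
        fn, symbol = OPS[step.op]
        a, b = resolve(step.left, memory), resolve(step.right, memory)
        result = fn(a, b)
        memory[step.out] = result  # intermediate milestone, reusable by later steps
        rationale.append(f"{step.out} = {a} {symbol} {b} = {result}")
    return result, rationale


# Invented example: "Pens cost 3 dollars each. Ann buys 4 pens and pays
# with a 20 dollar bill. How much change does she get?"
program = [
    Step("mul", "price", "count", "total"),
    Step("sub", "paid", "total", "change"),
]
answer, rationale = run_program(program, {"price": 3, "count": 4, "paid": 20})
print(answer)
for line in rationale:
    print(line)
```

In this sketch the trace lines play the role of intermediate milestones: each names a sub-result that later steps can reuse. Exact rational arithmetic via `Fraction` is a design choice made here to avoid floating-point noise in divisions.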



Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PAL: Program-aided Language Models

    cs.CL 2022-11 conditional novelty 8.0

    PAL improves few-shot reasoning accuracy by having LLMs generate executable programs rather than text-based chains of thought, outperforming much larger models on math and logic benchmarks.

  2. Preserving Long-Tailed Expert Information in Mixture-of-Experts Tuning

    cs.LG 2026-04 unverdicted novelty 7.0

    A new SFT framework for MoE models combines bias-driven sparsification with gated condenser experts to retain long-tailed expert information, outperforming DenseMixer and ESFT by over 2.5% on math reasoning and common...

  3. GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

    cs.LG 2024-10 accept novelty 7.0

    LLMs display high variance and major accuracy drops on GSM-Symbolic variants of grade-school math problems, indicating they replicate training patterns rather than execute logical reasoning.

  4. Large Language Models as Optimizers

    cs.LG 2023-09 unverdicted novelty 7.0

    Large language models can optimize by being prompted with histories of past solutions and scores to propose better ones, producing prompts that raise accuracy up to 8% on GSM8K and 50% on Big-Bench Hard over human-des...

  5. Towards Understanding Sycophancy in Language Models

    cs.CL 2023-10 conditional novelty 6.0

    Sycophancy is prevalent in state-of-the-art AI assistants and is likely driven in part by human preferences that favor agreement over truthfulness.