Legal2LogicICL improves accuracy and generalization when mapping legal cases to logical formulas by retrieving balanced diverse exemplars at semantic and structural levels, backed by the new Legal2Proleg dataset.
URL https://aclanthology.org/2022.naacl-main.191
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
citing papers explorer
- Legal2LogicICL: Improving Generalization in Transforming Legal Cases to Logical Formulas via Diverse Few-Shot Learning
  Legal2LogicICL improves accuracy and generalization when mapping legal cases to logical formulas by retrieving balanced diverse exemplars at semantic and structural levels, backed by the new Legal2Proleg dataset.
- Internalizing Curriculum Judgment for LLM Reinforcement Fine-Tuning
  METIS internalizes curriculum judgment in LLM reinforcement fine-tuning by predicting within-prompt reward variance via in-context learning and jointly optimizing with a self-judgment reward, yielding superior performance and up to 67% faster convergence across math, code, and agent benchmarks.
- Conjecture and Inquiry: Quantifying Software Performance Requirements via Interactive Retrieval-Augmented Preference Elicitation
  IRAP quantifies ambiguous performance requirements into mathematical functions via interactive retrieval-augmented preference elicitation and outperforms ten prior methods on four real-world datasets with up to 40x gains in five interaction rounds.
- Multimodal Chain-of-Thought Reasoning in Language Models
  Multimodal-CoT achieves state-of-the-art on ScienceQA by using a two-stage process that incorporates vision into chain-of-thought rationale generation for models under 1 billion parameters.
- Automatic Chain of Thought Prompting in Large Language Models
  Auto-CoT automatically builds chain-of-thought demonstrations by sampling diverse questions and letting the LLM generate reasoning chains, matching manual CoT performance on ten reasoning tasks with GPT-3.