MACE deploys three specialized LLM agents (Planner, Executor, Verifier) with zero-shot CoT to verify claims from tables, matching SOTA performance on two datasets and near-SOTA on two others using models 2-8x smaller than prior bests.
arXiv preprint arXiv:2411.02059
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: background 1
verdicts: UNVERDICTED 7
citing papers explorer
- A Multi-Agent Approach for Claim Verification from Tabular Data Documents
  MACE deploys three specialized LLM agents (Planner, Executor, Verifier) with zero-shot CoT to verify claims from tables, matching SOTA performance on two datasets and near-SOTA on two others using models 2-8x smaller than prior bests.
- Format-Constraint Coupling in Knowledge Graph Construction from Statistical Tables
  Empirical 2x2 factorial study on 6 statistical datasets shows format and schema constraints in LLM-based KG construction from CSV tables produce super-additive fidelity loss up to +1.180, with mismatched pairs falling below baseline, plus release of CSVFidelity-Bench.
- Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting
  Introduces TableGrid Navigation (TGN) and Progressive Inference Prompting (PIP) as training-free structured prompting frameworks that improve LLM performance on table question answering over baselines on TableBench and achieve SOTA on FeTaQA.
- The Power of Order: Fooling LLMs with Adversarial Table Permutations
  Semantically invariant row and column permutations in tables can cause LLMs to output incorrect answers, and a gradient-based attack called ATP efficiently finds such permutations that degrade performance across many models.
- Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning
  SpreadsheetAgent uses incremental multi-format reading, structural sketching, and verification to raise spreadsheet benchmark accuracy from 35.27% to 38.16%.
- RedParrot: Accelerating NL-to-DSL for Business Analytics via Query Semantic Caching
  RedParrot accelerates NL-to-DSL conversion by 3.6x with 8.26% accuracy gain on enterprise data and 34.8% on benchmarks via semantic caching of query skeletons and contrastive learning.
- MachineLearningLM: Scaling Many-shot In-context Learning via Continued Pretraining
  MachineLearningLM uses continued pretraining on SCM-synthesized ML tasks with random-forest distillation to give LLMs robust many-shot in-context learning on tabular classification, reaching random-forest accuracy levels while preserving general chat performance.