ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery
Pith reviewed 2026-05-10 17:43 UTC · model grok-4.3
The pith
ScheMatiQ converts natural-language research questions over document collections into schemas and structured databases that support real-world analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ScheMatiQ calls a backbone LLM to turn a research question and a corpus into a schema and a grounded database, and provides a web interface that lets users steer and revise the extraction. The authors report that, in collaboration with domain experts, this yields outputs that support real-world analysis in law and computational biology.
What carries the argument
An interactive schema discovery process that combines LLM-generated schemas and extractions with user steering through a web interface to produce a grounded database.
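To make "grounded" concrete, here is a minimal sketch of one way a schema, an extracted cell, and its supporting evidence could be represented and checked. The class names, the `is_grounded` rule (evidence must appear verbatim in the cited document), and the toy court text are all assumptions for illustration; the page does not describe ScheMatiQ's actual data model, and the LLM call itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One column of a schema (hypothetical structure, not ScheMatiQ's own)."""
    name: str
    description: str


@dataclass(frozen=True)
class Cell:
    """An extracted value plus the source text offered as its evidence."""
    doc_id: str
    field: str
    value: str
    evidence: str


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so matching ignores layout and case."""
    return " ".join(text.split()).lower()


def is_grounded(cell: Cell, corpus: dict[str, str]) -> bool:
    """Assumed rule: a cell is grounded only if its evidence is a verbatim
    substring of the cited document after normalization."""
    doc = corpus.get(cell.doc_id)
    if doc is None or not cell.evidence.strip():
        return False
    return _normalize(cell.evidence) in _normalize(doc)


def ungrounded_cells(cells: list[Cell], corpus: dict[str, str]) -> list[Cell]:
    """Cells whose evidence cannot be found, i.e. candidates for user review."""
    return [c for c in cells if not is_grounded(c, corpus)]


# Toy data, invented for this example.
corpus = {"case_001": "The court GRANTED the preliminary injunction on March 3."}
schema = [Field("injunction_outcome", "Whether the injunction was granted or denied")]
cells = [
    Cell("case_001", "injunction_outcome", "granted", "granted the preliminary injunction"),
    Cell("case_001", "injunction_outcome", "denied", "the motion is denied"),
]
flagged = ungrounded_cells(cells, corpus)
```

Here `flagged` contains only the second cell, whose quoted evidence does not occur in the document. A verbatim check of this kind catches fabricated quotes but not a real quote that fails to support the value, which is where expert review would still be needed.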
Load-bearing premise
LLM-generated schemas and extractions, even after interactive user steering, produce sufficiently accurate and complete structured data to support genuine domain analysis without introducing critical errors or omissions.
What would settle it
A case in law or computational biology where the structured database produced by ScheMatiQ leads experts to an incorrect conclusion that they can verify and correct by direct reference to the original documents.
Original abstract
Many disciplines pose natural-language research questions over large document collections whose answers typically require structured evidence, traditionally obtained by manually designing an annotation schema and exhaustively labeling the corpus, a slow and error-prone process. We introduce ScheMatiQ, which leverages calls to a backbone LLM to take a question and a corpus to produce a schema and a grounded database, with a web interface that lets users steer and revise the extraction. In collaboration with domain experts, we show that ScheMatiQ yields outputs that support real-world analysis in law and computational biology. We release ScheMatiQ as open source with a public web interface, and invite experts across disciplines to use it with their own data. All resources, including the website, source code, and demonstration video, are available at: www.ScheMatiQ-ai.com
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces ScheMatiQ, a system that takes a natural-language research question and document corpus as input, uses calls to a backbone LLM to produce a schema and grounded database, and provides a web interface allowing users to interactively steer and revise the extraction process. Through collaborations with domain experts, the authors claim that the resulting outputs support real-world analysis in law and computational biology. The tool is released as open source with a public web interface and demonstration resources.
Significance. If the interactive steering mechanism reliably produces accurate, complete structured data without critical omissions or errors, ScheMatiQ could meaningfully reduce the manual effort required for schema design and annotation in document-heavy fields. The open-source release, public interface, and explicit invitation for experts to apply it to their own data are concrete strengths that could accelerate adoption and community feedback. However, the lack of any quantitative evaluation makes it impossible to assess whether these benefits are realized in practice.
major comments (2)
- [Abstract] The central claim that 'in collaboration with domain experts, we show that ScheMatiQ yields outputs that support real-world analysis in law and computational biology' is presented without any supporting quantitative evidence such as precision/recall, error rates, inter-expert agreement scores, gold-standard comparisons, or descriptions of the specific analyses enabled and errors corrected. This absence directly undermines evaluation of whether user steering mitigates known LLM failure modes in technical domains.
- [System and Evaluation sections] The manuscript describes the web interface for steering but supplies no case-study details, error analysis, or before/after comparisons showing how interactive revisions addressed LLM-specific issues (e.g., hallucinations, incomplete extractions, or domain-specific inaccuracies) in the law or computational biology examples.
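The quantitative evidence the referee asks for could be computed along the following lines. This is a sketch under stated assumptions: extractions are compared as exact `(doc_id, field, value)` triples against a gold standard, inter-expert agreement is measured with Cohen's kappa over per-cell accept/reject judgments, and all data below is invented. None of it comes from the paper.

```python
from collections import Counter


def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Exact-match precision and recall over (doc_id, field, value) triples."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(
        (counts_a[label] / n) * (counts_b[label] / n)
        for label in counts_a.keys() | counts_b.keys()
    )
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy gold standard and system output, invented for this example.
gold = {
    ("doc1", "outcome", "granted"),
    ("doc2", "outcome", "denied"),
    ("doc3", "outcome", "granted"),
}
predicted = {
    ("doc1", "outcome", "granted"),
    ("doc2", "outcome", "granted"),   # wrong value
    ("doc3", "outcome", "granted"),
    ("doc4", "outcome", "denied"),    # not in gold
}
precision, recall = precision_recall(predicted, gold)

# Two hypothetical experts judging the same four cells.
expert_a = ["accept", "accept", "reject", "reject"]
expert_b = ["accept", "reject", "reject", "reject"]
kappa = cohens_kappa(expert_a, expert_b)
```

On this toy data, precision is 0.5, recall is 2/3, and kappa is 0.5. Exact matching is the strictest choice; free-text fields would likely need a normalized or expert-judged match instead.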
Simulated Author's Rebuttal
We thank the referee for the positive recognition of ScheMatiQ's potential impact, the open-source release, and the invitation for community use. We agree that the current manuscript would benefit from greater transparency regarding the nature and limitations of the evaluation, and we will revise accordingly to strengthen the presentation of the case studies.
Point-by-point responses
- Referee: [Abstract] The central claim that 'in collaboration with domain experts, we show that ScheMatiQ yields outputs that support real-world analysis in law and computational biology' is presented without any supporting quantitative evidence such as precision/recall, error rates, inter-expert agreement scores, gold-standard comparisons, or descriptions of the specific analyses enabled and errors corrected. This absence directly undermines evaluation of whether user steering mitigates known LLM failure modes in technical domains.
  Authors: We agree that the abstract claim is stated without quantitative metrics. The collaborations with domain experts provided qualitative validation: experts reviewed the generated schemas and extracted data for their specific research questions and confirmed that the outputs were sufficiently accurate and complete to support downstream analysis in their fields. No formal precision/recall or inter-annotator agreement scores were computed. We will revise the abstract to explicitly characterize the evaluation as qualitative expert validation and will add a short description of the types of analyses enabled and the classes of LLM errors that steering helped correct.
  revision: yes
- Referee: [System and Evaluation sections] The manuscript describes the web interface for steering but supplies no case-study details, error analysis, or before/after comparisons showing how interactive revisions addressed LLM-specific issues (e.g., hallucinations, incomplete extractions, or domain-specific inaccuracies) in the law or computational biology examples.
  Authors: We acknowledge that the current manuscript provides only high-level descriptions of the law and computational-biology use cases. In the revised version we will expand the System and Evaluation sections with concrete case-study details drawn from the expert collaborations. These additions will include: (1) specific examples of LLM hallucinations or incomplete extractions that occurred, (2) the user steering actions taken via the interface to correct them, and (3) before/after comparisons of the resulting schemas and grounded data. This will make explicit how the interactive mechanism mitigates the failure modes mentioned.
  revision: yes
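A before/after comparison of schemas, as promised in the second response, could be reported with a simple field-level diff. The sketch below assumes a schema can be reduced to a mapping from field name to description; the function name and the example fields are hypothetical and not taken from the paper.

```python
def diff_schemas(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Field-level diff between two schema versions (name -> description)."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(name for name in shared if before[name] != after[name]),
    }


# Toy schemas, invented for this example: the state before and after one
# round of user steering.
schema_before = {
    "judge_name": "Name of the presiding judge",
    "outcome": "Outcome of the case",
    "summary": "Free-text summary",
}
schema_after = {
    "judge_name": "Name of the presiding judge",
    "outcome": "Whether the injunction was granted or denied",
    "appointing_president": "President who appointed the judge",
}
changes = diff_schemas(schema_before, schema_after)
```

Here `changes` reports `appointing_president` as added, `summary` as removed, and `outcome` as changed. Logging such a diff after each steering action would give the kind of audit trail the referee's comment calls for.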
Circularity Check
No circularity: descriptive system paper with no derivations or self-referential predictions
Full rationale
The paper describes an interactive LLM-based tool for schema discovery and data extraction from documents, evaluated via domain-expert collaboration in law and biology. No mathematical derivations, equations, fitted parameters, or 'predictions' appear in the provided text. The central claim (that outputs support real-world analysis) is presented as an empirical demonstration rather than a reduction to prior inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked to justify core results. This matches the default expectation for non-circular system papers.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Large language models can be effectively prompted to discover schemas and extract structured information from document collections when provided with a natural-language question.