DN-Hypo-Pipeline: An AI-Driven Workflow for Generating Hypotheses using Large Language Models and Scientific Explanations

Chunbao Zhou; Jue Wang; Lei Lin; Ronghao Wang; Yangang Wang

arxiv: 2606.08532 · v4 · pith:SVMPNIBAnew · submitted 2026-06-07 · 💻 cs.AI

Pith reviewed 2026-06-27 18:58 UTC · model grok-4.3

classification 💻 cs.AI

keywords hypothesis generation · large language models · scientific explanation · deductive-nomological model · transformer algorithms · AI for science · causal process

The pith

A workflow that applies the structure of scientific explanations lets large language models generate hypotheses that outperform those from direct prompting and can be turned into working algorithms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DN-Hypo-Pipeline, a framework that operationalizes three accounts of scientific explanation to direct how an LLM generates hypotheses from an explanandum. The system abstracts universals from the phenomenon's formation process, retrieves laws relating those universals, and deductively reconstructs a new testable explanation instead of recombining patterns from existing literature. When evaluated in data-science modeling by both LLMs and human experts, the resulting hypotheses score higher than those produced by direct prompting. The two highest-scoring hypotheses were implemented as novel algorithms, one that lowers the Transformer's theoretical complexity with only minimal performance loss and another that reaches competitive accuracy using substantially fewer parameters.

Core claim

The DN-Hypo-Pipeline adopts a layered scaffold in which Hempel's deductive-nomological model supplies the output form and deductive validity, Salmon's causal-process account organizes the search for governing laws, and Armstrong's view of laws as relations between universals bridges from a phenomenon's constituent processes to candidate laws. Given an explanandum, the workflow abstracts the universals instantiated in the formation process, retrieves the laws that relate those universals, and deductively reconstructs a new, testable explanation. Hypotheses produced this way significantly outperform those from direct prompting in expert and LLM judgments, and the two top hypotheses were transl

What carries the argument

The DN-Hypo-Pipeline, a layered explanation-theoretic scaffold that combines Hempel's DN model for hypothesis form, Salmon's causal-process account for search constraints, and Armstrong's universals-relations view to connect processes to laws, so that the LLM abstracts universals, retrieves laws, and deductively generates new hypotheses.
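
The three steps named above (abstract universals, retrieve laws, reconstruct an explanation) can be sketched as a small pipeline. Everything concrete here is an assumption made for illustration: the `LLM` callable type, the `DNHypothesis` record, the prompt wording, and the canned `stub_llm` responses are hypothetical and do not come from the paper, whose actual prompts and implementation are not shown on this page.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function mapping a prompt string to a list of short strings.
LLM = Callable[[str], List[str]]


@dataclass
class DNHypothesis:
    """Hempel-style output form: laws plus conditions offered as explaining the explanandum."""
    explanandum: str
    universals: List[str]
    laws: List[str]
    conditions: List[str]

    def render(self) -> str:
        lines = [f"Law {i + 1}: {law}" for i, law in enumerate(self.laws)]
        lines += [f"Condition {i + 1}: {c}" for i, c in enumerate(self.conditions)]
        lines.append(f"Therefore: {self.explanandum}")
        return "\n".join(lines)


def generate_hypothesis(explanandum: str, llm: LLM) -> DNHypothesis:
    # Step 1: abstract the universals instantiated in the formation process.
    universals = llm(f"List the universals instantiated in the process that forms: {explanandum}")
    # Step 2: retrieve candidate laws relating those universals.
    laws = llm(f"List laws relating these universals: {', '.join(universals)}")
    # Step 3: ask for the conditions under which the laws yield the explanandum.
    conditions = llm(f"Given laws {laws}, state conditions under which they yield: {explanandum}")
    return DNHypothesis(explanandum, universals, laws, conditions)


def stub_llm(prompt: str) -> List[str]:
    """Canned stand-in for a real model so the sketch runs offline."""
    if prompt.startswith("List the universals"):
        return ["sequence length", "pairwise interaction"]
    if prompt.startswith("List laws"):
        return ["pairwise interaction cost grows quadratically with sequence length"]
    return ["every token interacts with every other token"]


hypothesis = generate_hypothesis("self-attention cost grows quadratically", stub_llm)
print(hypothesis.render())
```

Note that the rendered argument only has the DN form. Nothing in this sketch checks that the conclusion actually follows from the listed laws and conditions.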

If this is right

Hypotheses generated through the principled reasoning significantly outperform those from direct prompting when judged by both LLMs and human experts.
The two highest-scoring hypotheses were translated into novel algorithms, one reducing the Transformer's theoretical complexity with only minimal performance loss and another achieving competitive accuracy with substantially fewer parameters.
The framework searches the space of principles that govern a phenomenon rather than the space of what has already been written.
The layered scaffold supplies output form and deductive validity from Hempel, search constraints from Salmon, and the bridge from processes to laws from Armstrong.
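
This page does not describe how either algorithm works, so the sketch below is not the paper's method. It only illustrates what a reduction in an attention layer's theoretical complexity can look like, using a swapped-in, well-known technique: kernelized attention with a positive feature map, where reordering the matrix products avoids building the n × n score matrix. The feature map `phi` (elu(x) + 1), the helper names, and the toy inputs are all assumptions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Naive matrix product, enough for a toy comparison."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def phi(a: Matrix) -> Matrix:
    """Positive feature map elu(x) + 1, applied elementwise (an assumed choice)."""
    return [[x + 1.0 if x > 0 else math.exp(x) for x in row] for row in a]


def quadratic_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Builds the full n x n score matrix: cost grows with n squared."""
    fq, fk = phi(q), phi(k)
    scores = matmul(fq, transpose(fk))
    out = matmul(scores, v)
    return [[val / sum(srow) for val in orow] for srow, orow in zip(scores, out)]


def linear_attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Same result, but first forms the d x d_v summary phi(K)^T V: cost grows linearly in n."""
    fq, fk = phi(q), phi(k)
    kv = matmul(transpose(fk), v)
    z = [sum(col) for col in zip(*fk)]  # column sums of phi(K), used for normalization
    out = matmul(fq, kv)
    return [
        [val / sum(a * b for a, b in zip(qrow, z)) for val in orow]
        for qrow, orow in zip(fq, out)
    ]


# Toy inputs: 3 tokens, feature dimension 2.
q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
k = [[1.0, 0.0], [-0.5, 2.0], [0.3, -0.7]]
v = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]

quad = quadratic_attention(q, k, v)
lin = linear_attention(q, k, v)
print(max(abs(a - b) for ra, rb in zip(quad, lin) for a, b in zip(ra, rb)))
```

The two functions agree because matrix multiplication is associative. The reordering replaces softmax with a kernel, so it is a different model from standard attention, which is one reason a complexity reduction can come with some performance loss.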

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same workflow could be tested on phenomena outside data science, such as physical or biological systems, to see whether the philosophical scaffolding transfers.
If the deductive reconstructions remain valid across domains, the approach might support iterative refinement where a generated hypothesis is tested and the results fed back to update the universals or laws.
One could examine whether the method produces hypotheses that are more readily falsifiable in experiment than those from unguided prompting.

Load-bearing premise

The three cited philosophical accounts of explanation can be operationalized into an LLM workflow that reliably abstracts universals from a phenomenon's formation process, retrieves governing laws, and produces deductively valid novel hypotheses.

What would settle it

A side-by-side evaluation in which human experts or LLMs rate direct-prompt hypotheses as equal to or better than those from the DN-Hypo-Pipeline, or in which the two translated Transformer algorithms fail to show the claimed reductions in complexity or parameter count while preserving accuracy.
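
One way to run such a side-by-side comparison is a paired test on per-item ratings. The sketch below uses an exact two-sided sign test, chosen here only because it needs nothing beyond the standard library; it is not stated to be the test the authors used. The function name and the scores are invented placeholders, not data from the paper.

```python
from math import comb
from typing import Sequence


def sign_test_p_value(pipeline: Sequence[float], baseline: Sequence[float]) -> float:
    """Two-sided exact sign test on paired scores; tied pairs are dropped."""
    if len(pipeline) != len(baseline):
        raise ValueError("scores must be paired")
    wins = sum(p > b for p, b in zip(pipeline, baseline))
    losses = sum(p < b for p, b in zip(pipeline, baseline))
    n = wins + losses
    if n == 0:
        return 1.0  # every pair tied: no evidence either way
    k = min(wins, losses)
    # Probability of a split at least this lopsided under a fair coin, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Placeholder ratings for eight task items (illustrative only).
pipeline_scores = [4, 5, 4, 3, 5, 4, 5, 4]
direct_prompt_scores = [3, 4, 4, 3, 3, 2, 4, 3]

p_value = sign_test_p_value(pipeline_scores, direct_prompt_scores)
print(p_value)
```

A large p-value on real ratings, or a majority of pairs favoring direct prompting, would be the kind of result that counts against the paper's claim.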

Original abstract

Modern artificial intelligence excels at prediction but cannot explain. From large language models to AI-for-science systems, today's machines answer what by recombining patterns already present in the human literature, yet they cannot reason out why a phenomenon must arise from underlying principles even though explanation, not prediction, lies at the heart of scientific discovery. Here we ask whether the structure of scientific explanation can be operationalized to guide how a machine generates hypotheses. We introduce DN-Hypo-Pipeline, a hypothesis-generation framework that adopts a layered, explanation-theoretic scaffold: Hempel's deductive-nomological (DN) model supplies the output form and deductive validity of a hypothesis, Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws, and Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws that may be associated with it. Rather than searching the space of what has been written, the framework searches the space of what principles govern a phenomenon: given an explanandum, it abstracts the universals instantiated in the phenomenon's formation process, retrieves the laws relating those universals, and deductively reconstructs a new, testable explanation. Evaluated in data-science modeling and judged by both LLMs and human experts, hypotheses generated through this principled reasoning significantly outperform those from direct prompting. Crucially, we translated the two highest-scoring hypotheses into novel algorithms one that reduces the Transformer's theoretical complexity with only minimal performance loss, and another that achieves competitive accuracy with substantially fewer parameters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper sketches a three-layer philosophical scaffold for LLM hypothesis generation but supplies zero evidence, metrics, or implementation details to support its performance claims or new algorithms.

The core move here is to replace raw prompting with a structured workflow that pulls from Hempel's DN model for deductive form, Salmon for causal search constraints, and Armstrong for universals-to-laws mapping. That specific combination is new on the page and gives a clearer target than generic chain-of-thought. The intent to search principle space rather than text space is also a step in the right direction for AI-for-science work.

The problems start immediately with the claims. The abstract states that the generated hypotheses outperform direct prompting and that two of them were turned into concrete new transformer algorithms with measurable gains in complexity or parameter count. None of that is backed by numbers, baselines, controls, or even a description of how the translation from hypothesis to algorithm was done. Without those, the outperformance and the novelty of the algorithms remain assertions.

A deeper issue is the DN requirement itself. The pipeline is said to produce deductively valid hypotheses by abstracting universals, retrieving laws, and reconstructing explanations. LLMs do not perform formal deduction; they generate plausible text. The paper gives no indication of a logic engine, soundness checker, or even post-hoc verification step that would turn the LLM output into something that actually follows from the premises. That gap makes the central performance advantage rest on an untested assumption.
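
To make the missing step concrete, here is a minimal sketch of what a post-hoc entailment check could look like if laws and conditions were first translated into propositional Horn rules. That translation is itself a strong assumption, and nothing here is taken from the paper: the `Rule` type, the `entails` function, and the atom names are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A law as a Horn rule: if all premises hold, the conclusion holds.
Rule = Tuple[FrozenSet[str], str]


def entails(laws: Iterable[Rule], conditions: Iterable[str], explanandum: str) -> bool:
    """Forward chaining: derive everything the laws allow from the conditions."""
    known: Set[str] = set(conditions)
    rules = list(laws)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return explanandum in known


# Hypothetical atoms for illustration.
laws = [
    (frozenset({"all_pairs_interact"}), "quadratic_pair_count"),
    (frozenset({"quadratic_pair_count", "constant_cost_per_pair"}), "quadratic_cost"),
]
conditions = {"all_pairs_interact", "constant_cost_per_pair"}

print(entails(laws, conditions, "quadratic_cost"))
print(entails(laws, {"all_pairs_interact"}, "quadratic_cost"))
```

A check of this kind would reject a generated explanation whose conclusion depends on a premise that was never stated, which is the failure mode the note is worried about.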

This is still early-stage work aimed at people building principle-guided generation systems. It does not yet contain enough substance for a serious referee to evaluate the technical claims or the philosophical operationalization. I would not send it to peer review until the implementation, evaluation protocol, and any formal checks are added.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the DN-Hypo-Pipeline, a hypothesis-generation framework that operationalizes Hempel's deductive-nomological (DN) model for output form and deductive validity, Salmon's causal-process account for searching governing laws, and Armstrong's universals-relations view for bridging phenomena to laws. Given an explanandum, the pipeline abstracts universals from formation processes, retrieves laws, and deductively reconstructs novel testable explanations. It claims that this yields hypotheses that significantly outperform direct prompting in data-science modeling (judged by LLMs and humans) and that the two highest-scoring hypotheses were translated into novel algorithms reducing Transformer complexity with minimal performance loss and achieving competitive accuracy with fewer parameters.

Significance. If the claims hold with demonstrated deductive validity and quantitative support, the work could advance AI-for-science by providing a structured, philosophy-grounded alternative to pattern-recombination approaches in LLMs. The explicit linkage of three philosophical accounts to a concrete workflow and the downstream translation of hypotheses into algorithms are potential strengths if rigorously evidenced.

major comments (2)

[Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.
[Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.
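
For reference, the DN requirement invoked in the second comment is usually written as the schema below. The symbols are conventional notation assumed here, not taken from the paper: the L's are general laws, the C's are statements of initial conditions, and E is the explanandum. The `\text` command assumes the `amsmath` package.

```latex
\[
\underbrace{L_1,\ L_2,\ \dots,\ L_m}_{\text{general laws}},\quad
\underbrace{C_1,\ C_2,\ \dots,\ C_k}_{\text{initial conditions}}
\;\vdash\; E
\]
```

The turnstile is the load-bearing part: an argument merely laid out in this shape is not a DN explanation unless E is actually derivable from the premises.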

minor comments (1)

[Abstract] The sentence 'we translated the two highest-scoring hypotheses into novel algorithms one that reduces...' lacks punctuation (e.g., a colon or period) for readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive referee report. We address each major comment below. Where the comments identify gaps in the abstract or need for clarification on the framework, we will revise accordingly while preserving the core contributions.

Referee: [Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.

Authors: We agree the abstract presents the claims at a high level. The full manuscript (Sections 4–6) reports the experimental protocol, including LLM and human evaluation metrics for hypothesis quality, direct-prompting baselines, controls for prompt length and temperature, and statistical tests (e.g., paired t-tests) showing significant differences. The algorithm translations include complexity analysis and accuracy comparisons on standard benchmarks. We will revise the abstract to include concise quantitative summaries and pointers to these sections so the claims can be assessed from the abstract alone. Revision: yes.
Referee: [Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.

Authors: The observation is accurate: the pipeline implements the DN structure through LLM-guided natural-language steps rather than a formal theorem prover or entailment verifier. We will revise the abstract and framework sections to state explicitly that the workflow operationalizes the DN model heuristically (structuring prompts to encourage deductive reconstruction) while acknowledging that it does not guarantee formal logical validity. This change clarifies the distinction between philosophical inspiration and formal proof without altering the reported empirical results. Revision: yes.

Circularity Check

0 steps flagged

No circularity: framework operationalizes external philosophical accounts into empirical workflow

The paper's derivation chain consists of defining a hypothesis-generation pipeline that adopts Hempel's DN model for deductive form, Salmon's account for search constraints, and Armstrong's universals for bridging processes to laws, then applies this scaffold via LLMs to abstract universals, retrieve laws, and reconstruct explanations from an explanandum. Evaluation compares generated hypotheses against direct prompting baselines, with downstream translation to algorithms presented as empirical outcomes. No equations, fitted parameters, or self-citations appear in the provided text; the central claims rest on the operationalization of externally cited philosophical sources rather than any reduction of outputs to inputs by construction. The workflow is self-contained against external benchmarks (LLM and human expert judgments) without load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 1 invented entity

The central claim rests on the operationalization of three externally cited philosophical models into an LLM workflow; no free parameters are mentioned in the abstract.

axioms (3)

domain assumption: Hempel's deductive-nomological model supplies the output form and deductive validity of a hypothesis
Adopted as the scaffold for hypothesis output form.
domain assumption: Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws
Used to direct the search space.
domain assumption: Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws
Provides the mechanism linking processes to laws.

invented entities (1)

DN-Hypo-Pipeline · no independent evidence
purpose: Layered explanation-theoretic scaffold for LLM hypothesis generation
Newly introduced workflow that combines the three philosophical accounts.

Reference graph

Works this paper leans on

116 extracted references · 28 canonical work pages

[1]

Experimental Results Analysis and Discussion: The experiments show that not only can LLMs propose hypotheses, but also the best hypotheses generated by DN-Hypo-Pipeline improved the aggregate sum of scores by an average of more than 6 points (53 - 46.33 = 6.67) across a total of 80 scores (4 LLMs × 20 scores), as outlined in Table 6. Additionally, Figs....
[2]

Limitations and Conclusions: The limitations of our approach largely parallel the limitations of LLMs. When using an LLM to generate open-ended ontologies, such as when generating universals and laws, hallucinations can manifest in a particularly stubborn and intractable form. Hence, when a model is required to construct a complete conceptual system from...
[3]

https://en.wikipedia.org/wiki/Scientific_method

scientific method, (n.d.). https://en.wikipedia.org/wiki/Scientific_method
[4]

https://plato.stanford.edu/entries/scientific-method/

scientific-method, (n.d.). https://plato.stanford.edu/entries/scientific-method/
[5]

Hempel, Philosophy of Natural Science, Prentice Hall, 1966

C.G. Hempel, Philosophy of Natural Science, Prentice Hall, 1966

1966
[6]

S. Ren, P. Jian, Z. Ren, C. Leng, C. Xie, J. Zhang, Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents, (2025). http://arxiv.org/abs/2503.24047

arXiv 2025
[7]

Author contributions H.N

J. Gottweis, W.-H. Weng, A. Daryin, T. Tu, P. Sirkovic, A. Myaskovsky, G. Glowaty, F. Weissenberger, A. Orlandi, D. Popovici, A. Palepu, K. Rong, R. Tanno, K. Saab, F. Zhang, J. Blum, A. Carroll, K. Kulkarni, N. Tomašev, D. Zverinski, I. Rendulic, E. Vedadi, F. Hasler, L. Rimanic, M. Boia, I. Budiselic, B. Feinstein, M. Bellaiche, T. Sheffer, J. Freyberg,...

work page doi:10.1038/s41586-026-10644-y 2026
[8]

C. Lu, C. Lu, R.T. Lange, Y. Yamada, S. Hu, J. Foerster, D. Ha, J. Clune, Towards end -to-end automation of AI research, Nature 651 (2026) 914–919. https://doi.org/10.1038/s41586-026-10265-5

work page doi:10.1038/s41586-026-10265-5 2026
[9]

Z. Wang, B. Danek, Z. Yang, Z. Chen, J. Sun, Can Large Language Models Replace Data Scientists in Clinical Research?, Arxiv (2024) 1–28

2024
[10]

Sprueill, C

H.W. Sprueill, C. Edwards, K. Agarwal, M. V Olarte, U. Sanyal, C. Johnston, H. Liu, H. Ji, S. Choudhury, CHEMREASONER: heuristic search over a large language model ’s knowledge space using quantum-chemical feedback, in: Proc. 41st Int. Conf. Mach. Learn., JMLR.org, 2024

2024
[11]

C. Cao, X. Cao, M. Cashman, M. Kumar, A. Timoshenko, J. Yang, S. Yu, J. Zhang, Y. Zhu, B. Wernerfelt, How do successful scholars get their best research ideas? An exploration, Mark. Lett. 30 (2019) 221–232. https://www.jstor.org/stable/48701541

arXiv 2019
[12]

Salmon, W.C

W.C. Salmon, W.C. Salmon, Scientific Explanation and the Causal Structure of the World, Princeton University Press, Princeton, 2020. https://doi.org/doi:10.1515/9780691221489

work page doi:10.1515/9780691221489 2020
[13]

Salmon, Causality and Explanation: A Reply to Two Critiques, Philos

W.C. Salmon, Causality and Explanation: A Reply to Two Critiques, Philos. Sci. 64 (1997) 461–

1997
[14]

http://www.jstor.org/stable/188320 (accessed June 23, 2026)

2026
[15]

ARMSTRONG, Laws of Nature As Relations Between Universals, and As Universals, Philos

D.M. ARMSTRONG, Laws of Nature As Relations Between Universals, and As Universals, Philos. Top. 13 (1982) 7–24. http://www.jstor.org/stable/43153907

arXiv 1982
[16]

H., Steinbach, M., Banerjee, A., Ganguly, A., Shekhar, S., Samatova, N., and Kumar, V

A. Karpatne, G. Atluri, J.H. Faghmous, M. Steinbach, A. Banerjee, A. Ganguly, S. Shekhar, N. Samatova, V. Kumar, Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, IEEE Trans. Knowl. Data Eng. 29 (2017) 2318–2331. https://doi.org/10.1109/TKDE.2017.2720168

work page doi:10.1109/tkde.2017.2720168 2017
[17]

Ciucă, Y.-S

I. Ciucă, Y.-S. Ting, S. Kruk, K. Iyer, Harnessing the Power of Adversarial Prompting and Large Language Models for Robust Hypothesis Generation in Astronomy, (2023). http://arxiv.org/abs/2306.11648

arXiv 2023
[18]

O’Brien, J

T. O’Brien, J. Stremmel, L. Pio-Lopez, P. McMillen, C. Rasmussen-Ivey, M. Levin, Machine learning for hypothesis generation in biology and medicine: exploring the latent space of neuroscience and developmental bioelectricity, Digit. Discov. 3 (2024) 249–263. https://doi.org/https://doi.org/10.1039/d3dd00185g

work page doi:10.1039/d3dd00185g 2024
[19]

B. Qi, K. Zhang, K. Tian, H. Li, Z.-R. Chen, S. Zeng, E. Hua, H. Jinfang, B. Zhou, Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation, (2024). http://arxiv.org/abs/2407.08940

arXiv 2024
[20]

Radensky, S

M. Radensky, S. Shahid, R. Fok, P. Siangliulue, T. Hope, D.S. Weld, Scideator: Human -LLM Compound System for Scientific Ideation through Facet Recombination and Novelty Evaluation, in: Proc. ACM Conf. AI Agentic Syst., Association for Computing Machinery, New York, NY, USA, 2026: pp. 348–374. https://doi.org/10.1145/3786335.3813161

work page doi:10.1145/3786335.3813161 2026
[21]

doi:10.18653/v1/2025.naacl-long.342 , url=

J. Baek, S.K. Jauhar, S. Cucerzan, S.J. Hwang, ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models, Proc. 2025 Annu. Conf. Nations Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. Long Pap. NAACL -HLT 2025 1 (2025) 6709–6738. https://doi.org/10.18653/v1/2025.naacl-long.342

work page doi:10.18653/v1/2025.naacl-long.342 2025
[22]

O’Neill, T

C. O’Neill, T. Ghosal, R. Răileanu, M. Walmsley, T. Bui, K. Schawinski, I. Ciucă, Sparks of Science: Hypothesis Generation Using Structured Paper Data, (2025). http://arxiv.org/abs/2504.12976

arXiv 2025
[23]

R. Li, L. Jing, C. Han, J. Zhou, X. Du, Learning to Generate Research Idea with Dynamic Control, (2024). http://arxiv.org/abs/2412.14626

arXiv 2024
[24]

Afonja, I

T. Afonja, I. Sheth, R. Binkyte, W. Hanif, T. Ulas, M. Becker, M. Fritz, LLM4GRN: Discovering Causal Gene Regulatory Networks with LLMs -- Evaluation through Synthetic Data Generation, (2024). http://arxiv.org/abs/2410.15828

arXiv 2024
[25]

& Buehler, M

A. Ghafarollahi, M.J. Buehler, SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning, Adv. Mater. 37 (2025) 2413523. https://doi.org/https://doi.org/10.1002/adma.202413523

work page doi:10.1002/adma.202413523 2025
[26]

Xiong, E

G. Xiong, E. Xie, A.H. Shariatmadari, S. Guo, S. Bekiranov, A. Zhang, Imp roving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models, (2024). http://arxiv.org/abs/2411.02382

arXiv 2024
[27]

C. Si, D. Yang, T. Hashimoto, Can LLMs Generate Novel Research Ideas? A Large -Scale Human Study with 100+ NLP Researchers, in: Y. Yue, A. Garg, N. Peng, F. Sha, R. Yu (Eds.), Int. Conf. Learn. Represent., 2025: pp. 94003–94092. https://proceedings.iclr.cc/paper_files/paper/2025/file/ea94957d81b1c1caf87ef5319fa6b467 -Paper-Conference.pdf

2025
[28]

Q. Wang, D. Downey, H. Ji, T. Hope, {S}ci{MON}: Scientific Inspiration Machines Optimized for Novelty, in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proc. 62nd Annu. Meet. Assoc. Comput. Linguist. (Volume 1 Long Pap., Association for Computational Linguistics, Bangkok, Thailand, 2024: pp. 279–299. https://doi.org/10.18653/v1/2024.acl-long.18

work page doi:10.18653/v1/2024.acl-long.18 2024
[29]

Z. Yang, X. Du, J. Li, J. Zheng, S. Poria, E. Cambria, Large language models for automated open-domain scientific hypotheses discovery, in: Find. Assoc. Comput. Linguist. ACL 2024, 2024: pp. 13545–13565

2024
[30]

Y. Pu, T. Lin, H. Chen, PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration, (2025). http://arxiv.org/abs/2505.15047

arXiv 2025
[31]

Y. Pu, T. Lin, H. Chen, Principle-Evolvable Scientific Discovery via Uncertainty Minimization, (2026). http://arxiv.org/abs/2602.06448

Pith/arXiv arXiv 2026
[32]

R. Vasu, C. Basu, B. Dalvi Mishra, C. Sarasua, P. Clark, A. Bernstein, {H}yp{ER}: Literature-grounded Hypothesis Generation and Distillation with Provenance, in: C. Christodoulopoulos, T. Chakraborty, C. Rose, V. Peng (Eds.), Proc. 2025 Conf. Empir. Methods Nat. Lang. Process., Association for Computational Linguistics, Suzhou, China, 2025: pp. 25413–2543...

work page doi:10.18653/v1/2025.emnlp-main.1292 2025
[33]

Z. Yang, W. Liu, B. Gao, T. Xie, Y. Li, W. Ouyang, S. Poria, E. Cambria, D. Zhou, {MOOSE}-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses, in: Thirteen. Int. Conf. Learn. Represent., 2025. https://openreview.net/forum?id=X9OfMNNepI

2025
[34]

Y. Liu, Z. Yang, T. Xie, J. Ni, B. Gao, Y. Li, S. Tang, W. Ouyang, E. Cambria, D. Zhou, ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition, (2025). http://arxiv.org/abs/2503.21248

Pith/arXiv arXiv 2025
[35]

https://plato.stanford.edu/archives/win2019/entries/scientific-explanation/

Scientific Explanation, (n.d.). https://plato.stanford.edu/archives/win2019/entries/scientific-explanation/
[36]

Hempel, P

C.G. Hempel, P. Oppenheim, Studies in the Logic of Explanation, Philos. Sci. 15 (1948) 135 –175. http://www.jstor.org/stable/185169 (accessed June 23, 2026)

1948
[37]

S. Yao, D. Yu, J. Zhao, I. Shafran, T.L. Griffiths, Y. Cao, K. Narasimhan, Tree of Thoughts: Deliberate Problem Solving with Large Language Models, Adv. Neural Inf. Process. Syst. 36 (2023) 1–14

2023
[38]

Cooper, How to write an original research paper (and get it published)., J

I.D. Cooper, How to write an original research paper (and get it published)., J. Med. Libr. Assoc. 103 (2015) 67–68. https://doi.org/10.3163/1536-5050.103.2.001

work page doi:10.3163/1536-5050.103.2.001 2015
[39]

Sollaci, M.G

L.B. Sollaci, M.G. Pereira, The introduction, methods, results, and discussion (IMRAD) structure: a fifty -year survey., J. Med. Libr. Assoc. 92 (2004) 364–367

2004
[40]

R. Arp, B. Smith, A.D. Spear, Building Ontologies with Basic Formal Ontology, The MIT Press,
[41]

http://www.jstor.org/stable/j.ctt17kk7vw
[42]

Smith, CLASSIFYING PROCESSES: AN ESSAY IN APPLIED ONTOLOGY., Ratio 25 (2012) 463 –

B. Smith, CLASSIFYING PROCESSES: AN ESSAY IN APPLIED ONTOLOGY., Ratio 25 (2012) 463 –

2012
[43]

https://doi.org/10.1111/j.1467-9329.2012.00557.x

work page doi:10.1111/j.1467-9329.2012.00557.x 2012
[44]

J. Li, H. Yu, X. Luo, Q. Liu, {COSIGN}: Contextual Facts Guided Generation for Knowledge Graph Completion, in: K. Duh, H. Gomez, S. Bethard (Eds.), Proc. 2024 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. (Volume 1 Long Pap., Association for Computational Linguistics, Mexico City, Mexico, 2024: pp. 1669–1682. https://doi.org/10.1865...

work page doi:10.18653/v1/2024.naacl-long.93 2024
[45]

S. Toro, A. V Anagnostopoulos, S.M. Bello, K. Blumberg, R. Cameron, L. Carmody, A.D. Diehl, D.M. Dooley, W.D. Duncan, P. Fey, P. Gaudet, N.L. Harris, M.P. Joachimiak, L. Kiani, T. Lubiana, M.C. Munoz-Torres, S. O‘Neil, D. Osumi-Sutherland, A. Puig-Barbe, J.T. Reese, L. Reiser, S.M.C. Robb, T. Ruemping, J. Seager, E. Sid, R. Stefancsik, M. Weber, V. Wood, ...

work page doi:10.1186/s13326-024-00320-3 2024
[46]

J. Gu, X. Jiang, Z. Shi, H. Tan, X. Zhai, C. Xu, W. Li, Y. Shen, S. Ma, H. Liu, S. Wang, K. Zhang, Z. Lin, B. Zhang, L. Ni, W. Gao, Y. Wang, J. Guo, A survey on LLM-as-a-judge, Innov. 7 (2026) 101253. https://doi.org/https://doi.org/10.1016/j.xinn.2025.101253

work page doi:10.1016/j.xinn.2025.101253 2026
[47]

D. Li, B. Jiang, L. Huang, A. Beigi, C. Zhao, Z. Tan, A. Bhattacharjee, Y. Jiang, C. Chen, T. Wu, K. Shu, L. Cheng, H. Liu, From Generation to Judgment: Opportunities and Challenges of {LLM}-as-a-judge, in: C. Christodoulopoulos, T. Chakraborty, C. Rose, V. Peng (Eds.), Proc. 2025 Conf. Empir. Methods Nat. Lang. Process., Association for Computational Lin...

work page doi:10.18653/v1/2025.emnlp-main.138 2025
[48]

Z. Yue, H. Zeng, L. Shang, Y. Liu, Y. Zhang, D. Wang, Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments, in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proc. 62nd Annu. Meet. Assoc. Comput. Linguist. (Volume 1 Long Pap., Association for Computational Linguistics, Bangkok, Thailand, 2024: pp. 10331–10343. https://doi.org/10.18653/v...

work page doi:10.18653/v1/2024.acl-long.556 2024
[49]

Crisan, B

A. Crisan, B. Fiore-Gartland, M. Tory, Passing the Data Baton : A Retrospective Analysis on Data Science Work and Workers, IEEE Trans. Vis. Comput. Graph. 27 (2021) 1860 –1870. https://doi.org/10.1109/TVCG.2020.3030340

work page doi:10.1109/tvcg.2020.3030340 2021
[50]

Giordano, M.D

F.R. Giordano, M.D. Weir, A first course in mathematical modeling / Frank R. Giordano, Maurice D. Weir., Brooks/Cole Pub. Co., Monterey, CA, 1985

1985
[51]

https://doi.org/https://doi.org/10.1007/978-1-4614-7276-6

Glenn Ledder, Mathematics for the Life Sciences, Springer New York, NY, 2016. https://doi.org/https://doi.org/10.1007/978-1-4614-7276-6

work page doi:10.1007/978-1-4614-7276-6 2016
[52]

https://en.wikipedia.org/wiki/Law_(principle)

Law (principle), (n.d.). https://en.wikipedia.org/wiki/Law_(principle)
[53]

K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, in: 2016 IEEE Conf. Comput. Vis. Pattern Recognit., 2016: pp. 770–778. https://doi.org/10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016
[54]

Vaswani, N

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is All you Need, in: I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), Adv. Neural Inf. Process. Syst., Curran Associates, Inc., 2017. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee24354...

2017
[55]

Mikolov, K

T. Mikolov, K. Chen, G.S. Corrado, J. Dean, Efficient Estimation of Word Representations in Vector Space, in: Int. Conf. Learn. Represent., 2013. https://api.semanticscholar.org/CorpusID:5959482

2013
[56]

https://doi.org/10.34740/KAGGLE/DSV/7548853

arXiv.org submitters, arXiv Dataset, (2024). https://doi.org/10.34740/KAGGLE/DSV/7548853

work page doi:10.34740/kaggle/dsv/7548853 2024
[57]

Sampson, L

M. Sampson, L. Zhang, A. Morrison, N.J. Barrowman, T.J. Clifford, R.W. Platt, T.P. Klassen, D. Moher, An alternative to the hand searching gold standard: validating methodological search filters using relative recall, BMC Med. Res. Methodol. 6 (2006) 33. https://doi.org/10.1186/1471-2288-6-33

work page doi:10.1186/1471-2288-6-33 2006
[58]

https://en.wikipedia.org/wiki/Amdahl%27s_law

Amdahl’s Law, (n.d.). https://en.wikipedia.org/wiki/Amdahl%27s_law
[59]

https://en.wikipedia.org/wiki/Zipf%27s_law

Zipf’s law, (n.d.). https://en.wikipedia.org/wiki/Zipf%27s_law
[60]

Woolson, Wilcoxon Signed-Rank Test, in: Wiley Encycl

R.F. Woolson, Wilcoxon Signed-Rank Test, in: Wiley Encycl. Clin. Trials, John Wiley & Sons, Ltd, 2008: pp. 1–3. https://doi.org/https://doi.org/10.1002/9780471462422.eoct979

work page doi:10.1002/9780471462422.eoct979 2008
[61]

Mangiafico, Scheirer–Ray–Hare Test, in: Summ

Salvatore S. Mangiafico, Scheirer–Ray–Hare Test, in: Summ. Anal. Ext. Progr. Eval. R, 2016. https://rcompanion.org/handbook/F_14.html

2016
[62]

Nixon, A.S

M.S. Nixon, A.S. Aguado, 12 - Distance, classification and learning, in: Featur. Extr. Image Process. Comput. Vis. (Fourth Ed., Fourth Edition, Academic Press, 2020: pp. 571 –604. https://doi.org/https://doi.org/10.1016/B978-0-12-814976-8.00012-9

work page doi:10.1016/b978-0-12-814976-8.00012-9 2020
[63]

Y.-H.H. Tsai, S. Bai, M. Yamada, L.-P. Morency, R. Salakhutdinov, Transformer Dissection: An Unified Understanding for Transformer’s Attention via the Lens of Kernel, in: K. Inui, J. Jiang, V. Ng, X. Wan (Eds.), Proc. 2019 Conf. Empir. Methods Nat. Lang. Process. 9th Int. Jt. Conf. Nat. Lang. Process., Association for Computational Linguistics, Hong Kon...

[64]

https://doi.org/10.18653/v1/D19-1443

[65]

P.N. Swarztrauber, On Computing the Points and Weights for Gauss–Legendre Quadrature, SIAM J. Sci. Comput. 24 (2003) 945–954. https://doi.org/10.1137/S1064827500379690

[66]

P.S. Heckbert, Fourier Transforms and the Fast Fourier Transform (FFT) Algorithm, 1998. https://api.semanticscholar.org/CorpusID:6022157

[67]

A. Katharopoulos, A. Vyas, N. Pappas, F. Fleuret, Transformers are RNNs: fast autoregressive transformers with linear attention, in: Proc. 37th Int. Conf. Mach. Learn., JMLR.org, 2020

[68]

Y. Chen, K. Ren, Y. Wang, Y. Fang, W. Sun, D. Li, ContiFormer: continuous-time transformer for irregular time series modeling, in: Proc. 37th Int. Conf. Neural Inf. Process. Syst., Curran Associates Inc., Red Hook, NY, USA, 2023

[69]

A. Baevski, M. Auli, Adaptive Input Representations for Neural Language Modeling, in: Int. Conf. Learn. Represent., 2019. https://openreview.net/forum?id=ByxZX20qFQ

[70]

T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed representations of words and phrases and their compositionality, in: Proc. 27th Int. Conf. Neural Inf. Process. Syst. - Vol. 2, Curran Associates Inc., Red Hook, NY, USA, 2013: pp. 3111–3119

[71]

Y. Pinter, R. Guthrie, J. Eisenstein, Mimicking Word Embeddings using Subword RNNs, in: M. Palmer, R. Hwa, S. Riedel (Eds.), Proc. 2017 Conf. Empir. Methods Nat. Lang. Process., Association for Computational Linguistics, Copenhagen, Denmark, 2017: pp. 102–112. https://doi.org/10.18653/v1/D17-1010

[72]

R. Shu, H. Nakayama, Compressing Word Embeddings via Deep Compositional Code Learning, in: Int. Conf. Learn. Represent., 2018. https://openreview.net/forum?id=BJRZzFlRb

[73]

P. Bojanowski, E. Grave, A. Joulin, T. Mikolov, Enriching Word Vectors with Subword Information, TACL 5 (2017) 135–146. http://dblp.uni-trier.de/db/journals/tacl/tacl5.html#BojanowskiGJM17

[74]

D. Svenstrup, J.M. Hansen, O. Winther, Hash embeddings for efficient word representations, in: Proc. 31st Int. Conf. Neural Inf. Process. Syst., Curran Associates Inc., Red Hook, NY, USA, 2017: pp. 4935–4943

[75]

G.E. Karniadakis, I.G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nat. Rev. Phys. 3 (2021) 422–440. https://doi.org/10.1038/s42254-021-00314-5.

Appendix A. Continuous-Time Attention Transformer A.1 Overall Architecture CTAT (Continuous-Time Attention Transformer) Models: Instead of using positional encoding and...

[76]

Dense Softmax, which explicitly computes $O(L^2)$ attention scores and adds a Gaussian distance kernel bias

[77]

Dense Linear (ELU+1), which is a linear form of attention that explicitly computes the kernel-weighted sum in $O(L^2)$

[78]

FFT Linear, which uses the convolution theorem and a Fast Fourier Transform to reduce the complexity of the linear attention to $O(L \log L)$

[79]

Gauss-Legendre finite-interval approximation, which approximates the continuous-time integral using a fixed number of quadrature nodes but is restricted to a learnable causal window, at a cost of $O(LM)$, where $M$ is the number of interpolation nodes. A.2 Key Mathematical Definitions A.2.1 Gaussian Distance Kernel For positions $i$ and $j$ ($j \le i$), the time distance is $r =$...

[80]

[First step in the logical deduction connecting laws and conditions to the result.]

Showing first 80 references.

[1]

Experimental Results Analysis and Discussion The experiments show that not only can LLMs propose hypotheses, but also the best hypotheses generated by DN-Hypo-Pipeline improved the aggregate sum of scores by an average of more than 6 points (53 − 46.33 = 6.67) across a total of 80 scores (4 LLMs × 20 scores each), as outlined in Table 6. Additionally, Figs....

[2]

Limitations and Conclusions The limitations of our approach largely parallel the limitations of LLMs. When using an LLM to generate open-ended ontologies, such as when generating universals and laws, hallucinations can manifest in a particularly stubborn and intractable form. Hence, when a model is required to construct a complete conceptual system from...

[3]

Scientific method, (n.d.). https://en.wikipedia.org/wiki/Scientific_method

[4]

Scientific Method, (n.d.). https://plato.stanford.edu/entries/scientific-method/

[5]

C.G. Hempel, Philosophy of Natural Science, Prentice Hall, 1966

[6]

S. Ren, P. Jian, Z. Ren, C. Leng, C. Xie, J. Zhang, Towards Scientific Intelligence: A Survey of LLM-based Scientific Agents, (2025). http://arxiv.org/abs/2503.24047

[7]

J. Gottweis, W.-H. Weng, A. Daryin, T. Tu, P. Sirkovic, A. Myaskovsky, G. Glowaty, F. Weissenberger, A. Orlandi, D. Popovici, A. Palepu, K. Rong, R. Tanno, K. Saab, F. Zhang, J. Blum, A. Carroll, K. Kulkarni, N. Tomašev, D. Zverinski, I. Rendulic, E. Vedadi, F. Hasler, L. Rimanic, M. Boia, I. Budiselic, B. Feinstein, M. Bellaiche, T. Sheffer, J. Freyberg,...

doi:10.1038/s41586-026-10644-y

[8]

C. Lu, C. Lu, R.T. Lange, Y. Yamada, S. Hu, J. Foerster, D. Ha, J. Clune, Towards end-to-end automation of AI research, Nature 651 (2026) 914–919. https://doi.org/10.1038/s41586-026-10265-5

[9]

Z. Wang, B. Danek, Z. Yang, Z. Chen, J. Sun, Can Large Language Models Replace Data Scientists in Clinical Research?, arXiv (2024) 1–28

[10]

H.W. Sprueill, C. Edwards, K. Agarwal, M.V. Olarte, U. Sanyal, C. Johnston, H. Liu, H. Ji, S. Choudhury, CHEMREASONER: heuristic search over a large language model’s knowledge space using quantum-chemical feedback, in: Proc. 41st Int. Conf. Mach. Learn., JMLR.org, 2024

[11]

C. Cao, X. Cao, M. Cashman, M. Kumar, A. Timoshenko, J. Yang, S. Yu, J. Zhang, Y. Zhu, B. Wernerfelt, How do successful scholars get their best research ideas? An exploration, Mark. Lett. 30 (2019) 221–232. https://www.jstor.org/stable/48701541

[12]

W.C. Salmon, Scientific Explanation and the Causal Structure of the World, Princeton University Press, Princeton, 2020. https://doi.org/10.1515/9780691221489

[13]

W.C. Salmon, Causality and Explanation: A Reply to Two Critiques, Philos. Sci. 64 (1997) 461–

[14]

http://www.jstor.org/stable/188320 (accessed June 23, 2026)

[15]

D.M. Armstrong, Laws of Nature As Relations Between Universals, and As Universals, Philos. Top. 13 (1982) 7–24. http://www.jstor.org/stable/43153907

[16]

A. Karpatne, G. Atluri, J.H. Faghmous, M. Steinbach, A. Banerjee, A. Ganguly, S. Shekhar, N. Samatova, V. Kumar, Theory-Guided Data Science: A New Paradigm for Scientific Discovery from Data, IEEE Trans. Knowl. Data Eng. 29 (2017) 2318–2331. https://doi.org/10.1109/TKDE.2017.2720168

[17]

I. Ciucă, Y.-S. Ting, S. Kruk, K. Iyer, Harnessing the Power of Adversarial Prompting and Large Language Models for Robust Hypothesis Generation in Astronomy, (2023). http://arxiv.org/abs/2306.11648

[18]

T. O’Brien, J. Stremmel, L. Pio-Lopez, P. McMillen, C. Rasmussen-Ivey, M. Levin, Machine learning for hypothesis generation in biology and medicine: exploring the latent space of neuroscience and developmental bioelectricity, Digit. Discov. 3 (2024) 249–263. https://doi.org/10.1039/d3dd00185g

[19]

B. Qi, K. Zhang, K. Tian, H. Li, Z.-R. Chen, S. Zeng, E. Hua, H. Jinfang, B. Zhou, Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation, (2024). http://arxiv.org/abs/2407.08940

[20]

M. Radensky, S. Shahid, R. Fok, P. Siangliulue, T. Hope, D.S. Weld, Scideator: Human-LLM Compound System for Scientific Ideation through Facet Recombination and Novelty Evaluation, in: Proc. ACM Conf. AI Agentic Syst., Association for Computing Machinery, New York, NY, USA, 2026: pp. 348–374. https://doi.org/10.1145/3786335.3813161

[21]

J. Baek, S.K. Jauhar, S. Cucerzan, S.J. Hwang, ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models, Proc. 2025 Annu. Conf. Nations Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. Long Pap. NAACL-HLT 2025 1 (2025) 6709–6738. https://doi.org/10.18653/v1/2025.naacl-long.342

[22]

C. O’Neill, T. Ghosal, R. Răileanu, M. Walmsley, T. Bui, K. Schawinski, I. Ciucă, Sparks of Science: Hypothesis Generation Using Structured Paper Data, (2025). http://arxiv.org/abs/2504.12976

[23]

R. Li, L. Jing, C. Han, J. Zhou, X. Du, Learning to Generate Research Idea with Dynamic Control, (2024). http://arxiv.org/abs/2412.14626

[24]

T. Afonja, I. Sheth, R. Binkyte, W. Hanif, T. Ulas, M. Becker, M. Fritz, LLM4GRN: Discovering Causal Gene Regulatory Networks with LLMs -- Evaluation through Synthetic Data Generation, (2024). http://arxiv.org/abs/2410.15828

[25]

A. Ghafarollahi, M.J. Buehler, SciAgents: Automating Scientific Discovery Through Bioinspired Multi-Agent Intelligent Graph Reasoning, Adv. Mater. 37 (2025) 2413523. https://doi.org/10.1002/adma.202413523

[26]

G. Xiong, E. Xie, A.H. Shariatmadari, S. Guo, S. Bekiranov, A. Zhang, Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models, (2024). http://arxiv.org/abs/2411.02382

[27]

C. Si, D. Yang, T. Hashimoto, Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers, in: Y. Yue, A. Garg, N. Peng, F. Sha, R. Yu (Eds.), Int. Conf. Learn. Represent., 2025: pp. 94003–94092. https://proceedings.iclr.cc/paper_files/paper/2025/file/ea94957d81b1c1caf87ef5319fa6b467-Paper-Conference.pdf

[28]

Q. Wang, D. Downey, H. Ji, T. Hope, SciMON: Scientific Inspiration Machines Optimized for Novelty, in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proc. 62nd Annu. Meet. Assoc. Comput. Linguist. (Volume 1: Long Pap.), Association for Computational Linguistics, Bangkok, Thailand, 2024: pp. 279–299. https://doi.org/10.18653/v1/2024.acl-long.18

[29]

Z. Yang, X. Du, J. Li, J. Zheng, S. Poria, E. Cambria, Large language models for automated open-domain scientific hypotheses discovery, in: Find. Assoc. Comput. Linguist. ACL 2024, 2024: pp. 13545–13565

[30]

Y. Pu, T. Lin, H. Chen, PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration, (2025). http://arxiv.org/abs/2505.15047

[31]

Y. Pu, T. Lin, H. Chen, Principle-Evolvable Scientific Discovery via Uncertainty Minimization, (2026). http://arxiv.org/abs/2602.06448

[32]

R. Vasu, C. Basu, B. Dalvi Mishra, C. Sarasua, P. Clark, A. Bernstein, HypER: Literature-grounded Hypothesis Generation and Distillation with Provenance, in: C. Christodoulopoulos, T. Chakraborty, C. Rose, V. Peng (Eds.), Proc. 2025 Conf. Empir. Methods Nat. Lang. Process., Association for Computational Linguistics, Suzhou, China, 2025: pp. 25413–2543...

doi:10.18653/v1/2025.emnlp-main.1292

[33]

Z. Yang, W. Liu, B. Gao, T. Xie, Y. Li, W. Ouyang, S. Poria, E. Cambria, D. Zhou, MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses, in: Thirteen. Int. Conf. Learn. Represent., 2025. https://openreview.net/forum?id=X9OfMNNepI

[34]

Y. Liu, Z. Yang, T. Xie, J. Ni, B. Gao, Y. Li, S. Tang, W. Ouyang, E. Cambria, D. Zhou, ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition, (2025). http://arxiv.org/abs/2503.21248

[35]

Scientific Explanation, (n.d.). https://plato.stanford.edu/archives/win2019/entries/scientific-explanation/

[36]

C.G. Hempel, P. Oppenheim, Studies in the Logic of Explanation, Philos. Sci. 15 (1948) 135–175. http://www.jstor.org/stable/185169 (accessed June 23, 2026)

[37]

S. Yao, D. Yu, J. Zhao, I. Shafran, T.L. Griffiths, Y. Cao, K. Narasimhan, Tree of Thoughts: Deliberate Problem Solving with Large Language Models, Adv. Neural Inf. Process. Syst. 36 (2023) 1–14

[38]

I.D. Cooper, How to write an original research paper (and get it published), J. Med. Libr. Assoc. 103 (2015) 67–68. https://doi.org/10.3163/1536-5050.103.2.001

[39]

L.B. Sollaci, M.G. Pereira, The introduction, methods, results, and discussion (IMRAD) structure: a fifty-year survey, J. Med. Libr. Assoc. 92 (2004) 364–367

[40]

R. Arp, B. Smith, A.D. Spear, Building Ontologies with Basic Formal Ontology, The MIT Press,

[41]

http://www.jstor.org/stable/j.ctt17kk7vw

[42]

B. Smith, Classifying Processes: An Essay in Applied Ontology, Ratio 25 (2012) 463–

[43]

https://doi.org/10.1111/j.1467-9329.2012.00557.x

[44]

J. Li, H. Yu, X. Luo, Q. Liu, COSIGN: Contextual Facts Guided Generation for Knowledge Graph Completion, in: K. Duh, H. Gomez, S. Bethard (Eds.), Proc. 2024 Conf. North Am. Chapter Assoc. Comput. Linguist. Hum. Lang. Technol. (Volume 1: Long Pap.), Association for Computational Linguistics, Mexico City, Mexico, 2024: pp. 1669–1682. https://doi.org/10.1865...

doi:10.18653/v1/2024.naacl-long.93

[45]

S. Toro, A.V. Anagnostopoulos, S.M. Bello, K. Blumberg, R. Cameron, L. Carmody, A.D. Diehl, D.M. Dooley, W.D. Duncan, P. Fey, P. Gaudet, N.L. Harris, M.P. Joachimiak, L. Kiani, T. Lubiana, M.C. Munoz-Torres, S. O’Neil, D. Osumi-Sutherland, A. Puig-Barbe, J.T. Reese, L. Reiser, S.M.C. Robb, T. Ruemping, J. Seager, E. Sid, R. Stefancsik, M. Weber, V. Wood, ...

doi:10.1186/s13326-024-00320-3

[46]

J. Gu, X. Jiang, Z. Shi, H. Tan, X. Zhai, C. Xu, W. Li, Y. Shen, S. Ma, H. Liu, S. Wang, K. Zhang, Z. Lin, B. Zhang, L. Ni, W. Gao, Y. Wang, J. Guo, A survey on LLM-as-a-judge, Innov. 7 (2026) 101253. https://doi.org/10.1016/j.xinn.2025.101253

[47]

D. Li, B. Jiang, L. Huang, A. Beigi, C. Zhao, Z. Tan, A. Bhattacharjee, Y. Jiang, C. Chen, T. Wu, K. Shu, L. Cheng, H. Liu, From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge, in: C. Christodoulopoulos, T. Chakraborty, C. Rose, V. Peng (Eds.), Proc. 2025 Conf. Empir. Methods Nat. Lang. Process., Association for Computational Lin...

doi:10.18653/v1/2025.emnlp-main.138

[48]

Z. Yue, H. Zeng, L. Shang, Y. Liu, Y. Zhang, D. Wang, Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments, in: L.-W. Ku, A. Martins, V. Srikumar (Eds.), Proc. 62nd Annu. Meet. Assoc. Comput. Linguist. (Volume 1: Long Pap.), Association for Computational Linguistics, Bangkok, Thailand, 2024: pp. 10331–10343. https://doi.org/10.18653/v...

doi:10.18653/v1/2024.acl-long.556

[49]

A. Crisan, B. Fiore-Gartland, M. Tory, Passing the Data Baton: A Retrospective Analysis on Data Science Work and Workers, IEEE Trans. Vis. Comput. Graph. 27 (2021) 1860–1870. https://doi.org/10.1109/TVCG.2020.3030340

[50]

F.R. Giordano, M.D. Weir, A First Course in Mathematical Modeling, Brooks/Cole Pub. Co., Monterey, CA, 1985

[51]

G. Ledder, Mathematics for the Life Sciences, Springer New York, NY, 2016. https://doi.org/10.1007/978-1-4614-7276-6

[52]

Law (principle), (n.d.). https://en.wikipedia.org/wiki/Law_(principle)

[53]

K. He, X. Zhang, S. Ren, J. Sun, Deep Residual Learning for Image Recognition, in: 2016 IEEE Conf. Comput. Vis. Pattern Recognit., 2016: pp. 770–778. https://doi.org/10.1109/CVPR.2016.90

[54]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is All you Need, in: I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), Adv. Neural Inf. Process. Syst., Curran Associates, Inc., 2017. https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee24354...
