
arxiv: 2606.08532 · v4 · pith:SVMPNIBAnew · submitted 2026-06-07 · 💻 cs.AI

DN-Hypo-Pipeline: An AI-Driven Workflow for Generating Hypotheses using Large Language Models and Scientific Explanations

Pith reviewed 2026-06-27 18:58 UTC · model grok-4.3

classification 💻 cs.AI
keywords: hypothesis generation · large language models · scientific explanation · deductive-nomological model · transformer algorithms · AI for science · causal process

The pith

A workflow that applies the structure of scientific explanations lets large language models generate hypotheses that outperform those from direct prompting and can be turned into working algorithms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DN-Hypo-Pipeline, a framework that operationalizes three accounts of scientific explanation to direct how an LLM generates hypotheses from an explanandum. The system abstracts universals from the phenomenon's formation process, retrieves laws relating those universals, and deductively reconstructs a new testable explanation instead of recombining patterns from existing literature. When evaluated in data-science modeling by both LLMs and human experts, the resulting hypotheses score higher than those produced by direct prompting. The two highest-scoring hypotheses were implemented as novel algorithms, one that lowers the Transformer's theoretical complexity with only minimal performance loss and another that reaches competitive accuracy using substantially fewer parameters.

Core claim

The DN-Hypo-Pipeline adopts a layered scaffold in which Hempel's deductive-nomological model supplies the output form and deductive validity, Salmon's causal-process account organizes the search for governing laws, and Armstrong's view of laws as relations between universals bridges from a phenomenon's constituent processes to candidate laws. Given an explanandum, the workflow abstracts the universals instantiated in the formation process, retrieves the laws that relate those universals, and deductively reconstructs a new, testable explanation. Hypotheses produced this way significantly outperform those from direct prompting in expert and LLM judgments, and the two top hypotheses were transl

What carries the argument

The DN-Hypo-Pipeline, a layered explanation-theoretic scaffold that combines Hempel's DN model for hypothesis form, Salmon's causal-process account for search constraints, and Armstrong's universals-relations view to connect processes to laws, so that the LLM abstracts universals, retrieves laws, and deductively generates new hypotheses.
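The three stages named above can be sketched as a short chain of LLM calls. This is a minimal illustration only, not the authors' implementation: the `LLM` callable interface, the prompt wording, the `DNHypothesis` record, and the `stub_llm` stand-in are all hypothetical names introduced here, and the canned replies exist only to make the sketch runnable.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class DNHypothesis:
    explanandum: str
    universals: List[str]   # abstracted from the formation process
    laws: List[str]         # retrieved relations between those universals
    explanation: str        # natural-language reconstruction, not a formal proof


def _lines(text: str) -> List[str]:
    """Split a bulleted reply into clean, non-empty items."""
    return [ln.strip("-• \t") for ln in text.splitlines() if ln.strip("-• \t")]


def generate_hypothesis(explanandum: str, llm: LLM) -> DNHypothesis:
    # Stage 1: abstract the universals instantiated in the formation process.
    universals = _lines(
        llm(f"List the universals instantiated in the process that forms: {explanandum}")
    )
    # Stage 2: retrieve laws that relate those universals.
    laws = _lines(llm("List laws relating these universals:\n" + "\n".join(universals)))
    # Stage 3: ask for an explanation that uses only the retrieved laws as premises.
    explanation = llm(
        "Using only these laws as premises, derive an explanation of the explanandum.\n"
        f"Explanandum: {explanandum}\nLaws:\n" + "\n".join(laws)
    ).strip()
    return DNHypothesis(explanandum, universals, laws, explanation)


def stub_llm(prompt: str) -> str:
    """Canned replies standing in for a real model (illustrative content only)."""
    if prompt.startswith("List the universals"):
        return "- sequence length\n- pairwise interaction"
    if prompt.startswith("List laws"):
        return "- pairwise interaction cost grows quadratically with sequence length"
    return "Attention cost grows quadratically because every token pair interacts."


if __name__ == "__main__":
    print(generate_hypothesis("dense attention is expensive on long inputs", stub_llm))
```

Nothing in this sketch verifies that the stage 3 output actually follows from the stage 2 laws; that gap is the one the referee report raises.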

If this is right

  • Hypotheses generated through this principled reasoning significantly outperform those from direct prompting when judged by both LLMs and human experts.
  • The two highest-scoring hypotheses were translated into novel algorithms, one reducing the Transformer's theoretical complexity with only minimal performance loss and another achieving competitive accuracy with substantially fewer parameters.
  • The framework searches the space of principles that govern a phenomenon rather than the space of what has already been written.
  • The layered scaffold supplies output form and deductive validity from Hempel, search constraints from Salmon, and the bridge from processes to laws from Armstrong.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same workflow could be tested on phenomena outside data science, such as physical or biological systems, to see whether the philosophical scaffolding transfers.
  • If the deductive reconstructions remain valid across domains, the approach might support iterative refinement where a generated hypothesis is tested and the results fed back to update the universals or laws.
  • One could examine whether the method produces hypotheses that are more readily falsifiable in experiment than those from unguided prompting.

Load-bearing premise

The three cited philosophical accounts of explanation can be operationalized into an LLM workflow that reliably abstracts universals from a phenomenon's formation process, retrieves governing laws, and produces deductively valid novel hypotheses.

What would settle it

A side-by-side evaluation in which human experts or LLMs rate direct-prompt hypotheses as equal to or better than those from the DN-Hypo-Pipeline, or in which the two translated Transformer algorithms fail to show the claimed reductions in complexity or parameter count while preserving accuracy.
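One way such a side-by-side comparison could be scored is with a paired test on per-item ratings. The sketch below uses an exact two-sided sign test because it needs only the standard library; it is not the test the paper reports, and the function name and the example scores are hypothetical.

```python
from math import comb
from typing import Sequence


def sign_test_p_value(pipeline: Sequence[float], baseline: Sequence[float]) -> float:
    """Exact two-sided sign test on paired scores; tied pairs are dropped."""
    if len(pipeline) != len(baseline):
        raise ValueError("scores must be paired item by item")
    diffs = [p - b for p, b in zip(pipeline, baseline) if p != b]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs, so no evidence either way
    wins = sum(d > 0 for d in diffs)
    k = min(wins, n - wins)
    # Probability of a split at least this lopsided under a fair coin, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical ratings for six matched prompts (made-up numbers).
    pipeline_scores = [5, 5, 5, 5, 5, 5]
    direct_scores = [1, 1, 1, 1, 1, 1]
    print(sign_test_p_value(pipeline_scores, direct_scores))
```

A large p-value on real ratings would correspond to the "equal to or better" outcome described above, in which the pipeline shows no measurable advantage.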

Original abstract

Modern artificial intelligence excels at prediction but cannot explain. From large language models to AI-for-science systems, today's machines answer what by recombining patterns already present in the human literature, yet they cannot reason out why a phenomenon must arise from underlying principles even though explanation, not prediction, lies at the heart of scientific discovery. Here we ask whether the structure of scientific explanation can be operationalized to guide how a machine generates hypotheses. We introduce DN-Hypo-Pipeline, a hypothesis-generation framework that adopts a layered, explanation-theoretic scaffold: Hempel's deductive-nomological (DN) model supplies the output form and deductive validity of a hypothesis, Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws, and Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws that may be associated with it. Rather than searching the space of what has been written, the framework searches the space of what principles govern a phenomenon: given an explanandum, it abstracts the universals instantiated in the phenomenon's formation process, retrieves the laws relating those universals, and deductively reconstructs a new, testable explanation. Evaluated in data-science modeling and judged by both LLMs and human experts, hypotheses generated through this principled reasoning significantly outperform those from direct prompting. Crucially, we translated the two highest-scoring hypotheses into novel algorithms one that reduces the Transformer's theoretical complexity with only minimal performance loss, and another that achieves competitive accuracy with substantially fewer parameters.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the DN-Hypo-Pipeline, a hypothesis-generation framework that operationalizes Hempel's deductive-nomological (DN) model for output form and deductive validity, Salmon's causal-process account for searching governing laws, and Armstrong's universals-relations view for bridging phenomena to laws. Given an explanandum, the pipeline abstracts universals from formation processes, retrieves laws, and deductively reconstructs novel testable explanations. It claims that this yields hypotheses that significantly outperform direct prompting in data-science modeling (judged by LLMs and humans) and that the two highest-scoring hypotheses were translated into novel algorithms reducing Transformer complexity with minimal performance loss and achieving competitive accuracy with fewer parameters.

Significance. If the claims hold with demonstrated deductive validity and quantitative support, the work could advance AI-for-science by providing a structured, philosophy-grounded alternative to pattern-recombination approaches in LLMs. The explicit linkage of three philosophical accounts to a concrete workflow and the downstream translation of hypotheses into algorithms are potential strengths if rigorously evidenced.

major comments (2)
  1. [Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.
  2. [Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.
minor comments (1)
  1. [Abstract] The sentence 'we translated the two highest-scoring hypotheses into novel algorithms one that reduces...' lacks punctuation (e.g., a colon or period) for readability.
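
For reference, the DN requirement invoked in the second major comment is conventionally written as an entailment from laws and initial conditions to the explanandum. The symbols below ($L_i$, $C_j$, $E$) are standard textbook notation and are not taken from the manuscript.

```latex
\[
\underbrace{L_1, \dots, L_n}_{\text{general laws}},\;
\underbrace{C_1, \dots, C_k}_{\text{initial conditions}}
\;\models\; E
\]
```

On this reading, a generated hypothesis fails the DN standard whenever some situation satisfies every $L_i$ and $C_j$ but not $E$.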

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive referee report. We address each major comment below. Where the comments identify gaps in the abstract or need for clarification on the framework, we will revise accordingly while preserving the core contributions.

Point-by-point responses
  1. Referee: [Abstract] The central claims of significant outperformance over direct prompting and successful translation of hypotheses into novel algorithms are asserted without any reported implementation details, evaluation metrics, baselines, controls, statistical tests, or quantitative results, rendering the claims impossible to assess.

    Authors: We agree the abstract presents the claims at a high level. The full manuscript (Sections 4–6) reports the experimental protocol, including LLM and human evaluation metrics for hypothesis quality, direct-prompting baselines, controls for prompt length and temperature, and statistical tests (e.g., paired t-tests) showing significant differences. The algorithm translations include complexity analysis and accuracy comparisons on standard benchmarks. We will revise the abstract to include concise quantitative summaries and pointers to these sections so the claims can be assessed from the abstract alone.

    revision: yes

  2. Referee: [Framework definition (abstract)] The assertion that the pipeline produces deductively valid hypotheses per Hempel's DN model (explanandum as logical consequence of laws plus initial conditions) is unsupported by the described LLM workflow of natural-language 'abstraction of universals', 'retrieval of laws', and 'deductive reconstruction'; no formal logic engine, theorem prover, or entailment check is indicated, allowing for non-entailed steps or invented premises that violate the DN requirement.

    Authors: The observation is accurate: the pipeline implements the DN structure through LLM-guided natural-language steps rather than a formal theorem prover or entailment verifier. We will revise the abstract and framework sections to state explicitly that the workflow operationalizes the DN model heuristically, structuring prompts to encourage deductive reconstruction, while acknowledging that it does not guarantee formal logical validity. This change clarifies the distinction between philosophical inspiration and formal proof without altering the reported empirical results.

    revision: yes
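
To make concrete what an entailment check of the kind the referee mentions would involve, here is a toy propositional version that enumerates every truth assignment. It is an illustration only: neither the manuscript nor the rebuttal describes such a component, the `entails` helper and the atom names are hypothetical, and real hypotheses would need a far richer logic than propositional formulas.

```python
from itertools import product
from typing import Callable, Dict, Sequence

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]  # a formula is evaluated against one assignment


def entails(premises: Sequence[Formula], conclusion: Formula, atoms: Sequence[str]) -> bool:
    """Return True if every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premises) and not conclusion(world):
            return False  # counterexample: premises hold but the conclusion fails
    return True


# Toy DN-style argument (illustrative atoms, not drawn from the paper).
atoms = ["dense_attention", "quadratic_cost"]
law = lambda w: (not w["dense_attention"]) or w["quadratic_cost"]  # law: dense -> quadratic
condition = lambda w: w["dense_attention"]                          # initial condition
explanandum = lambda w: w["quadratic_cost"]

if __name__ == "__main__":
    print(entails([law, condition], explanandum, atoms))  # law plus condition
    print(entails([law], explanandum, atoms))             # law alone
```

The second call shows the failure mode the referee describes: dropping the initial condition leaves the explanandum unsupported, which a purely prompt-based reconstruction would not detect on its own.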

Circularity Check

0 steps flagged

No circularity: framework operationalizes external philosophical accounts into empirical workflow

Full rationale

The paper's derivation chain consists of defining a hypothesis-generation pipeline that adopts Hempel's DN model for deductive form, Salmon's account for search constraints, and Armstrong's universals for bridging processes to laws, then applies this scaffold via LLMs to abstract universals, retrieve laws, and reconstruct explanations from an explanandum. Evaluation compares generated hypotheses against direct prompting baselines, with downstream translation to algorithms presented as empirical outcomes. No equations, fitted parameters, or self-citations appear in the provided text; the central claims rest on the operationalization of externally cited philosophical sources rather than any reduction of outputs to inputs by construction. The workflow is self-contained against external benchmarks (LLM and human expert judgments) without load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 1 invented entity

The central claim rests on the operationalization of three externally cited philosophical models into an LLM workflow; no free parameters are mentioned in the abstract.

axioms (3)
  • Domain assumption: Hempel's deductive-nomological model supplies the output form and deductive validity of a hypothesis.
    Adopted as the scaffold for hypothesis output form.
  • Domain assumption: Salmon's causal-process account supplies an organizing constraint on where to search for the governing laws.
    Used to direct the search space.
  • Domain assumption: Armstrong's view of laws as relations between universals supplies the bridge from a phenomenon's constituent processes to the laws.
    Provides the mechanism linking processes to laws.
invented entities (1)
  • DN-Hypo-Pipeline (no independent evidence)
    Purpose: layered explanation-theoretic scaffold for LLM hypothesis generation.
    Newly introduced workflow that combines the three philosophical accounts.


