arxiv: 2605.04759 · v2 · submitted 2026-05-06 · 💻 cs.CL · cs.AI· cs.ET· cs.LG

Recognition: no theorem link

Gyan: An Explainable Neuro-Symbolic Language Model

Venkat Srinivasan , Vishaal Jatav , Anushka Chandrababu , Geetika Sharma

Authors on Pith no claims yet

Pith reviewed 2026-05-14 21:58 UTC · model grok-4.3

classification 💻 cs.CL cs.AIcs.ETcs.LG

keywords explainable AIneuro-symbolic language modelsrhetorical structure theorysemantic role theoryhallucination preventionmission critical AInon-transformer architectures

0 comments

The pith

Gyan uses rhetorical structure and semantic role theory to build an explainable language model that reaches SOTA performance without hallucinations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Gyan as a non-transformer language model that separates language processing from knowledge representation. It draws on rhetorical structure theory and semantic role theory to form a meaning representation that captures full compositional context and builds toward a human-like world model. This design is intended to remove the hallucinations, opacity, and high compute costs typical of transformer models. The reported results show state-of-the-art scores on three public datasets and better results on two private ones, pointing to greater reliability in high-stakes settings. If the architecture works as described, it would allow language models to be trusted where transparency and consistency matter most.

Core claim

Gyan is an explainable neuro-symbolic language model based on a novel non-transformer architecture. The model draws on rhetorical structure theory, semantic role theory, and knowledge-based computational linguistics. Its meaning representation structure captures complete compositional context and expands the context to a world model. This decouples the language model from knowledge acquisition and representation, yielding SOTA performance on three widely cited datasets and superior performance on two proprietary datasets while avoiding hallucinations and opacity.

What carries the argument

The novel non-transformer architecture that applies rhetorical structure theory and semantic role theory to construct a meaning representation which expands context into a world model.

If this is right

Language models can be made trustable and reliable enough for mission-critical tasks.
Decoupling language rules from knowledge allows easier maintenance and updates.
Full compositional context capture reduces the need for enormous pre-training compute.
Models become interpretable by construction rather than through post-hoc explanations.
AI systems can expand context to a world model without relying on scale alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could be extended to multimodal inputs by applying the same structural parsing to images or video descriptions.
If the world-model expansion holds, the model might handle ambiguous or incomplete inputs more gracefully than pure statistical systems.
Independent replication on open benchmarks would clarify whether the performance edge comes from the architecture or from dataset-specific tuning.

Load-bearing premise

The architecture based on rhetorical structure theory and semantic role theory actually captures complete compositional context and eliminates hallucinations by design.

What would settle it

A test set of factual questions known to produce hallucinations in current transformer models, run on Gyan with independent verification of every output for factual accuracy.

Figures

Figures reproduced from arXiv: 2605.04759 by Anushka Chandrababu, Geetika Sharma, Venkat Srinivasan, Vishaal Jatav.

**Figure 1.** Figure 1: High Level Architecture of Gyan [29]) is closer to models of human cognition where humans expand context both with episodic details and in an abstract fashion. Human understanding of language critically depends on understanding the complete composition and the inter-related semantic roles of different constituent parts in the full composition. In forming their understanding, humans also use their prior kno… view at source ↗

**Figure 2.** Figure 2: Knowledge Layers in Gyan from the internet are stored in data stores referred to as ’Knowledge Stores’ (KS), separate from the Gyan LLM. These are transparent stores of GMRs of the documents processed using Gyan. Pre-processed Knowledge Stores serve the same purpose in Gyan as pre-training in transformer LLMs but without the challenges associated with models based purely on word patterns. Asymptotically, t… view at source ↗

**Figure 3.** Figure 3: Gyan Meaning Representation Graph view at source ↗

**Figure 4.** Figure 4: Gyan 4.3 on MSMarco Passage Ranking view at source ↗

**Figure 6.** Figure 6: MMLU Leaderboard (May 12-2025) Using the KS for Medicine, we evaluated Gyan LM on all the MMLU variants. The detailed results are available from the authors on request. Gyan-4.4 performance is stable across all these data sets validating the robustness of the Gyan architecture to errors in data. Gyan-4.4 was relatively unaffected by purely representational edits like changes in the order of the options, in… view at source ↗

**Figure 7.** Figure 7: Ranking Results on the 20 Query Dataset [Google, MS Azure, Gyan] view at source ↗

**Figure 8.** Figure 8: Gyan Physical Architecture composition in its surface form and expand it to a world model to fully understand the composition in a human analogous context. It is closer to the Firth approach to distributional semantics rather than Harris. We demonstrated the relative efficacy of the architecture by comparing Gyan performance on 3 widely cited benchmark data sets. While there has been recent criticism of be… view at source ↗

read the original abstract

Transformer based pre-trained large language models have become ubiquitous. There is increasing evidence to suggest that even with large scale pre-training, these models do not capture complete compositional context and certainly not, the full human analogous context. Besides, by the very nature of the architecture, these models hallucinate, are difficult to maintain, are not easily interpretable and require enormous compute resources for training and inference. Here, we describe Gyan, an explainable language model based on a novel non-transformer architecture, without any of these limitations. Gyan achieves SOTA performance on 3 widely cited data sets and superior performance on two proprietary data sets. The novel architecture decouples the language model from knowledge acquisition and representation. The model draws on rhetorical structure theory, semantic role theory and knowledge-based computational linguistics. Gyan's meaning representation structure captures the complete compositional context and attempts to mimic humans by expanding the context to a 'world model'. AI model adoption critically depends on trust and transparency especially in mission critical use cases. Collectively, our results demonstrate that it is possible to create models which are trustable and reliable for mission critical tasks. We believe our work has tremendous potential for guiding the development of transparent and trusted architectures for language models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Gyan claims a non-transformer architecture using RST and semantic roles that fixes hallucinations and hits SOTA, but the text supplies no numbers, methods, or comparisons to support any of it.

read the letter

The paper introduces Gyan as a non-transformer model that draws on rhetorical structure theory and semantic role theory to decouple language modeling from knowledge representation. It builds a meaning structure meant to capture full compositional context and expand into a world model, with the goal of removing hallucinations, improving interpretability, and cutting compute while matching or beating current performance on standard tasks. That framing directly targets real weaknesses in transformer-based LLMs for high-stakes use, and the high-level idea of grounding models in explicit linguistic structures is a reasonable direction to explore. The paper does a clear job stating the problems with current approaches and why trust matters in mission-critical settings. The soft spots are more substantial. The abstract states SOTA results on three public datasets and superior results on two proprietary ones, plus zero hallucinations and lower resource use, yet it includes no metrics, baselines, tables, error bars, training details, or architecture diagrams. Without those elements it is impossible to check whether the claimed properties actually follow from the design or whether the numbers are real. The stress-test concern about unreported experiments holds up on the available text. This kind of proposal could interest researchers working on neuro-symbolic or explainable NLP who are already thinking about alternatives to pure scaling. A reader who needs concrete methods, reproducible results, or direct comparisons to prior work will not find enough to engage with. I would not bring it to a reading group and would not send it to peer review in this form, because there is no substantive technical content for referees to assess.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces Gyan, a novel non-transformer neuro-symbolic language model grounded in rhetorical structure theory and semantic role theory. It claims to decouple language modeling from knowledge acquisition, capture complete compositional context through an expanded 'world model', eliminate hallucinations, and deliver SOTA performance on three widely cited public datasets plus superior results on two proprietary datasets, while offering improved interpretability, maintainability, and efficiency compared to transformer-based LLMs.

Significance. If the performance and architectural claims are substantiated with rigorous experiments, the work could have substantial significance for the field by demonstrating a viable path toward trustworthy, explainable language models suitable for mission-critical applications. The neuro-symbolic decoupling and emphasis on human-like context modeling address core limitations of current LLMs, potentially influencing future directions in interpretable AI and reducing reliance on massive compute resources.

major comments (3)

[Abstract] Abstract: The assertions of SOTA performance on three public datasets and superior performance on two proprietary datasets are presented without any metrics, baselines, result tables, error bars, or evaluation protocols, which directly undermines the central empirical claims.
[Architecture] Architecture section: The novel non-transformer architecture is described conceptually via RST and semantic role theory but provides no equations, formal definitions, pseudocode, or implementation details showing how the meaning representation structure is built or how it achieves complete compositional context and a human-like world model.
[Results] Results/Experiments: No quantitative results, baseline comparisons, statistical tests, or hallucination evaluation protocols are supplied to support the superiority, reliability, or zero-hallucination claims, leaving the key assertions without verifiable grounding.

minor comments (2)

[Introduction] The manuscript would benefit from explicit citations to foundational RST and semantic role labeling literature to better situate the theoretical contributions.
[Abstract] Clarify the exact meaning of 'Gyan' and its connection to the model's design goals for improved readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important areas where the presentation of empirical support and formal details can be strengthened. We will undertake a major revision to address these points by adding the requested metrics, formalisms, and evaluation details. Our responses to each major comment are provided below.

read point-by-point responses

Referee: [Abstract] Abstract: The assertions of SOTA performance on three public datasets and superior performance on two proprietary datasets are presented without any metrics, baselines, result tables, error bars, or evaluation protocols, which directly undermines the central empirical claims.

Authors: We agree that the abstract would benefit from greater specificity. In the revised manuscript, we will expand the abstract to include key performance metrics (e.g., accuracy or F1 scores), named baselines, and a concise statement of the evaluation protocols used across the public and proprietary datasets. revision: yes
Referee: [Architecture] Architecture section: The novel non-transformer architecture is described conceptually via RST and semantic role theory but provides no equations, formal definitions, pseudocode, or implementation details showing how the meaning representation structure is built or how it achieves complete compositional context and a human-like world model.

Authors: The current version emphasizes the conceptual grounding in rhetorical structure theory and semantic role theory. We acknowledge that formal rigor is needed. In revision, we will add mathematical definitions of the meaning representation, equations describing context composition and world-model expansion, pseudocode for the core inference steps, and implementation-level details on how the neuro-symbolic decoupling is realized. revision: yes
Referee: [Results] Results/Experiments: No quantitative results, baseline comparisons, statistical tests, or hallucination evaluation protocols are supplied to support the superiority, reliability, or zero-hallucination claims, leaving the key assertions without verifiable grounding.

Authors: We accept that the results section requires substantial expansion to meet standards of empirical rigor. The revised manuscript will include full result tables with metrics and error bars, direct comparisons against published baselines, statistical significance tests, and a dedicated subsection describing the hallucination evaluation protocol (including dataset construction and scoring criteria) used to support the reliability claims. revision: yes

Circularity Check

0 steps flagged

No derivation chain present; circularity analysis inapplicable

full rationale

The paper supplies only conceptual assertions about a non-transformer architecture drawing on rhetorical structure theory and semantic role theory, with no equations, pseudocode, formal derivations, or internal logic steps shown anywhere in the text. Claims of SOTA performance, complete compositional context capture, hallucination elimination, and world-model expansion are stated without any supporting chain, baselines, or metrics. Because no derivation exists to reduce to its inputs, none of the enumerated circularity patterns apply, and the score is 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities are specified or can be extracted.

pith-pipeline@v0.9.0 · 5534 in / 1163 out tokens · 38680 ms · 2026-05-14T21:58:40.987369+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · 10 internal anchors

[1]

BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. InProceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis,...

work page doi:10.18653/v1/n19-1423 2019
[2]

Deep contextualized word representations

Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. InProceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 2227–2237, New Orleans, Lo...

work page doi:10.18653/v1/n18-1202 2018
[3]

Improving language understand- ing by generative pre-training, 2018

Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. Improving language understand- ing by generative pre-training, 2018. URL https://www.cs.ubc.ca/~amuham01/LING530/papers/ radford2018improving.pdf

work page 2018
[4]

Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, ...

work page internal anchor Pith review Pith/arXiv arXiv 2022
[5]

arXiv preprint arXiv:2107.02137 , year=

Yu Sun, Shuohuan Wang, Shikun Feng, Siyu Ding, Chao Pang, Junyuan Shang, Jiaxiang Liu, Xuyi Chen, Yanbin Zhao, Yuxiang Lu, Weixin Liu, Zhihua Wu, Weibao Gong, Jianzhong Liang, Zhizhou Shang, Peng Sun, Wei Liu, Xuan Ouyang, Dianhai Yu, Hao Tian, Hua Wu, and Haifeng Wang. Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and ...

work page arXiv 2021
[6]

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, and Guillaume Lample. Llama: Open and efficient foundation language models, 2023. URL https: //arxiv.org/abs/2302.13971

work page internal anchor Pith review Pith/arXiv arXiv 2023
[7]

Llama 2: Open Foundation and Fine-Tuned Chat Models

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Harts...

work page internal anchor Pith review Pith/arXiv arXiv 2023
[8]

Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Anto...

work page internal anchor Pith review Pith/arXiv arXiv 2024
[9]

Qwen Technical Report

Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng X...

work page internal anchor Pith review Pith/arXiv arXiv 2023
[10]

Manning, Christopher Potts, Christopher Ré, and Percy Liang

Siddharth* Karamcheti, Laurel* Orr, Jason Bolton, Tianyi Zhang, Karan Goel, Avanika Narayan, Rishi Bom- masani, Deepak Narayanan, Tatsunori Hashimoto, Dan Jurafsky, Christopher D. Manning, Christopher Potts, Christopher Ré, and Percy Liang. Mistral - a journey towards reproducible language model training, 2021. URL https://github.com/stanford-crfm/mistral

work page 2021
[11]

ISBN 978-1-4503-8309-7

Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. On the dangers of stochastic parrots: Can language models be too big? InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, page 610–623, New York, NY , USA, 2021. Association for Computing Machinery. ISBN 9781450383097. doi: 10....

work page doi:10.1145/3442188.3445922 2021
[12]

Nonsense on stilts, 2022

Gary Marcus. Nonsense on stilts, 2022. URL https://garymarcus.substack.com/p/ nonsense-on-stilts

work page 2022
[13]

Llms: Dishonest, unpredictable and potentially dangerous, 2025

Gary Marcus. Llms: Dishonest, unpredictable and potentially dangerous, 2025. URL https://garymarcus. substack.com/p/llms-dishonest-unpredictable-and

work page 2025
[14]

Noah A. Smith. Contextual word representations: A contextual introduction, 2020. URL https://arxiv.org/ abs/1902.06006

work page arXiv 2020
[15]

Testing ai on language comprehension tasks reveals insensitivity to underlying meaning.Scientific Reports, 14, 11 2024

Vittoria Dentella, Fritz Günther, Elliot Murphy, Gary Marcus, and Evelina Leivada. Testing ai on language comprehension tasks reveals insensitivity to underlying meaning.Scientific Reports, 14, 11 2024. doi: 10.1038/ s41598-024-79531-8. 11 Gyan: An Explainable Neuro-Symbolic Language ModelA PREPRINT

work page 2024
[16]

From tokens to thoughts: How llms and humans trade compression for meaning, 2025

Chen Shani, Liron Soffer, Dan Jurafsky, Yann LeCun, and Ravid Shwartz-Ziv. From tokens to thoughts: How llms and humans trade compression for meaning, 2025. URLhttps://arxiv.org/abs/2505.17117

work page arXiv 2025
[17]

OpenAI, Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksander Madry, Alex Beutel, Alex Carney, Alex Iftimie, Alex Karpenko, Alex Tachard Passos, Alexander Neitz, Alexander Prokofiev, Alexander Wei, Allison Tam, Ally Bennett, Ananya Kumar, Andre Saraiva, Andrea Vallone, Andrew Duberstein, Andrew Kondrich,...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[18]

Introducing openai o1, 2024

OpenAI. Introducing openai o1, 2024. URLhttps://openai.com/o1/

work page 2024
[19]

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Deli Chen, Dongjie Ji, ...

work page doi:10.1038/s41586-025-09422-z 2025
[20]

Claude 3.7 sonnet’s extended thinking, 2025

Anthropic. Claude 3.7 sonnet’s extended thinking, 2025. URL https://www.anthropic.com/news/ visible-extended-thinking

work page 2025
[21]

Gemini thinking, 2025

Google AI. Gemini thinking, 2025. URLhttps://ai.google.dev/gemini-api/docs/thinking

work page 2025
[22]

Dissociating language and thought in large language models: a cognitive perspective, 01 2023

Kyle Mahowald, Anna Ivanova, Idan Blank, Nancy Kanwisher, Joshua Tenenbaum, and Evelina Fedorenko. Dissociating language and thought in large language models: a cognitive perspective, 01 2023

work page 2023
[23]

Hello, multimodal hallucinations, 2023

Gary Marcus and Ernest Davis. Hello, multimodal hallucinations, 2023. URLhttps://garymarcus.substack. com/p/hello-multimodal-hallucinations

work page 2023
[24]

What company do words keep? revisiting the distributional semantics of J.R

Mikael Brunila and Jack LaViolette. What company do words keep? revisiting the distributional semantics of J.R. firth & zellig Harris. InProceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4403–4417, Seattle, United States, July 2022. Association for Computat...

work page doi:10.18653/v1/2022.naacl-main.327 2022
[25]

Zellig S. Harris. Distributional structure.WORD, 10(2-3):146–162, 1954. doi: 10.1080/00437956.1954.11659520. URLhttps://doi.org/10.1080/00437956.1954.11659520

work page doi:10.1080/00437956.1954.11659520 1954
[26]

Harris.Language and Information

Zellig S. Harris.Language and Information. Columbia University Press, New York, 1988. ISBN 9780231066624

work page 1988
[27]

J. R. Firth. A synopsis of linguistic theory, 1930-1955.Studies in Linguistic Analysis, 1957

work page 1930
[28]

J. R. Firth. Modes of meaning.Papers in Linguistics, 1934-1951, pages 190–215, 1957

work page 1934
[29]

J. R. Firth. Modes of meaning.Studies in Linguistic Analysis, 1957

work page 1957
[30]

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

Payal Bajaj, Daniel Campos, Nick Craswell, Li Deng, Jianfeng Gao, Xiaodong Liu, Rangan Majumder, Andrew McNamara, Bhaskar Mitra, Tri Nguyen, Mir Rosenberg, Xia Song, Alina Stoica, Saurabh Tiwary, and Tong Wang. Ms marco: A human generated machine reading comprehension dataset, 2018. URL https://arxiv.org/abs/ 1611.09268

work page internal anchor Pith review Pith/arXiv arXiv 2018
[31]

Cohen, and Xinghua Lu

Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, and Xinghua Lu. Pubmedqa: A dataset for biomedical research question answering, 2019. URLhttps://arxiv.org/abs/1909.06146

work page arXiv 2019
[32]

Measuring Massive Multitask Language Understanding

Dan Hendrycks, Collin Burns, Steven Basart, Andy Zou, Mantas Mazeika, Dawn Song, and Jacob Steinhardt. Measuring massive multitask language understanding, 2021. URLhttps://arxiv.org/abs/2009.03300

work page internal anchor Pith review Pith/arXiv arXiv 2021
[33]

Overcoming catastrophic forgetting in neural networks.Proceedings of the National Academy of Sciences, 114(13): 3521–3526, 2017

James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, and Raia Hadsell. Overcoming catastrophic forgetting in neural networks.Proceedings of the National Academy of Sciences, 114(13):3521–...

work page doi:10.1073/pnas.1611835114 2017
[34]

Culture and cognition.Annual Review of Sociology, 23(V olume 23, 1997):263–287, 1997

Paul DiMaggio. Culture and cognition.Annual Review of Sociology, 23(V olume 23, 1997):263–287, 1997. ISSN 1545-2115. doi: https://doi.org/10.1146/annurev.soc.23.1.263. URL https://www.annualreviews. org/content/journals/10.1146/annurev.soc.23.1.263

work page doi:10.1146/annurev.soc.23.1.263 1997
[35]

Observing the transformation of experience into memory.Trends in cognitive sciences, 6:93–102, 03 2002

Ken Paller and Anthony Wagner. Observing the transformation of experience into memory.Trends in cognitive sciences, 6:93–102, 03 2002. doi: 10.1016/S1364-6613(00)01845-3

work page doi:10.1016/s1364-6613(00)01845-3 2002
[36]

The consolidation and transformation of memory.Neuron, 88:20–32, 10

Yadin Dudai, Avi Karni, and Jan Born. The consolidation and transformation of memory.Neuron, 88:20–32, 10

work page
[37]

doi: 10.1016/j.neuron.2015.09.004

work page doi:10.1016/j.neuron.2015.09.004 2015
[38]

Blackwell, Malden, MA, 2005

Paul Portner.What is Meaning?: Fundamentals of Formal Semantics. Blackwell, Malden, MA, 2005

work page 2005
[39]

Adaptive Joint Learning of Compositional and Non-Compositional Phrase Embeddings

Kazuma Hashimoto and Yoshimasa Tsuruoka. Adaptive joint learning of compositional and non-compositional phrase embeddings, 2016. URLhttps://arxiv.org/abs/1603.06067

work page internal anchor Pith review Pith/arXiv arXiv 2016
[40]

Embeddings for word sense disambiguation: An evaluation study

Ignacio Iacobacci, Mohammad Taher Pilehvar, and Roberto Navigli. Embeddings for word sense disambiguation: An evaluation study. InProceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 897–907, Berlin, Germany, August 2016. Association for Computational Linguistics. doi: 10.18653/v1/P16-1085....

work page doi:10.18653/v1/p16-1085 2016
[41]

Embedding words and senses together via joint knowledge-enhanced training

Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, and Roberto Navigli. Embedding words and senses together via joint knowledge-enhanced training. InProceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 100–111, Vancouver, Canada, August 2017. Association for Computational Linguistics. doi: 10.18653...

work page doi:10.18653/v1/k17-1012 2017
[42]

Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A

Matthew E. Peters, Mark Neumann, Robert Logan, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A. Smith. Knowledge enhanced contextual word representations. InProceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 43–54, Hong K...

work page doi:10.18653/v1/d19-1005 2019
[43]

Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model, 2019

Wenhan Xiong, Jingfei Du, William Yang Wang, and Veselin Stoyanov. Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model, 2019. URLhttps://arxiv.org/abs/1912.09637

work page arXiv 2019
[44]

KEPLER: A unified model for knowledge embedding and pre-trained language representation.Transactions of the Association for Computational Linguistics, 9:176–194, 2021

Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, and Jian Tang. KEPLER: A unified model for knowledge embedding and pre-trained language representation.Transactions of the Association for Computational Linguistics, 9:176–194, 2021. doi: 10.1162/tacl_a_00360. URL https://aclanthology. org/2021.tacl-1.11/

work page doi:10.1162/tacl_a_00360 2021
[45]

Improving lexical embeddings with semantic knowledge

Mo Yu and Mark Dredze. Improving lexical embeddings with semantic knowledge. InProceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 545–550, Baltimore, Maryland, June 2014. Association for Computational Linguistics. doi: 10.3115/v1/P14-2089. URL https://aclanthology.org/P14-2089/

work page doi:10.3115/v1/p14-2089 2014
[46]

The role of syntax in vector space models of compositional semantics

Karl Moritz Hermann and Phil Blunsom. The role of syntax in vector space models of compositional semantics. InProceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 894–904, Sofia, Bulgaria, August 2013. Association for Computational Linguistics. URL https: //aclanthology.org/P13-1088/

work page 2013
[47]

Rhetorical structure theory: A framework for the analysis of texts.Papers in Pragmatics; Vol 1, No 1 (1987), 1, 01 1987

William Mann and Sandra Thompson. Rhetorical structure theory: A framework for the analysis of texts.Papers in Pragmatics; Vol 1, No 1 (1987), 1, 01 1987. doi: 10.1075/iprapip.1.1.03tho

work page doi:10.1075/iprapip.1.1.03tho 1987
[48]

Maite Taboada and William C. Mann. Rhetorical structure theory: looking back and moving ahead.Discourse Studies, 8:423 – 459, 2006. URLhttps://api.semanticscholar.org/CorpusID:2386531

work page 2006
[49]

Text summarization using abstract meaning representation,

Shibhansh Dohare, Harish Karnick, and Vivek Gupta. Text summarization using abstract meaning representation,

work page
[50]

URLhttps://arxiv.org/abs/1706.01678

work page internal anchor Pith review Pith/arXiv arXiv
[51]

Cohen, Xiaohui Yan, and Yi Chang

Fuad Issa, Marco Damonte, Shay B. Cohen, Xiaohui Yan, and Yi Chang. Abstract Meaning Representation for paraphrase detection. InProceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 442–452, New Orleans, Louisiana, June 2018. Association...

work page doi:10.18653/v1/n18-1041 2018
[52]

Abstract Meaning Representation for sembanking

Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. Abstract Meaning Representation for sembanking. InProceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 178–186, Sofia, Bulgaria, August 2013. Association fo...

work page 2013
[53]

Towards a definition of knowledge graphs, 09 2016

Lisa Ehrlinger and Wolfram Wöß. Towards a definition of knowledge graphs, 09 2016

work page 2016
[54]

Exploiting linked data and knowledge graphs in large organisations, 01 2017

Jeff Pan, Guido Vetere, Jose Manuel Gomez-Perez, and Honghan Wu. Exploiting linked data and knowledge graphs in large organisations, 01 2017

work page 2017
[55]

George A. Miller. Wordnet: a lexical database for english.Commun. ACM, 38(11):39–41, November 1995. ISSN 0001-0782. doi: 10.1145/219717.219748. URLhttps://doi.org/10.1145/219717.219748

work page doi:10.1145/219717.219748 1995
[56]

Sara} Mahdavi, {Joelle K.} Barral, {Dale R.} Webster, {Greg S.} Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, and Vivek Natarajan

Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Mohamed Amin, Le Hou, Kevin Clark, {Stephen R.} Pfohl, Heather Cole-Lewis, Darlene Neal, {Qazi Mamunur} Rashid, Mike Schaekermann, Amy Wang, Dev Dash, {Jonathan H.} Chen, {Nigam H.} Shah, Sami Lachgar, {Philip Andrew} Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise {Agüe...

work page doi:10.1038/s41591-024-03423-7 2025
[57]

Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F

Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, and Gabriel F. Manso. Deep learning’s diminishing returns: The cost of improvement is becoming unsustainable.IEEE Spectrum, 58(10):50–55, 2021. doi: 10.1109/MSPEC.2021.9563954. 14 Gyan: An Explainable Neuro-Symbolic Language ModelA PREPRINT A A Formal Description of the Gyan Language Model Gyan is a Hier...

work page doi:10.1109/mspec.2021.9563954 2021
[58]

Starts with an input concept and a corpus

work page
[59]

processes the documents in the corpus using the Meaning Encoder,

work page
[60]

finds all the occurrences of the input concept,

work page
[61]

classify each of the occurrences of these concepts into whether they are the dominant topics from the corresponding Discourse Units; a process we term as relevance determination

work page
[62]

extracts the relation along with the context in which the relations always apply

work page
[63]

determination of whether the context is generic enough to occur in other cases

work page
[64]

This method results in a Knowledge Net that is grounded on the provided corpus

saves the relations in the knowledge net. This method results in a Knowledge Net that is grounded on the provided corpus. The corpus can be licensed content or freely available free-for-commercial-use content or content accumulated by an Enterprise over several years of operations. This way, an Enterprise Knowledge Net could be constructed in a manner suc...

work page
[65]

Build a knowledge net for a vocabulary of terms

work page
[66]

Discover vocabulary from a corpus of documents and iteratively build a knowledge net

work page
[67]

Build a knowledge net from a series of provided dictionaries

work page
[68]

Incrementally, add discover and add a knowledge for a new concept to an existing knowledge net

work page
[69]

Add knowledge from a corpus of documents to an existing knowledge net

work page
[70]

Add a knowledge net to an existing knowledge net Some of the common utilities within the Discoverer , following components can make the task of creating knowledge nets easier: 1.DiscoverV ocabularyutility to identify important concepts from a corpus of documents. 2.AggregateCorpus utility helps aggregate a corpus for a given concept from the web, by firin...

work page
[71]

V ocabulary was seeded from WordNet, KBPedia, ConceptNet, Cambridge English Dictionary and Oxford English Dictionary

work page
[72]

Concepts in the V ocabulary from various sources were disambiguated and linked to ensure that the senses are maintained

work page
[73]

Relations from KBPedia and ConceptNet were imported into the Base Knowledge Net

work page
[74]

hypernymy)

Definitions were discovered for the concepts and were processed using the Meaning Encoder to add additional fundamental knowledge (e.g. hypernymy)

work page
[75]

A generic HTML Discourse Model was used to parse and pre-process all the contents into the documents

Wikipedia and Encyclopedia were sourced and converted into a corpus of HTML documents. A generic HTML Discourse Model was used to parse and pre-process all the contents into the documents

work page
[76]

D Knowledge Stores Section 1.3.2 and 1.3.3 defines the fundamental difference between a Knowledge Net and Knowledge Store

Meaning Encoder was run on all the documents in the corpus and for every term in the vocabulary, relations were extracted from the relevant discourse units and saved in the Base Knowledge Net. D Knowledge Stores Section 1.3.2 and 1.3.3 defines the fundamental difference between a Knowledge Net and Knowledge Store. Knowledge Stores are repository of Meanin...

work page
[77]

Seeded the LS-KS with the Base Knowledge Net

work page
[78]

Curated a vocabulary of important terms from the various sub-fields of life sciences

work page
[79]

Curated a list of encyclopedias and credible reference information for the various fields of Life Sciences. E.g. PubChem, Cleveland Clinic, etc

work page
[80]

Curated a list of all the research papers (full or abstract) for the various fields of Life Sciences. E.g. PubMed, Science Direct, etc

work page

Showing first 80 references.