Language Models as Knowledge Bases?
Pith reviewed 2026-05-16 10:48 UTC · model grok-4.3
The pith
Pretrained language models already contain relational knowledge that can be accessed through fill-in-the-blank queries without any fine-tuning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge. BERT also does remarkably well on open-domain question answering against a supervised baseline. Certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The ability to recall facts through cloze statements demonstrates the potential of these models as unsupervised open-domain QA systems.
What carries the argument
Cloze-statement probing, in which known facts are turned into fill-in-the-blank sentences to measure how often the model recalls the correct object without any task-specific training.
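To make the mechanics concrete, here is a minimal sketch of such a probe using the Hugging Face transformers fill-mask pipeline (an assumption for illustration; the paper's own tooling lives in the linked LAMA repository). The Dante example comes from the paper itself.

```python
# Minimal cloze-probing sketch (illustrative; not the paper's LAMA code).
# Requires the Hugging Face `transformers` package.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

# A known fact rewritten as a fill-in-the-blank query; no fine-tuning,
# we simply read off the pretrained model's top tokens for the blank.
for pred in fill("Dante was born in [MASK].")[:3]:
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```

If a fact is stored, the gold object should appear at or near the top of this ranked list; the paper's precision@1 counts only the single top token.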
If this is right
- Language models could serve directly as open-domain QA systems with no additional supervision or schema design.
- Querying an open class of relations becomes possible without engineering a fixed knowledge-base structure.
- Extending the knowledge store requires only additional text data rather than manual curation.
- Some factual domains are captured more reliably than others during ordinary pretraining, guiding where extra effort may be needed.
Where Pith is reading between the lines
- If cloze performance tracks genuine knowledge, then larger models or more diverse text should improve recall on the weaker categories.
- The same probing method could be used to audit what biases or gaps exist in the knowledge absorbed from public web text.
- Testing whether performance holds when questions are asked in conversational form rather than fixed cloze templates would clarify the practical reach of this capability.
Load-bearing premise
That success on cloze statements directly reflects stored factual knowledge rather than surface patterns or repeated co-occurrences in the training text.
What would settle it
A controlled experiment in which the model succeeds on the original cloze statements but fails when the same facts are rephrased into new sentences that preserve meaning but avoid exact training-data wording.
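One way to run that control, sketched under the assumption of a handful of hand-written paraphrases and the same fill-mask setup as above (the probe pairs here are hypothetical, chosen only for illustration):

```python
# Hypothetical paraphrase control (sketch): each fact is queried through
# the original template and through a meaning-preserving rewording.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

probes = [
    # (original wording, paraphrase, gold answer): illustrative only
    ("Dante was born in [MASK].",
     "The birthplace of the poet Dante is [MASK].", "Florence"),
    ("The capital of France is [MASK].",
     "France's seat of government is the city of [MASK].", "Paris"),
]

def top1(sentence):
    return fill(sentence)[0]["token_str"]

orig = sum(top1(o) == gold for o, _, gold in probes)
para = sum(top1(p) == gold for _, p, gold in probes)
# A large gap between the two counts would point to surface-pattern
# matching rather than stored factual knowledge.
print(f"original: {orig}/{len(probes)}, paraphrased: {para}/{len(probes)}")
```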
Original abstract
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that pretrained language models such as BERT encode relational knowledge extractable via cloze prompts without fine-tuning. On the T-REx benchmark it reports competitive precision@1 against traditional NLP pipelines that receive some oracle knowledge; it also shows competitive results on open-domain QA against a supervised baseline and notes that some relation types are learned more readily than others during standard pretraining. Reproducible code is released.
Significance. If the results hold, the work indicates that large LMs can function as flexible, unsupervised knowledge bases, removing the need for schema engineering and human supervision. This has implications for open-domain QA and knowledge extraction pipelines. The explicit release of reproduction code is a clear strength that supports verification and extension.
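For concreteness, the precision@1 metric behind these comparisons is plain top-1 accuracy over the probed facts; a minimal sketch:

```python
# Precision@1 over cloze probes: the fraction of facts for which the
# model's single highest-ranked token equals the gold object.
def precision_at_1(predicted, gold):
    assert len(predicted) == len(gold)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Worked example with hypothetical predictions:
# precision_at_1(["Florence", "Paris", "Rome"],
#                ["Florence", "Paris", "Milan"])  ->  2/3
```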
major comments (2)
- [T-REx evaluation] T-REx evaluation section: the central claim that cloze-statement precision@1 measures internalized relational knowledge (rather than memorized subject-object co-occurrences from Wikipedia pretraining) is not supported by any ablation that severs surface statistics while preserving the underlying facts (e.g., entity swapping or context randomization). The reported prompt sensitivity and per-relation variation do not resolve this distinction and directly affect the competitiveness interpretation.
- [Baseline comparison] Baseline comparison (around the main results table): the exact nature of 'oracle knowledge' granted to the traditional NLP methods is not defined with sufficient precision to guarantee a fair comparison; without this, the claim that BERT is competitive cannot be fully evaluated.
minor comments (2)
- [Abstract] Abstract: the claim of evaluating 'a wide range of state-of-the-art pretrained language models' is not matched by the depth of results, which focus primarily on BERT; a clarifying sentence would improve accuracy.
- [Figures/Tables] Figure and table captions: several lack explicit statements of what the y-axis or metric represents; this reduces immediate readability.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the scope and limitations of our evaluation. We address each major point below, providing additional context from the manuscript and indicating where revisions will be made to improve precision.
Point-by-point responses
- Referee: [T-REx evaluation] T-REx evaluation section: the central claim that cloze-statement precision@1 measures internalized relational knowledge (rather than memorized subject-object co-occurrences from Wikipedia pretraining) is not supported by any ablation that severs surface statistics while preserving the underlying facts (e.g., entity swapping or context randomization). The reported prompt sensitivity and per-relation variation do not resolve this distinction and directly affect the competitiveness interpretation.
  Authors: We agree that an explicit ablation such as entity swapping would provide stronger evidence for distinguishing relational knowledge from surface co-occurrence statistics. Our current analysis relies on prompt sensitivity (different templates for the same fact yielding different P@1 scores) and large per-relation variance to argue against pure memorization of subject-object pairs. While these observations are consistent with the model learning relational patterns rather than rote co-occurrences, they do not fully rule out the confound raised. We will add an explicit limitations paragraph acknowledging this gap and will include a small-scale entity-swapping experiment in the revised manuscript to directly test the claim. (A sketch of such a probe appears after these responses.)
  Revision: partial
- Referee: [Baseline comparison] Baseline comparison (around the main results table): the exact nature of 'oracle knowledge' granted to the traditional NLP methods is not defined with sufficient precision to guarantee a fair comparison; without this, the claim that BERT is competitive cannot be fully evaluated.
  Authors: We appreciate the request for greater precision. In the manuscript, the traditional pipelines receive gold subject and object entity spans plus the relation type (i.e., they are given the entities and asked only to predict the relation label), which constitutes the 'oracle' information. We will revise the baseline description section and the caption of the main results table to state this explicitly, including the exact inputs provided to each system (e.g., whether entity linking or coreference resolution is bypassed). This clarification will allow readers to assess the comparison directly.
  Revision: yes
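A sketch of what the entity-swapping experiment promised in the first response could look like, assuming a pair of same-relation facts and the fill-mask pipeline used earlier; the protocol is illustrative, not the authors' actual design:

```python
# Illustrative entity-swapping probe (a sketch, not the authors' actual
# protocol). If predictions track the subject (Dante -> Florence,
# Kafka -> Prague), the subject-relation pairing is doing the work;
# if different subjects yield the same object, the template's surface
# statistics dominate.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")
template = "{} was born in [MASK]."

for subject, gold in [("Dante", "Florence"), ("Kafka", "Prague")]:
    top = fill(template.format(subject))[0]["token_str"]
    print(f"{subject}: predicted {top!r}, gold {gold!r}")

# Control with a generic subject that pretraining data cannot pair with
# any one city: high confidence in a specific city suggests template bias.
control = fill(template.format("This person"))[0]
print(f"control: {control['token_str']!r} (score {control['score']:.3f})")
```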
Circularity Check
No circularity: empirical probing of held-out facts against external baselines
Full rationale
The paper reports direct accuracy measurements of pretrained LMs on cloze templates drawn from the T-REx dataset (held-out facts) and compares them to independent supervised NLP baselines that have oracle access. No equations, fitted parameters, or derivations are presented; the central claim is an empirical observation rather than a reduction to self-defined quantities or self-citation chains. The evaluation protocol uses external benchmarks and does not rename or smuggle in prior results as new predictions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Cloze-statement accuracy on held-out facts measures factual knowledge stored during pretraining.
Forward citations
Cited by 18 Pith papers
- REALM: Retrieval-Augmented Language Model Pre-Training
  REALM augments language-model pre-training with an unsupervised retriever over Wikipedia documents and reports 4-16% absolute gains on open-domain QA benchmarks over prior implicit and explicit knowledge methods.
- BOOKMARKS: Efficient Active Storyline Memory for Role-playing
  BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.
- Privacy Without Losing Place: A Paradigm for Private Retrieval in Spatial RAGs
  PAS encodes locations via relative anchors and bins to deliver roughly 370-400m adversarial error in spatial RAG while retaining over half the baseline retrieval performance and keeping generation quality robust.
- RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration
  RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
- Graph Topology Information Enhanced Heterogeneous Graph Representation Learning
  ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.
- Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
  Socratic Models compose zero-shot multimodal reasoning by prompting pretrained language and vision models to exchange information and enable new capabilities without finetuning.
- Towards Understanding Continual Factual Knowledge Acquisition of Language Models: From Theory to Algorithm
  Theoretical analysis of continual factual knowledge acquisition shows data replay stabilizes pretrained knowledge by shifting convergence dynamics while regularization only slows forgetting, leading to the STOC method...
- TLoRA: Task-aware Low Rank Adaptation of Large Language Models
  TLoRA jointly optimizes LoRA initialization via task-data SVD and sensitivity-driven rank allocation, delivering stronger results than standard LoRA across NLU, reasoning, math, code, and chat tasks while using fewer ...
- RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
  RAPTOR introduces a tree-organized retrieval method using recursive abstractive summaries, achieving a 20% absolute accuracy improvement on the QuALITY benchmark when paired with GPT-4.
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Self-RAG trains LLMs to adaptively retrieve passages on demand and self-critique using reflection tokens, outperforming ChatGPT and retrieval-augmented Llama2 on QA, reasoning, and fact verification.
- Inner Monologue: Embodied Reasoning through Planning with Language Models
  LLMs form an inner monologue from closed-loop language feedback to improve high-level instruction completion in simulated and real robotic rearrangement and kitchen manipulation tasks.
- MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
  MRKL is a modular neuro-symbolic architecture that integrates LLMs with external knowledge and discrete reasoning to overcome limitations of pure neural language models.
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.
- How Much Knowledge Can You Pack Into the Parameters of a Language Model?
  Fine-tuned language models store knowledge in parameters to answer questions competitively with retrieval-based open-domain QA systems.
- Structural Ranking of the Cognitive Plausibility of Computational Models of Analogy and Metaphors with the Minimal Cognitive Grid
  A formalized Minimal Cognitive Grid ranks computational models of analogy and metaphor by alignment with cognitive theories using Functional/Structural Ratio, Generality, and Performance Match dimensions.
- Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations
  CRVA-TGRAG combines parent-document segmentation, ensemble retrieval, and teacher-guided fine-tuning to mitigate knowledge conflicts and improve accuracy in LLM-based CVE vulnerability analysis.
- ARIA: Adaptive Retrieval Intelligence Assistant -- A Multimodal RAG Framework for Domain-Specific Engineering Education
  ARIA is a multimodal RAG framework that filters domain-specific questions with 97.5% accuracy and outperforms ChatGPT-5 on pedagogical quality for a university civil engineering course.
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.