Language Models as Knowledge Bases?
Pith reviewed 2026-05-16 10:48 UTC · model grok-4.3
The pith
Pretrained language models already contain relational knowledge that can be accessed through fill-in-the-blank queries without any fine-tuning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge. BERT also does remarkably well on open-domain question answering against a supervised baseline. Certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The ability to recall facts through cloze statements demonstrates the potential of these models as unsupervised open-domain QA systems.
What carries the argument
Cloze-statement probing, in which known facts are turned into fill-in-the-blank sentences to measure how often the model recalls the correct object without any task-specific training.
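To make the mechanics concrete, here is a minimal sketch of such a probe using the Hugging Face transformers fill-mask pipeline (an assumption for illustration; the paper's own tooling lives in the linked LAMA repository). The Dante example comes from the paper itself.

```python
# Minimal cloze-probing sketch (illustrative; not the paper's LAMA code).
# Requires the Hugging Face `transformers` package.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

# A known fact rewritten as a fill-in-the-blank query; no fine-tuning,
# we simply read off the pretrained model's top tokens for the blank.
for pred in fill("Dante was born in [MASK].")[:3]:
    print(f"{pred['token_str']:>10}  {pred['score']:.3f}")
```

If a fact is stored, the gold object should appear at or near the top of this ranked list; the paper's precision@1 counts only the single top token.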
If this is right
- Language models could serve directly as open-domain QA systems with no additional supervision or schema design.
- Querying an open class of relations becomes possible without engineering a fixed knowledge-base structure.
- Extending the knowledge store requires only additional text data rather than manual curation.
- Some factual domains are captured more reliably than others during ordinary pretraining, guiding where extra effort may be needed.
Where Pith is reading between the lines
- If cloze performance tracks genuine knowledge, then larger models or more diverse text should improve recall on the weaker categories.
- The same probing method could be used to audit what biases or gaps exist in the knowledge absorbed from public web text.
- Testing whether performance holds when questions are asked in conversational form rather than fixed cloze templates would clarify the practical reach of this capability.
Load-bearing premise
That success on cloze statements directly reflects stored factual knowledge rather than surface patterns or repeated co-occurrences in the training text.
What would settle it
A controlled experiment in which the model succeeds on the original cloze statements but fails when the same facts are rephrased into new sentences that preserve meaning but avoid exact training-data wording.
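One way to run that control, sketched under the assumption of a handful of hand-written paraphrases and the same fill-mask setup as above (the probe pairs here are hypothetical, chosen only for illustration):

```python
# Hypothetical paraphrase control (sketch): each fact is queried through
# the original template and through a meaning-preserving rewording.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")

probes = [
    # (original wording, paraphrase, gold answer): illustrative only
    ("Dante was born in [MASK].",
     "The birthplace of the poet Dante is [MASK].", "Florence"),
    ("The capital of France is [MASK].",
     "France's seat of government is the city of [MASK].", "Paris"),
]

def top1(sentence):
    return fill(sentence)[0]["token_str"]

orig = sum(top1(o) == gold for o, _, gold in probes)
para = sum(top1(p) == gold for _, p, gold in probes)
# A large gap between the two counts would point to surface-pattern
# matching rather than stored factual knowledge.
print(f"original: {orig}/{len(probes)}, paraphrased: {para}/{len(probes)}")
```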
Original abstract
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and may be able to answer queries structured as "fill-in-the-blank" cloze statements. Language models have many advantages over structured knowledge bases: they require no schema engineering, allow practitioners to query about an open class of relations, are easy to extend to more data, and require no human supervision to train. We present an in-depth analysis of the relational knowledge already present (without fine-tuning) in a wide range of state-of-the-art pretrained language models. We find that (i) without fine-tuning, BERT contains relational knowledge competitive with traditional NLP methods that have some access to oracle knowledge, (ii) BERT also does remarkably well on open-domain question answering against a supervised baseline, and (iii) certain types of factual knowledge are learned much more readily than others by standard language model pretraining approaches. The surprisingly strong ability of these models to recall factual knowledge without any fine-tuning demonstrates their potential as unsupervised open-domain QA systems. The code to reproduce our analysis is available at https://github.com/facebookresearch/LAMA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that pretrained language models such as BERT encode relational knowledge extractable via cloze prompts without fine-tuning. On the T-REx benchmark it reports competitive precision@1 against traditional NLP pipelines that receive some oracle knowledge; it also shows competitive results on open-domain QA against a supervised baseline and notes that some relation types are learned more readily than others during standard pretraining. Reproducible code is released.
Significance. If the results hold, the work indicates that large LMs can function as flexible, unsupervised knowledge bases, removing the need for schema engineering and human supervision. This has implications for open-domain QA and knowledge extraction pipelines. The explicit release of reproduction code is a clear strength that supports verification and extension.
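For concreteness, the precision@1 metric behind these comparisons is plain top-1 accuracy over the probed facts; a minimal sketch:

```python
# Precision@1 over cloze probes: the fraction of facts for which the
# model's single highest-ranked token equals the gold object.
def precision_at_1(predicted, gold):
    assert len(predicted) == len(gold)
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)

# Worked example with hypothetical predictions:
# precision_at_1(["Florence", "Paris", "Rome"],
#                ["Florence", "Paris", "Milan"])  ->  2/3
```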
major comments (2)
- [T-REx evaluation] T-REx evaluation section: the central claim that cloze-statement precision@1 measures internalized relational knowledge (rather than memorized subject-object co-occurrences from Wikipedia pretraining) is not supported by any ablation that severs surface statistics while preserving the underlying facts (e.g., entity swapping or context randomization). The reported prompt sensitivity and per-relation variation do not resolve this distinction and directly affect the competitiveness interpretation.
- [Baseline comparison] Baseline comparison (around the main results table): the exact nature of 'oracle knowledge' granted to the traditional NLP methods is not defined with sufficient precision to guarantee a fair comparison; without this, the claim that BERT is competitive cannot be fully evaluated.
minor comments (2)
- [Abstract] Abstract: the claim of evaluating 'a wide range of state-of-the-art pretrained language models' is not matched by the depth of results, which focus primarily on BERT; a clarifying sentence would improve accuracy.
- [Figures/Tables] Figure and table captions: several lack explicit statements of what the y-axis or metric represents; this reduces immediate readability.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the scope and limitations of our evaluation. We address each major point below, providing additional context from the manuscript and indicating where revisions will be made to improve precision.
Point-by-point responses
- Referee: [T-REx evaluation] T-REx evaluation section: the central claim that cloze-statement precision@1 measures internalized relational knowledge (rather than memorized subject-object co-occurrences from Wikipedia pretraining) is not supported by any ablation that severs surface statistics while preserving the underlying facts (e.g., entity swapping or context randomization). The reported prompt sensitivity and per-relation variation do not resolve this distinction and directly affect the competitiveness interpretation.
  Authors: We agree that an explicit ablation such as entity swapping would provide stronger evidence for distinguishing relational knowledge from surface co-occurrence statistics. Our current analysis relies on prompt sensitivity (different templates for the same fact yielding different P@1 scores) and large per-relation variance to argue against pure memorization of subject-object pairs. While these observations are consistent with the model learning relational patterns rather than rote co-occurrences, they do not fully rule out the confound raised. We will add an explicit limitations paragraph acknowledging this gap and will include a small-scale entity-swapping experiment in the revised manuscript to directly test the claim. (A sketch of such a probe appears after these responses.)
  Revision: partial
- Referee: [Baseline comparison] Baseline comparison (around the main results table): the exact nature of 'oracle knowledge' granted to the traditional NLP methods is not defined with sufficient precision to guarantee a fair comparison; without this, the claim that BERT is competitive cannot be fully evaluated.
  Authors: We appreciate the request for greater precision. In the manuscript, the traditional pipelines receive gold subject and object entity spans plus the relation type (i.e., they are given the entities and asked only to predict the relation label), which constitutes the 'oracle' information. We will revise the baseline description section and the caption of the main results table to state this explicitly, including the exact inputs provided to each system (e.g., whether entity linking or coreference resolution is bypassed). This clarification will allow readers to assess the comparison directly.
  Revision: yes
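A sketch of what the entity-swapping experiment promised in the first response could look like, assuming a pair of same-relation facts and the fill-mask pipeline used earlier; the protocol is illustrative, not the authors' actual design:

```python
# Illustrative entity-swapping probe (a sketch, not the authors' actual
# protocol). If predictions track the subject (Dante -> Florence,
# Kafka -> Prague), the subject-relation pairing is doing the work;
# if different subjects yield the same object, the template's surface
# statistics dominate.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")
template = "{} was born in [MASK]."

for subject, gold in [("Dante", "Florence"), ("Kafka", "Prague")]:
    top = fill(template.format(subject))[0]["token_str"]
    print(f"{subject}: predicted {top!r}, gold {gold!r}")

# Control with a generic subject that pretraining data cannot pair with
# any one city: high confidence in a specific city suggests template bias.
control = fill(template.format("This person"))[0]
print(f"control: {control['token_str']!r} (score {control['score']:.3f})")
```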
Circularity Check
No circularity: empirical probing of held-out facts against external baselines
Full rationale
The paper reports direct accuracy measurements of pretrained LMs on cloze templates drawn from the T-REx dataset (held-out facts) and compares them to independent supervised NLP baselines that have oracle access. No equations, fitted parameters, or derivations are presented; the central claim is an empirical observation rather than a reduction to self-defined quantities or self-citation chains. The evaluation protocol uses external benchmarks and does not rename or smuggle in prior results as new predictions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Cloze-statement accuracy on held-out facts measures factual knowledge stored during pretraining.
Forward citations
Cited by 18 Pith papers
- REALM: Retrieval-Augmented Language Model Pre-Training
  REALM augments language-model pre-training with an unsupervised retriever over Wikipedia documents and reports 4-16% absolute gains on open-domain QA benchmarks over prior implicit and explicit knowledge methods.
- BOOKMARKS: Efficient Active Storyline Memory for Role-playing
  BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.
- Privacy Without Losing Place: A Paradigm for Private Retrieval in Spatial RAGs
  PAS encodes locations via relative anchors and bins to deliver roughly 370-400m adversarial error in spatial RAG while retaining over half the baseline retrieval performance and keeping generation quality robust.
- RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration
  RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.
- Graph Topology Information Enhanced Heterogeneous Graph Representation Learning
  ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.
- Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
  Socratic Models compose zero-shot multimodal reasoning by prompting pretrained language and vision models to exchange information and enable new capabilities without finetuning.
- Towards Understanding Continual Factual Knowledge Acquisition of Language Models: From Theory to Algorithm
  Theoretical analysis of continual factual knowledge acquisition shows data replay stabilizes pretrained knowledge by shifting convergence dynamics while regularization only slows forgetting, leading to the STOC method...
- TLoRA: Task-aware Low Rank Adaptation of Large Language Models
  TLoRA jointly optimizes LoRA initialization via task-data SVD and sensitivity-driven rank allocation, delivering stronger results than standard LoRA across NLU, reasoning, math, code, and chat tasks while using fewer ...
- RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
  RAPTOR introduces a tree-organized retrieval method using recursive abstractive summaries, achieving a 20% absolute accuracy improvement on the QuALITY benchmark when paired with GPT-4.
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Self-RAG trains LLMs to adaptively retrieve passages on demand and self-critique using reflection tokens, outperforming ChatGPT and retrieval-augmented Llama2 on QA, reasoning, and fact verification.
- Inner Monologue: Embodied Reasoning through Planning with Language Models
  LLMs form an inner monologue from closed-loop language feedback to improve high-level instruction completion in simulated and real robotic rearrangement and kitchen manipulation tasks.
- MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
  MRKL is a modular neuro-symbolic architecture that integrates LLMs with external knowledge and discrete reasoning to overcome limitations of pure neural language models.
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.
- How Much Knowledge Can You Pack Into the Parameters of a Language Model?
  Fine-tuned language models store knowledge in parameters to answer questions competitively with retrieval-based open-domain QA systems.
- Structural Ranking of the Cognitive Plausibility of Computational Models of Analogy and Metaphors with the Minimal Cognitive Grid
  A formalized Minimal Cognitive Grid ranks computational models of analogy and metaphor by alignment with cognitive theories using Functional/Structural Ratio, Generality, and Performance Match dimensions.
- Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations
  CRVA-TGRAG combines parent-document segmentation, ensemble retrieval, and teacher-guided fine-tuning to mitigate knowledge conflicts and improve accuracy in LLM-based CVE vulnerability analysis.
- ARIA: Adaptive Retrieval Intelligence Assistant -- A Multimodal RAG Framework for Domain-Specific Engineering Education
  ARIA is a multimodal RAG framework that filters domain-specific questions with 97.5% accuracy and outperforms ChatGPT-5 on pedagogical quality for a university civil engineering course.
- SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
  SmolLM2 is a 1.7B-parameter language model that outperforms Qwen2.5-1.5B and Llama3.2-1B after overtraining on 11 trillion tokens using custom FineMath, Stack-Edu, and SmolTalk datasets in a multi-stage pipeline.