Zero-Shot Open Entity Typing as Type-Compatible Grounding

Ben Zhou; Chen-Tse Tsai; Daniel Khashabi; Dan Roth

arxiv: 1907.03228 · v1 · pith:5LCOAVZKnew · submitted 2019-07-07 · 💻 cs.CL

Pith reviewed 2026-05-25 01:49 UTC · model grok-4.3

classification 💻 cs.CL

keywords zero-shot entity typing · open entity typing · type-compatible grounding · Freebase types · Wikipedia grounding · fine-grained typing · named entity recognition · zero-shot learning

The pith

Entity types can be inferred zero-shot by grounding mentions to compatible Wikipedia entries and using their Freebase types.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a method for entity typing that needs no annotated training data at all. Types are defined as Boolean functions over Freebase types, and each mention is linked to a set of Wikipedia entries whose types satisfy those functions. An inference algorithm then determines the mention's types directly from the types attached to the grounded entries. This design supports entirely new type taxonomies and transfers across text domains without retraining. The resulting system matches supervised named-entity recognizers on standard datasets while beating them on out-of-domain text and outperforming prior zero-shot typing methods.

Core claim

Given a type taxonomy defined as Boolean functions of Freebase types, a mention is grounded to a set of type-compatible Wikipedia entries and the target mention's types are inferred using an inference algorithm that makes use of the types of these entries. The approach requires no annotated data and can identify newly defined types.
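The phrase "Boolean functions of Freebase types" can be made concrete with a small sketch. The taxonomy below is hypothetical and chosen only for illustration; the rule names and the specific Freebase types used are assumptions, not the definitions used in the paper.

```python
from typing import Callable, Dict, FrozenSet, Set

FreebaseTypes = FrozenSet[str]
TypeRule = Callable[[FreebaseTypes], bool]

# Hypothetical taxonomy for illustration; these rules are not taken from the paper.
# Each target type is a Boolean function over the Freebase types of one entry.
TAXONOMY: Dict[str, TypeRule] = {
    "/person": lambda fb: "people.person" in fb,
    "/location": lambda fb: "location.location" in fb,
    # A conjunction with a negation: a business that is not a sports team.
    "/organization/company": lambda fb: (
        "business.business_operation" in fb and "sports.sports_team" not in fb
    ),
}


def target_types(fb: FreebaseTypes) -> Set[str]:
    """Return every target type whose Boolean rule accepts this entry."""
    return {name for name, rule in TAXONOMY.items() if rule(fb)}


politician = frozenset({"people.person", "government.politician"})
print(target_types(politician))  # {'/person'}
```

Under this reading, defining a new type amounts to adding one more entry to the dictionary; nothing is retrained.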

What carries the argument

Type-compatible grounding to Wikipedia entries, which selects pages whose Freebase types satisfy the Boolean functions that define the target types and supplies those types to the inference step.
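One plausible reading of the type-compatibility filter is sketched below. The toy knowledge base, the toy taxonomy, and the function name `type_compatible` are all assumptions made for illustration; candidate generation and ranking, which the paper handles separately, are outside the scope of this sketch.

```python
from typing import Callable, Dict, FrozenSet, List

TypeRule = Callable[[FrozenSet[str]], bool]


def type_compatible(
    candidates: List[str],
    kb: Dict[str, FrozenSet[str]],
    taxonomy: Dict[str, TypeRule],
) -> List[str]:
    """Keep candidates whose Freebase types satisfy at least one target-type rule."""
    compatible = []
    for title in candidates:
        fb = kb.get(title, frozenset())  # entries missing from the KB have no types
        if any(rule(fb) for rule in taxonomy.values()):
            compatible.append(title)
    return compatible


# Toy data (assumed): Wikipedia title -> Freebase types.
TOY_KB = {
    "Utah": frozenset({"location.location"}),
    "Utah Jazz": frozenset({"sports.sports_team"}),
    "John Stockton": frozenset({"people.person"}),
}
TOY_TAXONOMY = {
    "/person": lambda fb: "people.person" in fb,
    "/location": lambda fb: "location.location" in fb,
}

print(type_compatible(["Utah", "Utah Jazz", "John Stockton"], TOY_KB, TOY_TAXONOMY))
# ['Utah', 'John Stockton']
```

"Utah Jazz" is dropped here only because this toy taxonomy has no rule that accepts a sports team; a taxonomy with such a rule would keep it.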

If this is right

New type taxonomies can be introduced simply by writing Boolean functions over Freebase types, without collecting new annotations.
The same model handles both fine-grained and coarse-grained typing and works in domains such as biology.
Performance remains competitive with supervised named-entity recognition systems on in-domain data.
Performance exceeds supervised systems on out-of-domain datasets.
The method significantly exceeds the accuracy of earlier zero-shot fine-typing systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same grounding-plus-inference pattern could be reused for other open information-extraction tasks that currently require task-specific labeled data.
Replacing or augmenting Wikipedia with additional knowledge bases might increase coverage for rare or technical entities.
The separation of type definitions from training data suggests a route toward fully open-world entity typing that accepts arbitrary new type inventories at test time.

Load-bearing premise

The inference algorithm can reliably determine a mention's types solely from the Freebase types of Wikipedia entries that are judged type-compatible.
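To see why this premise is load-bearing, consider a simple voting scheme over the grounded entries. This page does not describe the paper's inference algorithm, so the threshold vote below is a stand-in chosen for illustration; `infer_types` and its `threshold` parameter are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, FrozenSet, List, Set

TypeRule = Callable[[FrozenSet[str]], bool]


def infer_types(
    grounded: List[FrozenSet[str]],
    taxonomy: Dict[str, TypeRule],
    threshold: float = 0.5,
) -> Set[str]:
    """Assign a target type if at least `threshold` of the grounded entries satisfy it.

    A stand-in for the paper's inference step, not a reproduction of it.
    """
    if not grounded:
        return set()
    votes: Counter = Counter()
    for fb in grounded:
        for name, rule in taxonomy.items():
            if rule(fb):
                votes[name] += 1
    return {name for name, n in votes.items() if n / len(grounded) >= threshold}


taxonomy = {
    "/person": lambda fb: "people.person" in fb,
    "/location": lambda fb: "location.location" in fb,
}
grounded = [
    frozenset({"people.person"}),
    frozenset({"people.person", "government.politician"}),
    frozenset({"location.location"}),
]
print(infer_types(grounded, taxonomy))  # {'/person'}
```

With any scheme of this kind, the output is only as good as the grounded set: if most grounded entries carry the wrong types, the vote follows them.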

What would settle it

A dataset of mentions whose gold types cannot be recovered by any Boolean combination of Freebase types drawn from the Wikipedia entries that the grounding step selects.
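A narrower, easily computed version of this test is sketched below: for a fixed taxonomy, it measures how often the gold type is recoverable from at least one grounded entry. The function name and data layout are assumptions, and the check covers only one given set of type definitions rather than every possible Boolean combination.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

TypeRule = Callable[[FrozenSet[str]], bool]
Example = Tuple[str, List[FrozenSet[str]]]  # (gold type, Freebase types of grounded entries)


def recoverable_fraction(examples: List[Example], taxonomy: Dict[str, TypeRule]) -> float:
    """Fraction of mentions whose gold type is satisfied by at least one grounded entry."""
    if not examples:
        return 0.0
    hits = 0
    for gold, grounded in examples:
        rule = taxonomy[gold]
        if any(rule(fb) for fb in grounded):
            hits += 1
    return hits / len(examples)


taxonomy = {
    "/person": lambda fb: "people.person" in fb,
    "/location": lambda fb: "location.location" in fb,
}
examples = [
    ("/person", [frozenset({"people.person"}), frozenset({"location.location"})]),
    ("/location", [frozenset({"people.person"})]),  # gold type not recoverable
]
print(recoverable_fraction(examples, taxonomy))  # 0.5
```

A low value on a real dataset would point at the grounding step or the type definitions rather than at the inference step.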

Original abstract

The problem of entity-typing has been studied predominantly in supervised learning fashion, mostly with task-specific annotations (for coarse types) and sometimes with distant supervision (for fine types). While such approaches have strong performance within datasets, they often lack the flexibility to transfer across text genres and to generalize to new type taxonomies. In this work we propose a zero-shot entity typing approach that requires no annotated data and can flexibly identify newly defined types. Given a type taxonomy defined as Boolean functions of FREEBASE "types", we ground a given mention to a set of type-compatible Wikipedia entries and then infer the target mention's types using an inference algorithm that makes use of the types of these entries. We evaluate our system on a broad range of datasets, including standard fine-grained and coarse-grained entity typing datasets, and also a dataset in the biological domain. Our system is shown to be competitive with state-of-the-art supervised NER systems and outperforms them on out-of-domain datasets. We also show that our system significantly outperforms other zero-shot fine typing systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Zero-shot grounding to Wikipedia plus Freebase Boolean inference gives a practical route to flexible entity typing, but the abstract leaves the key inference step unexamined.

The paper introduces a zero-shot entity typing method that defines types as Boolean functions over Freebase, grounds a mention to type-compatible Wikipedia entries, and infers the mention's types from those entries' Freebase types. No annotated data is needed, and the taxonomy can be swapped without retraining. That framing is the main new piece.

It reports competitive results against supervised NER systems on standard datasets, better performance on out-of-domain data, and clear gains over prior zero-shot fine-typing systems, plus a test in the biological domain. Those are useful signals for anyone who wants to avoid annotation bottlenecks or handle new type sets. The experiments span multiple datasets, which is a plus.

The soft spot is exactly the one flagged in the stress-test note: the abstract gives no numbers on grounding precision, no error analysis, and no ablation that isolates whether the inference step actually works when the Wikipedia links are imperfect. If grounding quality drops for rare or out-of-domain mentions, the downstream Boolean inference cannot be trusted, yet nothing in the provided summary checks that. The claims rest on that untested transfer.

This is for people building annotation-light pipelines or testing new domains and taxonomies. A reader who needs concrete numbers or wants to reproduce the inference algorithm will have to wait for the full paper, but the idea itself is worth a serious referee's time. I would send it to review and ask specifically for grounding accuracy metrics and an ablation on the inference component.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a zero-shot open entity typing method that defines a type taxonomy as Boolean functions over Freebase types, grounds each mention to a set of type-compatible Wikipedia entries, and infers the mention's types via an algorithm that uses the Freebase types of the grounded entries. It reports evaluation on standard fine-grained/coarse-grained entity typing datasets plus a biological-domain dataset, claiming competitiveness with supervised NER systems (and superiority on out-of-domain data) as well as significant gains over prior zero-shot fine-typing systems.

Significance. If the grounding step and subsequent inference are shown to be reliable, the approach would offer a genuinely annotation-free route to open typing that transfers across genres and accommodates new taxonomies by leveraging existing KB resources; the explicit use of type-compatibility grounding distinguishes it from purely embedding-based zero-shot baselines.

major comments (2)

[Abstract] Abstract: the strongest claims (competitive with supervised NER, strong out-of-domain gains, and superiority to other zero-shot systems) rest on the inference step that 'makes use of the types of these entries.' No quantitative grounding precision, type-compatibility judgment accuracy, or ablation isolating the grounding-to-inference pipeline is supplied, leaving open whether noisy grounding for out-of-domain or rare mentions undermines the reported results.
Inference algorithm description (wherever presented): the premise that types can be determined solely from Freebase types of type-compatible Wikipedia entries requires explicit validation; without precision/recall figures on the grounding stage or an error analysis showing how type-compatibility errors propagate, the out-of-domain and zero-shot superiority claims cannot be assessed.

minor comments (1)

[Abstract] Abstract supplies no numerical results, error analysis, or inference-algorithm pseudocode, forcing readers to consult later sections for any verification of the stated performance claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments on the grounding and inference validation. We address each major comment below and will revise the manuscript to incorporate additional analysis.

Referee: [Abstract] Abstract: the strongest claims (competitive with supervised NER, strong out-of-domain gains, and superiority to other zero-shot systems) rest on the inference step that 'makes use of the types of these entries.' No quantitative grounding precision, type-compatibility judgment accuracy, or ablation isolating the grounding-to-inference pipeline is supplied, leaving open whether noisy grounding for out-of-domain or rare mentions undermines the reported results.

Authors: We acknowledge that the manuscript presents only end-to-end typing results and does not report separate precision or recall for the grounding stage or an explicit ablation of the grounding-to-inference pipeline. The competitive and out-of-domain performance is offered as indirect evidence that the pipeline functions reliably, but we agree this leaves the claims open to the concern raised. In revision we will add quantitative grounding accuracy figures and an ablation isolating the inference step.
revision: yes

Referee: [—] Inference algorithm description (wherever presented): the premise that types can be determined solely from Freebase types of type-compatible Wikipedia entries requires explicit validation; without precision/recall figures on the grounding stage or an error analysis showing how type-compatibility errors propagate, the out-of-domain and zero-shot superiority claims cannot be assessed.

Authors: The current version does not include isolated validation of grounding precision/recall or an error-propagation analysis. While the multi-domain results (including the biological dataset) provide support for the overall approach, we concur that direct validation would allow stronger assessment of the claims. We will add both grounding-stage metrics and an error analysis in the revised manuscript.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external resources and benchmarks

The paper's zero-shot method grounds mentions to Wikipedia entries and infers types from their Freebase types via an external inference algorithm, then evaluates performance on independent datasets including out-of-domain and biological ones. No derivation step reduces by construction to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations; claims are supported by external KB resources and cross-dataset comparisons rather than internal tautologies.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that Wikipedia entries carry accurate Freebase type labels usable for inference and that type compatibility can be operationalized without training data.

axioms (1)

Domain assumption: Wikipedia entries provide accurate and sufficient Freebase type information that can be aggregated via an inference algorithm to label new mentions.
Invoked when the abstract describes grounding mentions to Wikipedia entries and inferring target types from those entries.
