LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction

Luc Pommeret (STL), Thomas Gerald (LISN), Patrick Paroubek (STL), Sahar Ghannay (STL), Christophe Servan (STL, AMIAD), Sophie Rosset (LISN, STL)

arxiv: 2604.02866 · v1 · submitted 2026-04-03 · 💻 cs.CL

Pith reviewed 2026-05-13 20:06 UTC · model grok-4.3

classification 💻 cs.CL

keywords atomic propositions · triplet extraction · relation extraction · knowledge graphs · multilingual models · weak extractors · LLM distillation · proposition generation

The pith

Breaking sentences into atomic propositions improves triplet extraction for weaker models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether first decomposing complex sentences into minimal, self-contained units of meaning called atomic propositions can make it easier to pull out structured subject-predicate-object triplets. It presents a compact multilingual model that produces these propositions and inserts them as an extra processing step before two different extraction systems run. Experiments on four standard datasets show clear gains in relation recall for smaller or rule-based extractors and better overall accuracy when the text is in multiple languages. Stronger large language models lose little when the propositions are used with a simple fallback rule. The result frames the propositions as a readable intermediate layer that existing tools can use rather than a competing method.

Core claim

Atomic propositions generated by the distilled MPropositionneur-V2 model function as an interpretable intermediate representation that raises relation recall and multilingual accuracy when supplied to weaker triplet extractors such as GLiREL, CoreNLP, and 0.6B-scale models, while a fallback combination rule prevents entity-recall losses in stronger generative models.

What carries the argument

Atomic propositions: minimal, semantically autonomous units of information produced by MPropositionneur-V2 that serve as an intermediate data structure between raw text and triplet extractors.

If this is right

Weaker extractors gain measurable improvements in relation recall when atomic propositions are supplied first.
Multilingual accuracy rises across the six languages covered by the proposition model.
Stronger large language models can retain most gains by falling back to direct extraction when needed.
Atomic propositions act as a complement that works alongside existing extractors rather than replacing them.
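
The fallback idea in the list above can be illustrated with a minimal sketch. The combination rule shown here is an assumption made for illustration (this page does not spell out the paper's exact rule): it keeps the triplets extracted from the propositions and adds back any directly extracted triplet that mentions an entity the proposition path missed. The names `extract_with_fallback`, `entities`, and the `extract` callable are hypothetical.

```python
from typing import Callable, Iterable

Triplet = tuple[str, str, str]  # (subject, relation, object)


def entities(triplets: Iterable[Triplet]) -> set[str]:
    """Collect every subject and object mentioned in a set of triplets."""
    found: set[str] = set()
    for subj, _rel, obj in triplets:
        found.add(subj)
        found.add(obj)
    return found


def extract_with_fallback(
    text: str,
    propositions: list[str],
    extract: Callable[[str], set[Triplet]],
) -> set[Triplet]:
    """Hypothetical combination rule (an assumption, not the paper's rule).

    1. Extract triplets from each atomic proposition.
    2. Extract triplets directly from the raw text.
    3. Keep the proposition-based triplets and add back any direct triplet
       that mentions an entity the proposition path did not cover.
    """
    from_props: set[Triplet] = set()
    for prop in propositions:
        from_props |= extract(prop)

    direct = extract(text)
    covered = entities(from_props)
    recovered = {t for t in direct if t[0] not in covered or t[2] not in covered}
    return from_props | recovered


# Toy extractor backed by a lookup table, standing in for a real model.
toy_outputs = {
    "A founded B in C.": {("A", "founded", "B"), ("A", "located_in", "C")},
    "A founded B.": {("A", "founded", "B")},
}
toy_extract = lambda s: toy_outputs.get(s, set())
combined = extract_with_fallback("A founded B in C.", ["A founded B."], toy_extract)
```

In this toy run the proposition path never mentions entity `C`, so the direct triplet that mentions it is added back. That is the kind of entity-recall loss a fallback is meant to recover.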

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decomposition step might lower the compute needed to build knowledge graphs from large text collections.
The intermediate structure could be reused for other structured extraction tasks such as event or fact checking.
Extending the proposition model to more languages would test whether the observed multilingual gains generalize.

Load-bearing premise

The propositions created by the model correctly capture sentence meaning and do not add errors that hurt later triplet extraction.

What would settle it

Direct comparison of triplet extraction scores on the same test sets with and without the generated propositions; if scores for weak extractors stay the same or drop, the benefit claim is false.
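
A minimal sketch of that with/without comparison, assuming exact-match scoring of (subject, relation, object) triplets (the benchmarks may use a different matching criterion). The function names and the toy predictions are hypothetical. The gold triplets reuse the Marie Curie example from Figure 3.

```python
Triplet = tuple[str, str, str]  # (subject, relation, object)


def relation_recall(predicted: set[Triplet], gold: set[Triplet]) -> float:
    """Fraction of gold triplets found in the predictions (exact match)."""
    if not gold:
        return 0.0
    return len(predicted & gold) / len(gold)


def recall_delta(direct: set[Triplet], with_props: set[Triplet], gold: set[Triplet]) -> float:
    """Positive values mean the proposition step helped on this example."""
    return relation_recall(with_props, gold) - relation_recall(direct, gold)


gold = {
    ("Marie Curie", "occupation", "physicist"),
    ("Marie Curie", "birthplace", "Poland"),
    ("Marie Curie", "award", "Nobel Prize in Physics"),
}
# Hypothetical outputs of a weak extractor, not results from the paper.
direct = {("Marie Curie", "award", "Nobel Prize in Physics")}
with_props = set(gold)

delta = recall_delta(direct, with_props, gold)
```

If the delta averaged over a test set is zero or negative for the weak extractors, the benefit claim does not hold for that setting.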

Figures

Figures reproduced from arXiv: 2604.02866 by Luc Pommeret (STL), Thomas Gerald (LISN), Patrick Paroubek (STL), Sahar Ghannay (STL), Christophe Servan (STL, AMIAD), Sophie Rosset (LISN, STL).

**Figure 3.** Prompt template used for stage 2. Input: "Marie Curie, a Polish-born physicist, won the Nobel Prize in Physics." Atomic Props: ["Marie Curie is a physicist.", "Marie Curie was born in Poland.", "Marie Curie won the Nobel Prize in Physics."] Parsed Triplets: (Marie Curie, occupation, physicist), (Marie Curie, birthplace, Poland), (Marie Curie, award, Nobel Prize in Physics)

**Figure 4.** Input and output of the entire pipeline.

**Figure 2.** Prompt template used for the distillation.

**Figure 5.** The Knowledge Graph built with triplets extracted by GLiREL on the atomic propositions for the above sentence.
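
To make the data flow in these figures concrete, here is a small sketch that parses a JSON array of propositions (the output format the prompt asks for) and aggregates triplets into a graph where nodes are entities and edges are relations. The strings and triplets are the Marie Curie example from Figure 3. The adjacency-dict layout and the `build_graph` helper are illustrative assumptions, not the authors' implementation.

```python
import json
from collections import defaultdict

# Propositioner output: a JSON array of strings, as in the Figure 3 example.
props_json = (
    '["Marie Curie is a physicist.", '
    '"Marie Curie was born in Poland.", '
    '"Marie Curie won the Nobel Prize in Physics."]'
)
propositions = json.loads(props_json)

# Triplets parsed from those propositions (copied from the Figure 3 example).
triplets = [
    ("Marie Curie", "occupation", "physicist"),
    ("Marie Curie", "birthplace", "Poland"),
    ("Marie Curie", "award", "Nobel Prize in Physics"),
]


def build_graph(triplets):
    """Map each subject entity to its outgoing (relation, object) edges."""
    graph = defaultdict(list)
    for subj, rel, obj in triplets:
        graph[subj].append((rel, obj))
    return dict(graph)


graph = build_graph(triplets)
for rel, obj in graph["Marie Curie"]:
    print(f"Marie Curie --{rel}--> {obj}")
```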

Original abstract

Knowledge Graph construction from natural language requires extracting structured triplets from complex, information-dense sentences. In this paper, we investigate if the decomposition of text into atomic propositions (minimal, semantically autonomous units of information) can improve the triplet extraction. We introduce MPropositionneur-V2, a small multilingual model covering six European languages trained by knowledge distillation from Qwen3-32B into a Qwen3-0.6B architecture, and we evaluate its integration into two extraction paradigms: entity-centric (GLiREL) and generative (Qwen3). Experiments on SMiLER, FewRel, DocRED and CaRB show that atomic propositions benefit weaker extractors (GLiREL, CoreNLP, 0.6B models), improving relation recall and, in the multilingual setting, overall accuracy. For stronger LLMs, a fallback combination strategy recovers entity recall losses while preserving the gains in relation extraction. These results show that atomic propositions are an interpretable intermediate data structure that complements extractors without replacing them.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Atomic propositions give a modest practical lift to weak triplet extractors on standard benchmarks, but the gains rest on unverified proposition quality.

The main takeaway is that breaking sentences into atomic propositions helps weaker extractors like GLiREL, CoreNLP, and 0.6B models pull out more relations on SMiLER, FewRel, DocRED, and CaRB, with some multilingual accuracy gains as well. The authors distill the decomposition step into MPropositionneur-V2, a 0.6B model trained from Qwen3-32B across six European languages, then plug its output into both entity-centric and generative pipelines. For stronger LLMs they add a fallback that keeps entity recall while holding the relation gains. This is the concrete new piece: a small, multilingual propositioner evaluated end-to-end in triplet extraction rather than just as a standalone decomposition step. The experiments are straightforward and use public data, which makes the reported recall improvements easy to check. The work stays scoped to showing that the propositions complement existing extractors instead of claiming to replace them.

The soft spot is the missing verification that the propositions actually preserve meaning. A distilled 0.6B model can drop details or introduce scope errors on complex or non-English sentences. Without human fidelity ratings, error analysis, or an ablation that swaps in noisy spans while keeping the same pipeline length, the lift could come from shorter inputs or the extra processing step rather than the atomic property itself. The abstract presents the gains as observed, but the support stays moderate until those controls appear.

This is useful for people building knowledge-graph pipelines who need to get more out of smaller models. It is not a big theoretical shift, but the integration results are the kind of incremental evidence that can inform practical choices. I would send it to peer review. The pattern on public benchmarks is clear enough for referees to evaluate, and they can ask for the fidelity checks that would make the claim tighter.

Referee Report

2 major / 2 minor

Summary. The manuscript claims that decomposing text into atomic propositions using the distilled MPropositionneur-V2 model (0.6B parameters, multilingual across six European languages) improves triplet extraction, particularly boosting relation recall for weaker extractors such as GLiREL, CoreNLP, and 0.6B models on the SMiLER, FewRel, DocRED, and CaRB datasets; a fallback combination strategy is proposed to recover entity recall for stronger LLMs while retaining relation gains, positioning atomic propositions as an interpretable intermediate structure.

Significance. If the results hold after addressing the fidelity concerns, the work offers a practical demonstration that atomic propositions can enhance weaker relation extractors in knowledge-graph construction pipelines without replacing them, with value in multilingual settings and for resource-constrained models; the multi-dataset evaluation provides a solid empirical foundation for this intermediate-representation approach.

major comments (2)

[Abstract] Abstract: The central claim that atomic propositions improve performance for weak extractors (GLiREL, CoreNLP, 0.6B models) rests on the unverified assumption that MPropositionneur-V2 outputs faithfully capture sentence semantics; no human fidelity ratings, error analysis of omissions/hallucinations/scope errors, or ablation replacing propositions with noisy/random spans of equivalent length are reported, leaving open the possibility that gains arise from input simplification or pipeline length rather than the atomic-proposition property itself.
[Experiments] Experiments section: Reported gains in relation recall and multilingual accuracy on SMiLER, FewRel, DocRED, and CaRB lack accompanying statistical significance tests, error bars, or per-dataset variance measures, making it difficult to determine whether the improvements are robust or could be explained by random variation in the weak-extractor baselines.
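
The control requested in the first major comment (replacing propositions with random spans of equivalent length) could be set up roughly as follows. This is only a sketch under stated assumptions: whitespace tokenization, contiguous spans drawn from the source sentence, and a hypothetical helper name `random_spans_like`. Other ways of matching length would be equally defensible.

```python
import random


def random_spans_like(sentence: str, propositions: list[str], seed: int = 0) -> list[str]:
    """For each proposition, sample a contiguous token span of the source
    sentence with the same whitespace-token count (capped at sentence length).

    The spans keep the number and length of inputs comparable to the
    proposition condition while discarding the atomic structure.
    """
    rng = random.Random(seed)  # seeded for reproducible ablation runs
    tokens = sentence.split()
    spans = []
    for prop in propositions:
        n = min(len(prop.split()), len(tokens))
        start = rng.randint(0, len(tokens) - n)  # inclusive upper bound
        spans.append(" ".join(tokens[start:start + n]))
    return spans


sentence = "Marie Curie, a Polish-born physicist, won the Nobel Prize in Physics."
props = [
    "Marie Curie is a physicist.",
    "Marie Curie was born in Poland.",
    "Marie Curie won the Nobel Prize in Physics.",
]
control_inputs = random_spans_like(sentence, props)
```

Feeding `control_inputs` to the same extractor, in place of the propositions, would indicate how much of the gain comes from shorter inputs alone.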

minor comments (2)

[Methods] The description of the knowledge-distillation procedure from Qwen3-32B to Qwen3-0.6B would benefit from explicit listing of training hyperparameters, data sources, and any filtering steps applied to the teacher outputs.
[Evaluation] Notation for the two extraction paradigms (entity-centric vs. generative) and the fallback combination strategy should be introduced with a small diagram or pseudocode to improve clarity for readers unfamiliar with GLiREL.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects for strengthening the empirical claims. We agree that additional validation of proposition fidelity and statistical rigor would improve the manuscript and plan to incorporate these elements in the revision.

Referee: [Abstract] Abstract: The central claim that atomic propositions improve performance for weak extractors (GLiREL, CoreNLP, 0.6B models) rests on the unverified assumption that MPropositionneur-V2 outputs faithfully capture sentence semantics; no human fidelity ratings, error analysis of omissions/hallucinations/scope errors, or ablation replacing propositions with noisy/random spans of equivalent length are reported, leaving open the possibility that gains arise from input simplification or pipeline length rather than the atomic-proposition property itself.

Authors: We acknowledge the absence of human fidelity ratings and the suggested ablation. The manuscript relies on downstream performance gains that are consistent across four datasets and multiple weak extractors, which would be unlikely if the propositions were merely random simplifications. In the revision we will add a targeted error analysis of proposition omissions, hallucinations and scope issues on a sample of sentences, together with an ablation that replaces propositions by random spans of matched length while keeping the same pipeline structure. revision: yes
Referee: [Experiments] Experiments section: Reported gains in relation recall and multilingual accuracy on SMiLER, FewRel, DocRED, and CaRB lack accompanying statistical significance tests, error bars, or per-dataset variance measures, making it difficult to determine whether the improvements are robust or could be explained by random variation in the weak-extractor baselines.

Authors: We agree that the current presentation lacks formal statistical support. In the revised version we will report standard deviations across multiple runs where applicable, include paired significance tests (e.g., McNemar or t-tests) for the key recall and accuracy deltas, and add error bars to the main result tables. revision: yes
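
For readers unfamiliar with the paired test named in the response, the following is a minimal sketch of an exact two-sided McNemar test on per-example outcomes. It assumes each test item can be scored as correct or incorrect under both conditions (direct extraction versus extraction from propositions). The function name and the toy data are hypothetical.

```python
from math import comb


def mcnemar_exact(baseline_correct: list[bool], treated_correct: list[bool]) -> float:
    """Two-sided exact McNemar p-value from paired per-example outcomes.

    Only discordant pairs matter: b counts items the baseline alone got
    right, c counts items the treated system alone got right. Under the
    null hypothesis, b follows Binomial(b + c, 0.5).
    """
    if len(baseline_correct) != len(treated_correct):
        raise ValueError("outcome lists must be paired and of equal length")
    b = sum(1 for x, y in zip(baseline_correct, treated_correct) if x and not y)
    c = sum(1 for x, y in zip(baseline_correct, treated_correct) if y and not x)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Toy data: the treated system fixes 5 items and breaks none.
baseline = [False] * 5 + [True] * 3
treated = [True] * 8
p_value = mcnemar_exact(baseline, treated)
```

With five discordant pairs all favouring one side, the two-sided p-value is 2 × (1/32) = 0.0625, which shows why small test sets can leave even one-sided-looking gains short of conventional significance.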

Circularity Check

0 steps flagged

No circularity: empirical evaluation on benchmarks with no derivations or self-referential reductions

The paper reports experimental results from integrating MPropositionneur-V2 (distilled from Qwen3-32B) into GLiREL, CoreNLP and generative extractors, measuring recall/accuracy lifts on SMiLER, FewRel, DocRED and CaRB. No equations, fitted parameters renamed as predictions, uniqueness theorems, or ansatzes appear. Claims rest on observed performance deltas rather than any derivation chain that reduces to its own inputs. Self-citations are absent from the load-bearing steps. The work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific free parameters or axioms; model distillation and proposition generation are treated as standard techniques without new invented entities.

pith-pipeline@v0.9.0 · 5514 in / 1019 out tokens · 29404 ms · 2026-05-13T20:06:26.820081+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We rely on the formalism of Semantic Information Theory (Bar-Hillel and Carnap, 1953)... a proposition is atomic if and only if it is a clause in a Conjunctive Normal Form (CNF)."

IndisputableMonolith/Cost/FunctionalEquation.lean · J_uniquely_calibrated_via_higher_derivative · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Experiments on SMiLER, FewRel, DocRED and CaRB show that atomic propositions benefit weaker extractors (GLiREL, CoreNLP, 0.6B models), improving relation recall"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages · 2 internal anchors

[1]

LLM-based Atomic Propositions help weak extractors: Evaluation of a Propositioner for triplet extraction

Introduction The interpretability of Natural Language Processing (NLP) models is a requirement for applications like fact-checking and automated construction of Knowledge Graphs (KGs). While neural models have achieved state-of-the-art results, their internal mechanisms remain opaque. Most current explainability methods are post hoc, seeking to explai...
[2]

Related Works For inference and/or information retrieval (Xiang et al., 2026), it is common to represent information as a Knowledge Graph (KG). While a wide range of KGs automatically extracted from many different sources is available, for instance, using wikidata taxonomy and entities (Waagmeester et al., 2020; Hassanzadeh, 2021), extracting KGs from nat...

[3]

proposed an entity-focused sentence simplification method to improve relation extraction. More recently, (Niklaus et al., 2016) introduced [footnote 2: (subject, relation, object)] a rule-based sentence simplification system that rewrites complex sentences into simpler sentences for Open Information Extraction. However, these approaches rely on syntactic rules or de...
[4]

hallucinates

Formal Framework The abstraction underlying our objectives is the theoretical atomic proposition. To enlighten the atomization process, we use the formalism of Semantic Information Theory (Bar-Hillel and Carnap, 1953). This formalism provides a strong understanding of information in terms of signification and gives a criterion for cutting a propositio...
[5]

We define different stages to extract triplets, with a full pipeline depicted in Figure 1

Proposed Approach The aim of this paper is to show that triplet extraction could benefit from atomic propositions. We define different stages to extract triplets, with a full pipeline depicted in Figure 1. The global pipeline consists of three stages:
[6]

Atomization: The complex text is processed by MPropositionneur-V2. This model, distilled from Qwen3-32B into a Qwen3-0.6B architecture, recursively splits the text until each proposition is stable and autonomous by using the prompt proposed in Figure 2. [Footnote 3: A world W is a determined assignment of truth values for each atomic subformula of ϕ.]
[7]

LLM prompting: We use the Qwen3-4B model to generate triplets directly from an atomized chunk using a dedicated prompt (Figure 3)
[8]

As a baseline, we replace stage 2 with two substages by using the Parsing and Triplet Extraction as follows: 2.1

KG Building: Extracted triplets are aggregated into a Knowledge Graph, where nodes represent entities and edges represent relations. As a baseline, we replace stage 2 with two substages by using the Parsing and Triplet Extraction as follows: 2.1. Parsing: Each atomic proposition is parsed (here using SpaCy or Stanza) to extract part-of-speech (P...
[11]

OUTPUT FORMAT: Only a JSON array of strings

REPETITION: Repeat the subject in EACH sentence. OUTPUT FORMAT: Only a JSON array of strings. Title: title Content: content Output: Figure 2: Prompt Template used for the distillation of the propositioner used in stage 1. Figure 4 illustrates the input and the output of the entire pipeline

[12]

Marie Curie, a Polish-born physicist, won the Nobel Prize in Physics

Experimental Protocol 5.1. Propositioner We train a propositioner, i.e. a model that transforms a text input into a list of atomic propositions, via knowledge distillation (Hinton et al., [footnote 4: Available here: https://huggingface.co/Zual/MPropositionneur-V2] Extract all factual (subject, predicate, object) triples from the sentence. One triple per line i...
[13]

Direct", where models try to extract the triplet directly from the raw text

Results and Analysis In this section we report and discuss the results of the designed experiments. We evaluate quantitatively the propositioner for the triplet extraction method. By flattening the text, relations are made explicit, allowing the extractors to capture facts that would otherwise be missed in complex sentences. 6.1. Evaluation on Multiling...
[14]

Conclusion In this work, we empirically demonstrate the benefits of the atomic proposition for triplet entity relation extraction. In particular, we show that decomposing documents or paragraphs into atoms helps retrieve the relation efficiently, improving performance on both the FewRel and SMiLER benchmarks. In addition, we observe that the com...
[15]

To date, we have not compared propositions obtained with and without recursive refinement

Limitations One limitation of this study is the relevance of the recursive propositioner. To date, we have not compared propositions obtained with and without recursive refinement. The decision to use such an algorithm to refine atoms recursively was based on preliminary experiments in which we observed that some propositions were not atomic. In futur...
[16]

Bibliographical References Yehoshua Bar-Hillel and Rudolf Carnap. 1953. Semantic information. The British Journal for the Philosophy of Science, 4(14):147–157. Sangnie Bhardwaj, Samarth Aggarwal, and Mausam. 2019. CaRB: A crowdsourced benchmark for open IE. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and ...
[17]

Distilling the Knowledge in a Neural Network

Distilling the knowledge in a neural network. ArXiv, abs/1503.02531. Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia D’amato, Gerard De Melo, Claudio Gutierrez, Sabrina Kirrane, José Emilio Labra Gayo, Roberto Navigli, Sebastian Neumaier, Axel-Cyrille Ngonga Ngomo, Axel Polleres, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Stef...
[18]

Red fm: a filtered and multilingual relation extraction dataset. In Proc. of the 61st Annual Meeting of the Association for Computational Linguistics: ACL 2023, Toronto, Canada. Association for Computational Linguistics. Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Koh, Mohit Iyyer, Luke Zettlemoyer, and Hannaneh Hajishirzi. 2023...
[19]

He", "She

ZERO PRONOUNS: "He", "She", "They", "His", "Her", "Its", "This one" ARE FORBIDDEN. ALWAYS replace them with the full name of the entity

[20]

CONTEXT: Each sentence must be readable alone without knowing its source

[21]

OUTPUT FORMAT: Only a JSON array of strings

REPETITION: Repeat the subject in EACH sentence. OUTPUT FORMAT: Only a JSON array of strings. Title: {title} Content: {content} Output: B. Proofs for the Formal Grounding Definition 1 (Safe Cut). A formula’s cut ϕ in a formula ψ is safe if ψ is a sub-formula of ϕ and if I(ϕ) > I(ψ) Definition 2 (Bad cut). A cut of a formula ϕ in a formula ψ is bad if ψ i...
