pith. machine review for the scientific record.

arxiv: 2605.03706 · v1 · submitted 2026-05-05 · 💻 cs.CL · cs.AI

Recognition: unknown

SAM-NER: Semantic Archetype Mediation for Zero-Shot Named Entity Recognition

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 16:37 UTC · model grok-4.3

classification 💻 cs.CL cs.AI
keywords zero-shot named entity recognition · semantic archetypes · cross-domain transfer · entity discovery · semantic calibration · schema shift · large language models · information extraction

The pith

SAM-NER inserts an intermediate domain-invariant archetype space to reduce semantic drift in zero-shot named entity recognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that zero-shot NER breaks down when unseen target labels fail to align with how an LLM internally organizes meaning, causing drift especially with novel or overlapping schemas. It introduces Semantic Archetype Mediation as a three-stage bridge: first discovering entity spans through cooperative extraction and consensus denoising, then projecting those spans into a compact set of universal archetypes drawn from high-level ontological abstractions, and finally calibrating the archetype predictions into specific target types via constrained inference on a frozen LLM. This structure is meant to create a stable, domain-invariant layer that improves transfer without retraining the model. A reader would care because direct prompting approaches remain brittle precisely on the cross-domain and cross-schema cases that matter most in practice.
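The three-stage flow described above can be sketched in miniature. Everything below is illustrative scaffolding under assumptions: the function names, the toy archetype inventory, and the target-schema table are invented for this sketch, not the paper's actual prompts or ontology.

```python
# Hypothetical sketch of the SAM-NER three-stage flow. All names and
# tables are illustrative placeholders, not the paper's real components.

ARCHETYPES = {"PERSON-LIKE", "ORGANIZATION-LIKE", "LOCATION-LIKE", "CREATIVE-WORK-LIKE"}

# Stage 3 needs target-schema definitions; calibration picks a target type
# whose definition is compatible with the predicted archetype.
TARGET_SCHEMA = {
    "musicalartist": "PERSON-LIKE",
    "band": "ORGANIZATION-LIKE",
    "album": "CREATIVE-WORK-LIKE",
}

def discover_entities(sentence):
    """Stage 1 stand-in: cooperative extraction + consensus denoising.

    A real system would query several LLM extractors and keep spans that
    enough of them agree on; here we return fixed toy spans."""
    return [("Freddie Mercury", "PERSON-LIKE"), ("Queen", "ORGANIZATION-LIKE")]

def mediate(spans):
    """Stage 2 stand-in: project each span onto a universal archetype."""
    return [(text, arch) for text, arch in spans if arch in ARCHETYPES]

def calibrate(mediated):
    """Stage 3 stand-in: resolve archetypes into target-domain labels."""
    out = []
    for text, arch in mediated:
        compatible = [t for t, a in TARGET_SCHEMA.items() if a == arch]
        out.append((text, compatible[0] if compatible else "O"))
    return out

preds = calibrate(mediate(discover_entities("Freddie Mercury fronted Queen.")))
print(preds)  # [('Freddie Mercury', 'musicalartist'), ('Queen', 'band')]
```

The point of the intermediate hop is visible even in this toy: the discovery and mediation stages never see the target schema, so only the final calibration step has to absorb a schema change.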

Core claim

SAM-NER shows that routing entity mentions through an intermediate layer of universal semantic archetypes distilled from ontological abstractions, followed by definition-aligned calibration, produces more reliable zero-shot predictions than direct label mapping, as measured by consistent outperformance over prior baselines on the CrossNER benchmark in cross-domain settings.

What carries the argument

Semantic Archetype Mediation, an intermediate projection into a compact set of universal semantic archetypes that decouples entity discovery from final target-schema assignment.

If this is right

  • SAM-NER outperforms prior zero-shot NER baselines on CrossNER under cross-domain conditions.
  • The archetype mediation step reduces systematic semantic drift for unseen or overlapping target schemas.
  • The full pipeline operates with a frozen LLM and requires no task-specific fine-tuning.
  • Consensus-based entity discovery improves span coverage and fidelity before any label assignment occurs.
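The consensus step in the last bullet can be illustrated with a simple span vote. The anchor/explorer roles and the vote threshold below are assumptions for the sketch, not the paper's exact denoising procedure.

```python
from collections import Counter

def consensus_spans(extractor_outputs, min_votes=2):
    """Keep spans proposed by at least `min_votes` extractors.

    `extractor_outputs` is a list of span sets, one per extractor run.
    This is a guess at the consensus-denoising step; the paper's actual
    anchor/explorer coordination may differ."""
    votes = Counter(span for spans in extractor_outputs for span in set(spans))
    return {span for span, n in votes.items() if n >= min_votes}

runs = [
    {("Marie Curie", 0, 11), ("Paris", 30, 35)},    # anchor extractor
    {("Marie Curie", 0, 11), ("radium", 20, 26)},   # explorer extractor
    {("Marie Curie", 0, 11), ("Paris", 30, 35)},    # second explorer pass
]
# "radium" gets only one vote and is dropped as a likely spurious mention.
print(sorted(consensus_spans(runs)))  # [('Marie Curie', 0, 11), ('Paris', 30, 35)]
```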

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same archetype-mediation pattern could be tested on zero-shot relation extraction or event extraction where schema drift is also common.
  • If the archetypes prove reusable across multiple extraction tasks, they might serve as a lightweight alignment layer for broader LLM-based information extraction pipelines.
  • A natural next measurement would be how sensitive final accuracy is to the particular choice or granularity of the ontological abstractions used to define the archetypes.
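The granularity question in the last bullet has a simple mechanical face: the coarser the archetypes, the more target types compete under each one at calibration time. The mapping tables below are invented placeholders, not results from the paper.

```python
from collections import defaultdict

def ambiguity(schema_to_archetype):
    """Average number of target types competing under each archetype.

    Coarser archetypes mean more competition that the calibration
    stage must resolve, so this is one crude sensitivity handle."""
    buckets = defaultdict(list)
    for target_type, archetype in schema_to_archetype.items():
        buckets[archetype].append(target_type)
    return sum(len(v) for v in buckets.values()) / len(buckets)

fine   = {"musicalartist": "PERFORMER", "scientist": "RESEARCHER",
          "band": "MUSIC-GROUP", "university": "INSTITUTION"}
coarse = {"musicalartist": "PERSON-LIKE", "scientist": "PERSON-LIKE",
          "band": "ORG-LIKE", "university": "ORG-LIKE"}
print(ambiguity(fine), ambiguity(coarse))  # 1.0 2.0
```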

Load-bearing premise

An intermediate domain-invariant archetype space distilled from high-level ontological abstractions can be projected and calibrated without introducing new semantic misalignment when target schemas are novel or overlapping.

What would settle it

On a held-out cross-domain NER test set with deliberately novel or heavily overlapping label definitions, measure whether SAM-NER's F1 scores fall below those of a strong direct-LLM-prompting baseline.
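The measurement proposed above reduces to strict span-level micro-F1, computed over (span, type) pairs. The gold/prediction sets below are toy data, not results from the paper.

```python
def span_micro_f1(gold, pred):
    """Micro-averaged precision/recall/F1 over exact (span, type) pairs,
    the usual strict NER metric."""
    gold, pred = set(gold), set(pred)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = {(("Turing", 0, 6), "scientist"), (("Bletchley Park", 20, 34), "location")}
pred = {(("Turing", 0, 6), "scientist"), (("Enigma", 40, 46), "misc")}
print(span_micro_f1(gold, pred))  # (0.5, 0.5, 0.5)
```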

Figures

Figures reproduced from arXiv: 2605.03706 by Boyan Xu, Juntao Gan, Miao Mai, Ruichu Cai, Zhifeng Hao.

Figure 1. Comparison of different ZS-NER methods. LLM-based ZS-NER approaches can be broadly categorized into two paradigms: (i) instruction-based structured extraction via natural language constraints, and (ii) retrieval-augmented inference (RAG) that incorporates external evidence during generation or verification. Instruction-based methods reformulate entity specifications and task constraints into prompts that gu…

Figure 2. The overall framework. In Stage 1, extracted entities are used to annotate sentences. In Stage 2, archetype mediation types are assigned to the marked entities. In Stage 3, the mediation types are refined into target-domain types based on type definitions. This approach decomposes NER into multiple interactive agents and enhances robustness in zero-shot or low-resource scenarios through structured interactions…

Figure 3. Impact of the CCR strategy on precision and recall in different domains. The strategy effectively mitigates the spurious mention activations typical of an explorer extractor trained on silver-standard data, which often misidentifies generic functional terms as entity mentions. As illustrated in…

Figure 5. Anchor Extractor Prompt. Guiding the model to identify and extract entities from text following predefined…

Figure 6. Explorer Extractor Prompt. Guiding the model to identify and extract entities from text following…

Figure 7. Archetype Classifier Prompt. Guiding the model to assign semantically broad types to annotated entities…

Figure 8. Type Calibrator Prompt. Guiding the model to calibrate the abstract types assigned to entities to…
read the original abstract

Zero-shot Named Entity Recognition (ZS-NER) remains brittle under domain and schema shifts, where unseen label definitions often misalign with a large language model's (LLM's) intrinsic semantic organization. As a result, directly mapping entity mentions to fine-grained target labels can induce systematic semantic drift, especially when target schemas are novel or semantically overlapping. We propose \textbf{SAM-NER}, a three-stage framework based on \emph{Semantic Archetype Mediation} that stabilizes cross-domain transfer through an intermediate, domain-invariant archetype space. SAM-NER: (i) performs \emph{Entity Discovery} via cooperative extraction and consensus-based denoising to obtain high-coverage, high-fidelity entity spans; (ii) conducts \emph{Abstract Mediation} by projecting entities into a compact set of universal semantic archetypes distilled from high-level ontological abstractions; and (iii) applies \emph{Semantic Calibration} to resolve archetype-level predictions into target-domain types through constrained, definition-aligned inference with a frozen LLM. Experiments on the CrossNER benchmark show that SAM-NER consistently outperforms strong prior ZS-NER baselines in cross-domain settings. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/SAM-NER.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated author's rebuttal, a circularity audit, and an axiom & free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes SAM-NER, a three-stage framework for zero-shot named entity recognition (ZS-NER) that stabilizes cross-domain transfer via an intermediate domain-invariant archetype space. The stages are (i) Entity Discovery through cooperative extraction and consensus denoising, (ii) Abstract Mediation by projecting entities into a compact set of universal semantic archetypes distilled from high-level ontological abstractions, and (iii) Semantic Calibration that resolves archetype predictions into target-domain types via constrained LLM inference. The central empirical claim is that SAM-NER consistently outperforms strong prior ZS-NER baselines on the CrossNER benchmark in cross-domain settings.

Significance. If the reported gains hold and can be attributed specifically to the archetype mediation layer rather than LLM priors or the discovery stage, the work would offer a structured approach to mitigating semantic drift under schema shifts by leveraging ontological abstractions as an invariant bridge. This could meaningfully extend existing ZS-NER methods that rely on direct mapping or prompt engineering, particularly for novel or overlapping target schemas.

major comments (3)
  1. [§3.2 Abstract Mediation] The manuscript asserts that the archetype space is domain-invariant and distilled from high-level ontological abstractions, yet provides no quantitative verification of this invariance, such as archetype assignment entropy, Jaccard overlap of recovered archetypes, or projection fidelity metrics across the different CrossNER domains. Without such checks, it is unclear whether the reported outperformance stems from the claimed mediation or from other pipeline components.
  2. [Experiments and abstract] The central claim of consistent outperformance on CrossNER lacks reported details on baseline implementations, exact metrics (e.g., micro/macro F1), statistical significance tests, variance across runs, or ablation studies isolating the contribution of each stage. This makes it impossible to evaluate whether the three-stage pipeline is necessary or whether gains could arise from the Entity Discovery stage alone.
  3. [§3.3 Semantic Calibration] The calibration step is described as resolving archetype predictions to novel/overlapping target schemas via definition-aligned inference, but the manuscript does not address or measure potential new semantic misalignment introduced during this projection, which is the load-bearing assumption for handling schema shifts.
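The diagnostics named in major comment 1 are cheap to compute once per-domain archetype assignments are available. The two domain samples below are invented for illustration, not numbers from the paper.

```python
import math
from collections import Counter

def assignment_entropy(archetype_labels):
    """Shannon entropy (bits) of the archetype assignment distribution
    in one domain; stable values across domains suggest invariance."""
    counts = Counter(archetype_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def jaccard(a, b):
    """Jaccard overlap of the archetype sets recovered in two domains."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

music   = ["PERSON-LIKE", "PERSON-LIKE", "ORG-LIKE", "WORK-LIKE"]
science = ["PERSON-LIKE", "ORG-LIKE", "ORG-LIKE", "SUBSTANCE-LIKE"]
print(round(assignment_entropy(music), 3))  # 1.5
print(jaccard(music, science))              # 0.5
```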
minor comments (2)
  1. The abstract promises that the implementation will be open-sourced at a GitHub repository, but the main text gives no release details or timeline; confirming these would aid reproducibility.
  2. Notation for the archetype set and projection function is introduced without a clear mathematical definition or diagram; a formal notation section or figure would improve clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback, which highlights important areas for strengthening the validation and presentation of SAM-NER. We address each major comment below and will incorporate revisions to improve the manuscript's rigor and clarity.

read point-by-point responses
  1. Referee: [§3.2 Abstract Mediation] The manuscript asserts that the archetype space is domain-invariant and distilled from high-level ontological abstractions, yet provides no quantitative verification of this invariance, such as archetype assignment entropy, Jaccard overlap of recovered archetypes, or projection fidelity metrics across the different CrossNER domains. Without such checks, it is unclear whether the reported outperformance stems from the claimed mediation or from other pipeline components.

    Authors: We agree that quantitative verification of domain-invariance would better support the claims in §3.2. The archetype space is constructed from high-level ontological abstractions to serve as a domain-invariant bridge, but the original submission did not report explicit cross-domain metrics. In the revised manuscript, we will add analyses including archetype assignment entropy across CrossNER domains, Jaccard overlap of recovered archetypes, and projection fidelity metrics to demonstrate that the mediation layer contributes to performance beyond the entity discovery stage alone. revision: yes

  2. Referee: [Experiments and abstract] The central claim of consistent outperformance on CrossNER lacks reported details on baseline implementations, exact metrics (e.g., micro/macro F1), statistical significance tests, variance across runs, or ablation studies isolating the contribution of each stage. This makes it impossible to evaluate whether the three-stage pipeline is necessary or whether gains could arise from the Entity Discovery stage alone.

    Authors: We acknowledge that greater experimental transparency is required to substantiate the central claims and isolate component contributions. The submitted version reports aggregate outperformance but omits details on baseline re-implementations, the precise metric (micro-F1), statistical tests, variance, and full ablations. We will expand the Experiments section and abstract to specify baseline implementations, confirm micro-F1 as the primary metric (with macro-F1 in the appendix), include statistical significance tests (e.g., paired t-tests), report standard deviations over multiple runs, and provide ablation studies that remove each stage individually to show the necessity of the full pipeline. revision: yes

  3. Referee: [§3.3 Semantic Calibration] The calibration step is described as resolving archetype predictions to novel/overlapping target schemas via definition-aligned inference, but the manuscript does not address or measure potential new semantic misalignment introduced during this projection, which is the load-bearing assumption for handling schema shifts.

    Authors: We recognize the value of explicitly addressing potential semantic misalignment introduced during calibration. The constrained definition-aligned inference is designed to resolve archetypes to target schemas while limiting drift, yet the original manuscript does not quantify or discuss this risk. In the revision, we will expand §3.3 with a dedicated discussion of the assumption and add empirical measurements, such as pre- and post-calibration embedding similarity or targeted error analysis on misalignment cases, to validate its effectiveness under schema shifts. revision: yes
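The significance testing promised in response 2 amounts to a paired test over per-domain scores. A minimal stand-in, with invented F1 numbers rather than the paper's results (a real analysis would use `scipy.stats.ttest_rel` to also get a p-value):

```python
import math

def paired_t(xs, ys):
    """Paired t statistic over per-domain F1 scores for two systems."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

sam_ner  = [0.62, 0.58, 0.66, 0.61, 0.59]   # hypothetical per-domain F1
baseline = [0.57, 0.55, 0.60, 0.58, 0.56]
print(round(paired_t(sam_ner, baseline), 2))  # 6.32
```

With 4 degrees of freedom, a t statistic this large would be significant at conventional thresholds, which is the shape of evidence the rebuttal commits to reporting.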

Circularity Check

0 steps flagged

No circularity in SAM-NER's empirical framework

full rationale

The paper presents SAM-NER as a three-stage empirical pipeline (Entity Discovery, Abstract Mediation, Semantic Calibration) whose value is asserted through benchmark outperformance on CrossNER. No equations, fitted parameters, self-referential derivations, or load-bearing self-citations appear in the abstract or described method. The archetype space is introduced as an intermediate construct distilled from ontological abstractions, but its stability is not derived from prior results by construction; claims rest on external experimental comparison rather than reducing to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Based solely on the abstract, no explicit free parameters or mathematical axioms can be identified; the 'universal semantic archetypes' constitute a new conceptual construct whose construction details and independent validation are not provided.

invented entities (1)
  • universal semantic archetypes no independent evidence
    purpose: Provide a compact, domain-invariant intermediate representation that bridges entity mentions to target labels
    Introduced as distilled from high-level ontological abstractions; no independent evidence or falsifiable handle is stated in the abstract.

pith-pipeline@v0.9.0 · 5526 in / 1163 out tokens · 50764 ms · 2026-05-07T16:37:29.872287+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

27 extracted references · 12 canonical work pages · 1 internal anchor
