Team Fusion@ SU@ BC8 SympTEMIST track: transformer-based approach for symptom recognition and linking

Georgi Grazhdanski; Ivan Koychev; Svetla Boytcheva; Sylvia Vassileva

arxiv: 2604.06424 · v1 · submitted 2026-04-07 · 💻 cs.CL · cs.AI

Pith reviewed 2026-05-10 19:14 UTC · model grok-4.3

classification 💻 cs.CL cs.AI

keywords symptom recognition · entity linking · transformer · RoBERTa · SapBERT · knowledge base · named entity recognition · medical NLP

The pith

The choice of knowledge base has the highest impact on accuracy for transformer-based symptom recognition and linking

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper describes a transformer-based pipeline for detecting symptoms in text and linking them to medical concepts. It fine-tunes a RoBERTa model with BiLSTM and CRF layers on an augmented dataset to recognize symptom mentions. Candidate links are generated using SapBERT embeddings and ranked by cosine similarity to entries in a knowledge base. The authors report that the selection of which knowledge base to use has the largest effect on the system's accuracy. Such a method could support better extraction of symptom data from electronic health records and medical literature.

Core claim

The presented approach fine-tunes a RoBERTa-based token-level classifier augmented with BiLSTM and CRF layers on an augmented training set for symptom named entity recognition. Entity linking is achieved by generating candidates with the cross-lingual SapBERT XLMR-Large model and computing cosine similarity against a knowledge base. The choice of knowledge base has the highest impact on model accuracy.
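To make the role of the CRF layer concrete, here is a minimal sketch of Viterbi decoding, the inference step a CRF layer typically performs on top of per-token scores. It is not the authors' implementation: the BIO tag set, the toy emission scores (standing in for the RoBERTa-BiLSTM outputs), the transition table, and the function name `viterbi_decode` are all assumptions made for illustration.

```python
from typing import Dict, List

NEG_INF = float("-inf")


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]  : score of `tag` at token t (toy values here)
    transitions[a][b]  : score of moving from tag a to tag b;
                         a missing pair is treated as forbidden
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0].get(tag, NEG_INF) for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev_score, prev_tag = max(
                (best[-1][prev] + transitions.get(prev, {}).get(cur, NEG_INF), prev)
                for prev in tags
            )
            scores[cur] = prev_score + emissions[t].get(cur, NEG_INF)
            pointers[cur] = prev_tag
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Assumed BIO scheme: O -> I is forbidden (an entity must start with B).
TAGS = ["O", "B", "I"]
TRANSITIONS = {
    "O": {"O": 0.0, "B": 0.0},
    "B": {"O": 0.0, "B": 0.0, "I": 0.0},
    "I": {"O": 0.0, "B": 0.0, "I": 0.0},
}
# Toy emission scores for three tokens; a per-token argmax would give O, I, I.
EMISSIONS = [
    {"O": 2.0, "B": 0.5, "I": 0.0},
    {"O": 0.2, "B": 1.0, "I": 1.5},
    {"O": 0.1, "B": 0.2, "I": 1.0},
]

if __name__ == "__main__":
    print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

In this toy case a per-token argmax would produce the invalid sequence O, I, I, while decoding with the transition constraint yields a well-formed span. That is the usual motivation for adding a CRF on top of a token classifier.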

What carries the argument

SapBERT-based candidate generation followed by cosine similarity matching to a knowledge base, which controls the precision of symptom entity linking after initial recognition by the RoBERTa-BiLSTM-CRF model
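The ranking step can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the three-dimensional vectors stand in for SapBERT embeddings, the `KB:000x` identifiers are placeholders rather than real concept codes, and `cosine` and `rank_candidates` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(
    mention_vec: Sequence[float],
    kb: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every knowledge-base entry against the mention, best match first."""
    scored = [(code, cosine(mention_vec, vec)) for code, vec in kb.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Toy knowledge base: placeholder codes mapped to made-up embedding vectors.
TOY_KB = {
    "KB:0001": [1.0, 0.0, 0.0],
    "KB:0002": [0.0, 1.0, 0.0],
    "KB:0003": [0.7, 0.7, 0.0],
}
TOY_MENTION = [0.9, 0.1, 0.0]  # stand-in for the embedding of a recognized mention

if __name__ == "__main__":
    for code, score in rank_candidates(TOY_MENTION, TOY_KB):
        print(f"{code}\t{score:.3f}")
```

Because a mention can only be linked to an entry that exists in `kb`, swapping the knowledge base changes the set of reachable answers, which is one plausible reason the choice of knowledge base matters so much.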

If this is right

Augmenting the training set improves coverage for symptom named entity recognition.
Cosine similarity on SapBERT embeddings provides an effective way to rank knowledge base candidates for linking.
Transformer models can be adapted effectively for medical symptom tasks through fine-tuning and additional sequence layers.
The overall performance depends more on knowledge base selection than on other components of the pipeline.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Curating or selecting appropriate medical knowledge bases may yield greater returns for clinical NLP applications than optimizing embedding models alone.
Similar pipelines could apply to linking other clinical entities like medications or diagnoses by using domain-specific knowledge bases.
Testing the approach on real-world clinical notes with varying symptom terminology would validate its robustness beyond the shared task data.

Load-bearing premise

The augmented training set sufficiently covers the distribution of symptoms in unseen test data, and cosine similarity on SapBERT embeddings reliably identifies the correct entity links.

What would settle it

Evaluating the model on test data containing symptoms not represented in the augmented training set, or using a knowledge base with mismatched terminology, would reveal whether accuracy holds or declines.
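One way to run the first of these checks is to split test mentions by whether their surface form appeared in training and report linking accuracy per bucket. The sketch below assumes exact match after lowercasing and whitespace normalization as the definition of "seen". The mentions, the placeholder `KB:000x` codes, and the function names are illustrative and not taken from the paper or the shared-task data.

```python
from typing import Dict, Iterable, Optional, Tuple


def normalize(mention: str) -> str:
    """Lowercase and collapse whitespace so trivially different forms match."""
    return " ".join(mention.lower().split())


def accuracy_by_exposure(
    train_mentions: Iterable[str],
    test_items: Iterable[Tuple[str, str, str]],
) -> Dict[str, Optional[float]]:
    """Linking accuracy for test mentions seen vs. unseen in training.

    test_items holds (mention, gold_code, predicted_code) triples.
    A bucket with no items gets None instead of a score.
    """
    seen = {normalize(m) for m in train_mentions}
    counts = {"seen": [0, 0], "unseen": [0, 0]}  # [correct, total]
    for mention, gold, pred in test_items:
        bucket = "seen" if normalize(mention) in seen else "unseen"
        counts[bucket][1] += 1
        counts[bucket][0] += int(gold == pred)
    return {
        bucket: (correct / total if total else None)
        for bucket, (correct, total) in counts.items()
    }


# Toy data, made up for illustration only.
TOY_TRAIN = ["fever", "dry cough"]
TOY_TEST = [
    ("Fever", "KB:0001", "KB:0001"),
    ("dry  cough", "KB:0002", "KB:0002"),
    ("night sweats", "KB:0004", "KB:0004"),
    ("chest tightness", "KB:0005", "KB:0006"),
]

if __name__ == "__main__":
    print(accuracy_by_exposure(TOY_TRAIN, TOY_TEST))
```

A large gap between the two buckets would suggest that the augmented training set does not cover the test distribution well.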

Original abstract

This paper presents a transformer-based approach to solving the SympTEMIST named entity recognition (NER) and entity linking (EL) tasks. For NER, we fine-tune a RoBERTa-based token-level classifier with BiLSTM and CRF layers on an augmented train set. Entity linking is performed by generating candidates using the cross-lingual SapBERT XLMR-Large, and calculating cosine similarity against a knowledge base. The choice of knowledge base proves to have the highest impact on model accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A competent but conventional system paper for a medical shared task, with the KB choice finding being the main takeaway.

This paper reports on a transformer pipeline for the SympTEMIST shared task, covering symptom named entity recognition and entity linking. For NER they fine-tune RoBERTa with added BiLSTM and CRF layers on an augmented training set. For linking they use SapBERT embeddings and pick the best match by cosine similarity against entries in a knowledge base. Their main observation is that the choice of knowledge base had the biggest effect on final accuracy.

What the paper does well is lay out a clear, reproducible system that combines standard components and shows the practical value of data augmentation and careful KB selection in a medical domain. The finding on KB impact is consistent with how entity linking usually works, since coverage and alignment with the target vocabulary drive most of the performance.

Nothing here is new in terms of methods or theory. It's an application of well-known techniques to a new dataset and task. The abstract gives no performance figures, which makes it hard to judge the strength of their results or the size of the KB effect without the full paper. The main soft spot is that we don't see ablations or error analysis here, so it's difficult to verify whether the augmentation actually helped or if the test data distribution was well covered. But the overall approach looks sound and free of obvious internal problems.

This kind of work is useful for shared-task participants and as a reference baseline for medical NER/EL systems. Readers who need a starting point for similar problems will get value from the implementation details. It is worth sending to peer review for the track proceedings, as the system is competently described and the empirical claim holds up on its own terms.

Referee Report

1 major / 2 minor

Summary. The manuscript describes a transformer-based pipeline for the SympTEMIST shared task on symptom NER and entity linking. NER is performed by fine-tuning a RoBERTa model augmented with BiLSTM and CRF layers on an augmented training set. Entity linking generates candidates via cross-lingual SapBERT (XLMR-Large) embeddings and ranks them by cosine similarity against a knowledge base. The central empirical claim is that the choice of knowledge base produces the largest accuracy delta among the components tested during system development.

Significance. If the ablation-style comparisons are reproducible and the test-set results hold, the work supplies a concrete, task-specific demonstration that KB coverage and alignment dominate performance in biomedical symptom linking. This is consistent with broader EL literature and could usefully inform resource selection for similar low-resource medical NER/EL settings. The pipeline itself is standard; the value lies in the reported sensitivity ranking rather than architectural novelty.

major comments (1)

The abstract and system description assert that KB choice has the highest impact, yet no accuracy figures, delta values, or ablation table are supplied to quantify this ranking relative to other design choices (e.g., augmentation strategy or CRF layer). Without these numbers the central empirical claim cannot be verified or compared to prior SympTEMIST submissions.

minor comments (2)

The augmentation procedure for the training set is mentioned but not detailed (size, method, or coverage statistics); adding this information would improve reproducibility.
Consider including a brief error analysis or example of KB-induced linking failures to illustrate why the chosen KB outperforms alternatives.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We agree that the central claim requires explicit quantitative support and will revise the manuscript accordingly.

Referee: The abstract and system description assert that KB choice has the highest impact, yet no accuracy figures, delta values, or ablation table are supplied to quantify this ranking relative to other design choices (e.g., augmentation strategy or CRF layer). Without these numbers the central empirical claim cannot be verified or compared to prior SympTEMIST submissions.

Authors: We acknowledge that the manuscript currently states the KB choice has the highest impact without providing the supporting ablation numbers or table. In the revised version we will add a dedicated ablation subsection (or table) that reports exact accuracy scores and deltas for the full pipeline versus variants that remove augmentation, remove the CRF layer, or swap knowledge bases. These numbers will directly substantiate the ranking and enable comparison with other SympTEMIST submissions.

Revision: yes
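The delta table discussed in this exchange is simple to compute once the scores exist. The sketch below is only a template: the scores are placeholders and not results from the paper, the variant names are deliberately generic, and `ablation_deltas` is a hypothetical helper name. In practice each variant would correspond to one change, such as removing augmentation, removing the CRF layer, or swapping the knowledge base.

```python
from typing import Dict, List, Tuple


def ablation_deltas(
    full_score: float,
    variant_scores: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (variant, full_score - variant_score), largest absolute change first."""
    deltas = {name: full_score - score for name, score in variant_scores.items()}
    return sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)


# Placeholder numbers for illustration only; NOT taken from the paper.
FULL_PIPELINE = 0.60
VARIANTS = {
    "variant_a": 0.58,
    "variant_b": 0.50,
    "variant_c": 0.59,
}

if __name__ == "__main__":
    for name, delta in ablation_deltas(FULL_PIPELINE, VARIANTS):
        print(f"{name}\t{delta:+.3f}")
```

The variant at the top of the sorted list is the component whose change moves the score most, which is the kind of evidence the referee asks for.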

Circularity Check

0 steps flagged

No significant circularity identified

The paper describes an empirical pipeline for NER (RoBERTa + BiLSTM-CRF on augmented data) and EL (SapBERT embeddings + cosine similarity to KB) in the SympTEMIST shared task. The central claim that KB choice has the highest impact on accuracy is presented as an observation from ablation-style comparisons during system development, with no equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations. No self-definitional loops, ansatzes smuggled via citation, or renaming of known results occur; the work is a standard applied description of model choices and empirical results, fully self-contained without reducing any claim to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are present; the paper is an applied system description without theoretical components or new postulated constructs.

