Improving Term Extraction with Terminological Resources
1 Pith paper cites this work.
abstract
Studies of different term extractors on a biomedical corpus revealed decreasing performance when applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to improve the quality of the extraction. The tool we implemented exploits attested terms at different steps of the process: chunking, parsing and extraction of term candidates. Experiments reported here show that, using this method, more term candidates can be acquired with a higher level of reliability. We further describe the extraction process involving endogenous disambiguation implemented in the term extractor YaTeA.
fields
cs.CL (1)
years
2026 (1)
verdicts
UNVERDICTED (1)
representative citing papers
Peacemaker at ATE-IT: Automatic term extraction from Italian text for waste management data using encoder model
An encoder-based model is fine-tuned for automatic term extraction on Italian waste management texts and reports balanced type-level and micro-level F1 scores in a shared task.