Qwant Research @DEFT 2019: Document matching and information retrieval using clinical cases

Christophe Servan; Estelle Maudet; Maureen de Seyssel; Oralie Cattan

arxiv: 1907.05790 · v1 · pith:MNRMC7LYnew · submitted 2019-07-06 · 💻 cs.CL · cs.IR · cs.LG · stat.ML

Estelle Maudet, Oralie Cattan, Maureen de Seyssel, Christophe Servan

Pith reviewed 2026-05-25 01:46 UTC · model grok-4.3

classification 💻 cs.CL · cs.IR · cs.LG · stat.ML

keywords clinical cases · semantic similarity · information extraction · French medical text · language models · neural networks · linguistic analysis · DEFT 2019

The pith

Language models and hybrid neural-linguistic methods achieve encouraging accuracy on semantic similarity and information extraction from French clinical cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports Qwant Research participation in tasks 2 and 3 of the DEFT 2019 challenge focused on French clinical cases. For task 2, semantic similarity between cases and discussions, the authors apply language models and test the effects of different preprocessings and matching techniques. For task 3, information extraction, they compare an approach using only neural networks against one that adds linguistic analysis. The extraction system is described as delivering encouraging accuracy results on the challenge data.
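The task 2 setup, pairing each clinical case with its most similar discussion, can be illustrated with a minimal sketch. This is not the authors' system: the paper uses language models, whereas this example swaps in a plain TF-IDF bag-of-words representation with cosine similarity so that it stays self-contained. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `match_cases`), the whitespace tokenizer, and the toy French sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real French preprocessing.
    return text.lower().split()


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dict term -> weight) with a smoothed IDF.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_cases(cases, discussions):
    # For each case, return the index of the most similar discussion.
    vectors = tfidf_vectors(cases + discussions)
    case_vecs = vectors[:len(cases)]
    disc_vecs = vectors[len(cases):]
    return [
        max(range(len(disc_vecs)), key=lambda j: cosine(case_vec, disc_vecs[j]))
        for case_vec in case_vecs
    ]


# Toy data, invented for this example only.
cases = [
    "patient avec fracture du fémur",
    "tumeur du rein gauche",
]
discussions = [
    "les tumeurs du rein sont rares",
    "la fracture du fémur nécessite une chirurgie",
]
matches = match_cases(cases, discussions)
print(matches)
```

Each case is matched independently here, so two cases may select the same discussion; a one-to-one assignment would be a different matching technique and is not shown.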

Core claim

An information extraction system for French clinical cases, implemented in both a neural-network-only version and a version that incorporates linguistic analysis, yields encouraging accuracy on DEFT 2019 task 3, while language-model approaches are used to address semantic similarity matching in task 2.

What carries the argument

Language models for semantic similarity matching paired with dual neural-network and linguistic-analysis pipelines for information extraction.

Load-bearing premise

The DEFT 2019 clinical-case datasets and evaluation metrics are representative enough of real French medical text processing needs for the reported accuracy to indicate useful progress.

What would settle it

Accuracy measurements on an independent collection of French clinical cases and discussions outside the DEFT 2019 set that fall substantially below the reported levels.

Original abstract

This paper reports on Qwant Research contribution to tasks 2 and 3 of the DEFT 2019's challenge, focusing on French clinical cases analysis. Task 2 is a task on semantic similarity between clinical cases and discussions. For this task, we propose an approach based on language models and evaluate the impact on the results of different preprocessings and matching techniques. For task 3, we have developed an information extraction system yielding very encouraging results accuracy-wise. We have experimented two different approaches, one based on the exclusive use of neural networks, the other based on a linguistic analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a standard shared-task system report applying off-the-shelf models to French clinical cases with no new methods or detailed results.

This paper is Qwant Research's write-up of their DEFT 2019 submissions on French clinical cases. Task 2 covers semantic similarity between cases and discussions using language models plus different preprocessings and matching steps. Task 3 is information extraction, where they run one neural-only system and one that adds linguistic analysis, and they call the accuracy encouraging. That is the full extent of what is new: they took existing tools and ran them on the shared-task French data. The experiments on preprocessing effects for task 2 are a reasonable check, and comparing neural versus linguistic routes for extraction is a normal thing to try in a low-resource setting like French medical text. The paper does not claim any new architecture or derivation. The soft spot is that the abstract and description give no numbers, no baselines, no ablations, and no error analysis, so the reader cannot tell whether the results are competitive or practically useful. The work stays within the shared-task data and metrics without discussing how representative those are of actual clinical notes. This kind of report is mainly useful to people already tracking French clinical NLP benchmarks or the specific DEFT tasks. It does not contain enough substance or novelty to justify sending it out for serious peer review at a general venue. It belongs in the shared-task proceedings if anywhere.

Referee Report

1 major / 0 minor

Summary. The paper reports Qwant Research's participation in DEFT 2019 tasks 2 and 3 on French clinical case analysis. For task 2 (semantic similarity between clinical cases and discussions), the authors propose language-model approaches and evaluate the impact of different preprocessings and matching techniques. For task 3 (information extraction), they describe two systems—one based exclusively on neural networks and one on linguistic analysis—claiming that the developed IE system yielded very encouraging accuracy results.

Significance. If the accuracy claims for task 3 are substantiated, the work supplies a direct comparison of neural versus linguistic methods on French clinical text, which could serve as a useful reference point for shared-task participants and for French medical NLP more broadly. The task-2 experiments on preprocessing and matching choices may also help isolate which factors matter most for semantic similarity in this domain.

major comments (1)

[Abstract] The central claim, that the information extraction system yields 'very encouraging results accuracy-wise', is presented without any numerical accuracy figures, baselines, error bars, ablation results, or statistical significance tests, so readers cannot assess whether the results advance the state of the art.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on our manuscript describing our participation in the DEFT 2019 shared task. We address the single major comment below and will revise the manuscript accordingly.

Referee: [Abstract] The central claim, that the information extraction system yields 'very encouraging results accuracy-wise', is presented without any numerical accuracy figures, baselines, error bars, ablation results, or statistical significance tests, so readers cannot assess whether the results advance the state of the art.

Authors: We agree that the abstract would be strengthened by the inclusion of concrete numerical results to support the claim. The body of the paper reports the accuracy figures for both the neural and linguistic IE systems on task 3, along with comparisons to the other participating systems. To address the referee's concern, we will revise the abstract to include the key accuracy scores achieved by our best system and a brief indication of how it compared to the shared-task baselines and other submissions. This change will allow readers to immediately gauge the performance without needing to consult the full text.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper is a straightforward empirical system report on participation in the DEFT 2019 shared task. It describes two approaches (language-model matching for task 2; neural-network and linguistic IE for task 3) and reports accuracy on the provided benchmark data. No equations, parameter-fitting steps, derivations, or self-citations appear in the text; all claims are direct experimental outcomes on external data, so no load-bearing step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical models, free parameters, axioms, or invented entities are introduced; the work is an applied system description for a shared task.

pith-pipeline@v0.9.0 · 5641 in / 906 out tokens · 19657 ms · 2026-05-25T01:46:40.516495+00:00 · methodology
