DEBISS: a Corpus of Individual, Semi-structured and Spoken Debates

Cláudio E. C. Campelo; David Eduardo Pereira; Klaywert Danillo Ferreira de Souza; Larissa Lucena Vasconcelos

arxiv: 2603.05459 · v1 · submitted 2026-03-05 · 💻 cs.CL · cs.DB

Pith reviewed 2026-05-15 16:21 UTC · model grok-4.3

keywords: DEBISS corpus · spoken debates · argument mining · speaker diarization · debate corpus · NLP annotations · semi-structured debates

The pith

DEBISS is a corpus of spoken individual debates annotated for speech-to-text, speaker diarization, argument mining and debater quality assessment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the DEBISS corpus as a collection of spoken debates carried out by single participants in a semi-structured format. It supplies annotations across several standard NLP tasks including converting speech to text, identifying who is speaking, extracting arguments, and rating the quality of each debater. The work responds to the limited number of existing debate resources that cover the range of formats debates take in daily life, education, work and media. A reader would care because debates occur constantly in conversation and online, yet most current tools rely on written text or highly formal televised exchanges.

Core claim

The authors construct and release the DEBISS corpus, consisting of spoken and individual debates that incorporate semi-structured features together with annotations for speech-to-text transcription, speaker diarization, argument mining, and debater quality assessment.

What carries the argument

The DEBISS corpus, a set of recorded spoken debates by individuals with layered annotations for multiple NLP tasks.

If this is right

NLP systems can be trained to extract arguments directly from spoken debate recordings rather than from transcripts alone.
Speaker diarization models can be evaluated and improved using the explicit speaker labels provided in debate contexts.
Automated scoring of debater quality becomes feasible using the quality-assessment annotations as ground truth.
The corpus supports development of end-to-end pipelines that handle transcription, speaker tracking and argument analysis in one workflow.
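
As a rough illustration of what layered annotations in one workflow could look like, the sketch below defines a hypothetical record that carries all four annotation layers for a single debate. The abstract does not describe the DEBISS release format, so every class name, field name, speaker label, and value here is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One diarized stretch of speech (hypothetical schema, not the DEBISS format)."""
    start_s: float                        # segment start, in seconds
    end_s: float                          # segment end, in seconds
    speaker: str                          # diarization label (invented names below)
    text: str                             # speech-to-text transcript of the segment
    argument_label: Optional[str] = None  # e.g. "claim" or "premise"; None if unlabeled


@dataclass
class DebateRecord:
    """All annotation layers for one recorded debate (illustrative only)."""
    debate_id: str
    segments: List[Segment] = field(default_factory=list)
    quality_score: Optional[float] = None  # debater quality rating; the scale is assumed

    def speaking_time(self, speaker: str) -> float:
        """Total seconds attributed to one speaker by the diarization layer."""
        return sum(s.end_s - s.start_s for s in self.segments if s.speaker == speaker)

    def arguments(self) -> List[Segment]:
        """Segments that carry an argument-mining label."""
        return [s for s in self.segments if s.argument_label is not None]


# Invented example content; it does not come from the corpus.
record = DebateRecord(
    debate_id="example-001",
    segments=[
        Segment(0.0, 4.5, "moderator", "Please present your position."),
        Segment(4.5, 12.0, "debater_1", "Remote work should be the default.", "claim"),
        Segment(12.0, 20.0, "debater_1", "It reduces commuting time.", "premise"),
    ],
    quality_score=4.0,
)
```

Keeping the layers on the same time-aligned segments means a downstream step can query, for example, `record.speaking_time("debater_1")` or `record.arguments()` without re-joining separate annotation files.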

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The semi-structured format may allow researchers to isolate the effect of specific debate rules on argument quality or speaker behavior.
If the debates include varied topics, the corpus could serve as a test bed for domain-adaptation techniques in argument mining.
Release of the audio files alongside annotations opens the possibility of multimodal extensions once visual data are added.

Load-bearing premise

The selected debates represent typical real-world discussion patterns and the manual annotations for each task are consistent enough to serve as reliable training data for other researchers.

What would settle it

A controlled test in which models trained on the DEBISS annotations show no improvement over models trained on existing debate data when evaluated on a fresh collection of naturally occurring spoken debates.
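
One possible way to run such a comparison is a paired bootstrap over per-debate scores from the two models on the same held-out debates. The sketch below is an assumption about how the test could be set up, not a procedure taken from the paper; the function name, the score lists, and the resample count are all hypothetical.

```python
import random
from typing import Sequence


def paired_bootstrap(scores_a: Sequence[float],
                     scores_b: Sequence[float],
                     n_resamples: int = 10_000,
                     seed: int = 0) -> float:
    """Fraction of resamples in which model A's mean score exceeds model B's.

    scores_a[i] and scores_b[i] must be the two models' scores on the same
    held-out debate i (paired design).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        # Resample debates with replacement, keeping each pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        mean_diff = sum(scores_a[i] - scores_b[i] for i in idx) / n
        if mean_diff > 0:
            wins += 1
    return wins / n_resamples


# Placeholder numbers for illustration only; these are not results.
scores_model_a = [0.62, 0.58, 0.71, 0.66, 0.60]
scores_model_b = [0.60, 0.59, 0.65, 0.61, 0.60]
win_rate = paired_bootstrap(scores_model_a, scores_model_b, n_resamples=1_000)
```

A win rate close to 0.5 would be consistent with "no improvement", while a rate close to 1.0 would suggest that model A is reliably better on the fresh debates.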

Original abstract

The process of debating is essential in our daily lives, whether in studying, work activities, simple everyday discussions, political debates on TV, or online discussions on social networks. The range of uses for debates is broad. Due to the diverse applications, structures, and formats of debates, developing corpora that account for these variations can be challenging, and the scarcity of debate corpora in the state of the art is notable. For this reason, the current research proposes the DEBISS corpus: a collection of spoken and individual debates with semi-structured features. With a broad range of NLP task annotations, such as speech-to-text, speaker diarization, argument mining, and debater quality assessment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper introduces a new debate corpus but leaves out all the details on how it was built or checked.

The main takeaway is that this paper introduces the DEBISS corpus of spoken, individual, semi-structured debates annotated for speech-to-text, speaker diarization, argument mining, and debater quality assessment, yet it supplies none of the supporting details needed to evaluate the resource.

The contribution is new in targeting a specific format that combines spoken delivery with semi-structured elements and multi-task labels. Existing debate corpora tend to be either written or highly formal, so this corpus fills a noted gap for work in argument mining and spoken dialogue systems. The paper does a decent job of motivating the need for such data given the variety of debate formats in real life.

The soft spots are clear and central. The description gives no numbers on scale, no account of how the debates were gathered or transcribed, no annotation guidelines, and no measures of consistency such as inter-annotator agreement. Without those, there is no way to tell whether the corpus actually supports the listed NLP tasks or whether it can be reproduced. This is a serious problem for a resource paper: the central claim depends on the data being solid, but nothing is shown to back that up.

The paper would mainly interest NLP groups focused on argument mining or multimodal dialogue analysis who are looking for fresh debate material. A reader would get value only if the full methods section were present and convincing. I do not think this should go to peer review as is. The authors need to add the basic documentation on construction and quality before it can be seriously considered.

Referee Report

2 major / 1 minor

Summary. The paper proposes the DEBISS corpus: a collection of spoken, individual, semi-structured debates annotated for speech-to-text, speaker diarization, argument mining, and debater quality assessment, motivated by the scarcity of such resources in NLP.

Significance. If released with full documentation, the corpus would fill a gap in spoken and semi-structured debate data, enabling research on argument mining and spoken-language tasks that current written-only corpora cannot support.

major comments (2)

[Abstract] Abstract: no corpus size, total duration, number of debates, sourcing criteria, or participant details are stated, preventing any assessment of scale or representativeness.
[Data Collection and Annotation] Data collection and annotation sections: the manuscript supplies no protocol for recording, segmentation, annotation guidelines, inter-annotator agreement figures, or validation steps for any of the four claimed tasks, so the reliability of the annotations cannot be evaluated.

minor comments (1)

[Abstract] Abstract: the term 'individual debates' is ambiguous and should be defined relative to multi-speaker formats.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on the DEBISS corpus paper. We address each major comment below and will revise the manuscript to improve clarity and completeness.

Referee: [Abstract] Abstract: no corpus size, total duration, number of debates, sourcing criteria, or participant details are stated, preventing any assessment of scale or representativeness.

Authors: We agree that the abstract omits these key details, which are necessary for immediate assessment of scale and representativeness. Although the main text describes the corpus composition, sourcing from university participants on predefined topics, and debater demographics, these were not summarized in the abstract. In the revised manuscript we will expand the abstract to state the number of debates, total audio duration, sourcing criteria, and participant details.

revision: yes

Referee: [Data Collection and Annotation] Data collection and annotation sections: the manuscript supplies no protocol for recording, segmentation, annotation guidelines, inter-annotator agreement figures, or validation steps for any of the four claimed tasks, so the reliability of the annotations cannot be evaluated.

Authors: This is a fair criticism; the original manuscript indeed lacked sufficient methodological detail on these aspects. We will substantially expand the Data Collection and Annotation sections to include the recording protocol and equipment, segmentation procedures, full annotation guidelines for speech-to-text, speaker diarization, argument mining, and debater quality assessment, inter-annotator agreement metrics (e.g., percentage agreement and Cohen's kappa per task), and the validation steps used. These additions will allow readers to evaluate annotation reliability.

revision: yes
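
For readers unfamiliar with the two agreement metrics named in the rebuttal, the sketch below computes percentage agreement and Cohen's kappa for two annotators who labeled the same items. Kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The label names and the toy annotations are invented for illustration and do not come from the corpus.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Share of items on which the two annotators chose the same label."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("annotation lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both pick the same label independently.
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined (0/0), and returning 1.0 is a convention chosen here.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy annotations (invented): one label per segment from each annotator.
annotator_1 = ["claim", "claim", "premise", "none"]
annotator_2 = ["claim", "premise", "premise", "none"]

agreement = percent_agreement(annotator_1, annotator_2)  # 3 of 4 items match
kappa = cohens_kappa(annotator_1, annotator_2)           # (0.75 - 0.3125) / 0.6875
```

In this toy case the annotators agree on 75% of items, but kappa is about 0.64 because part of that agreement would be expected by chance. That gap is why reporting kappa alongside raw agreement is informative.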

Circularity Check

0 steps flagged

No circularity: paper introduces new corpus without derivations or self-referential reductions

The paper's central contribution is the proposal of the DEBISS corpus itself, described as a collection of spoken debates with annotations for speech-to-text, speaker diarization, argument mining, and debater quality assessment. No equations, fitted parameters, predictions, or derivation chains appear in the abstract or described full text. The claim does not reduce to prior work by construction, self-citation, or renaming; it is a direct resource contribution. This is the expected non-finding for a corpus paper whose value rests on the data and annotations rather than on any mathematical or predictive step.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations or fitted parameters are involved; the paper contributes a new dataset whose value rests on the assumption that the described collection and annotation process is sound.

pith-pipeline@v0.9.0 · 5431 in / 956 out tokens · 51899 ms · 2026-05-15T16:21:47.450040+00:00 · methodology
