Grammar as a Behavioral Biometric: Using Cognitively Motivated Grammar Models for Authorship Verification

Andrea Nini; Lukas Graner; Oren Halvani; Shunichi Ishihara; Sophie Titze; Valerio Gherardi

arXiv: 2403.08462 · v3 · submitted 2024-03-13 · cs.CL · cs.LG

Pith reviewed 2026-05-24 02:56 UTC · model grok-4.3

keywords: authorship verification, cognitive linguistics, grammar models, behavioral biometrics, likelihood ratio, text forensics, digital forensics, explainable methods

The pith

A cognitively motivated grammar model is reported to verify authorship more accurately than seven baselines, several of them neural, by computing a likelihood ratio called LambdaG.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reports that modeling an author's grammar according to Cognitive Linguistics principles and computing LambdaG (the ratio of how likely a text is under that author's grammar versus a reference population's grammar) outperforms seven baseline methods, including neural network approaches, on twelve datasets. This method treats grammar as a behavioral biometric unique to individuals. A sympathetic reader would care because it supplies a simpler, more interpretable alternative to complex black-box systems for determining whether two texts share an author in digital forensics contexts. The approach is also reported to be robust to minor changes in the reference population and to yield visualizations that support explainability.

Core claim

LambdaG is defined as the ratio of the likelihood of a document given the candidate author's grammar model to the likelihood given a reference population's grammar model. When the grammar models follow Cognitive Linguistics principles, this ratio delivers superior authorship verification performance across twelve datasets relative to seven baselines that include neural network-based methods. The paper states that the performance advantage arises because the method aligns with theories predicting that a person's grammar functions as a behavioral biometric.

What carries the argument

LambdaG, the ratio of likelihoods of a document under a candidate author's grammar model versus a reference population grammar model; it quantifies how distinctively the text fits the candidate's grammar.
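To make the ratio concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a bigram model with add-one smoothing as a stand-in for the n-gram models with Kneser-Ney smoothing mentioned in the paper passage quoted further down this page, and it assumes that a document-level score is obtained by summing the per-token log-ratios, each averaged over `r` reference models. The names `BigramModel` and `lambda_g`, and the toy token sequences, are hypothetical.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # sentence boundary symbols (illustrative choice)


class BigramModel:
    """Toy grammar model: bigram counts with add-one smoothing (an assumption)."""

    def __init__(self, sentences, vocab):
        self.vocab_size = len(vocab)
        self.bigrams = Counter()
        self.contexts = Counter()
        for sent in sentences:
            padded = [BOS] + list(sent) + [EOS]
            for prev, tok in zip(padded, padded[1:]):
                self.bigrams[(prev, tok)] += 1
                self.contexts[prev] += 1

    def logprob(self, prev, tok):
        # Add-one smoothing keeps unseen bigrams at a non-zero probability.
        num = self.bigrams[(prev, tok)] + 1
        den = self.contexts[prev] + self.vocab_size
        return math.log(num / den)


def lambda_g(sentences, author_model, reference_models):
    """Sum over tokens of the log-likelihood ratio, averaged over reference models."""
    r = len(reference_models)
    total = 0.0
    for sent in sentences:
        padded = [BOS] + list(sent) + [EOS]
        for prev, tok in zip(padded, padded[1:]):
            log_pa = author_model.logprob(prev, tok)
            total += sum(log_pa - ref.logprob(prev, tok)
                         for ref in reference_models) / r
    return total


# Toy data: the candidate author writes "do not", the reference population "don't".
author_texts = [["i", "do", "not", "V"], ["i", "do", "not", "V"]]
reference_texts = [["i", "don't", "V"], ["i", "don't", "V"]]
vocab = {tok for s in author_texts + reference_texts for tok in s} | {EOS}

author = BigramModel(author_texts, vocab)
references = [BigramModel(reference_texts, vocab)]

score_same = lambda_g([["i", "do", "not", "V"]], author, references)
score_diff = lambda_g([["i", "don't", "V"]], author, references)
print(score_same, score_diff)
```

In this toy setup a positive score means the questioned text fits the candidate's model better than the reference models, and a negative score means the opposite. Working in log space turns the product of per-token ratios into a sum and avoids numerical underflow.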

If this is right

Authorship verification in digital text forensics can rely on grammar models rather than high-complexity neural methods.
The method remains effective even when the reference population varies slightly in composition.
Interpretability improves because the grammar models support visualizations of verification decisions.
The technique rests on compatibility with Cognitive Linguistics predictions that grammar acts as a behavioral biometric.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Likelihood-ratio methods grounded in cognitive models of language could extend to verifying other stable individual traits in text beyond grammar.
The approach might be tested for robustness on very short documents or in languages with different grammatical structures.
Hybrid systems could combine LambdaG with non-grammar features while preserving the cognitive grounding.
The same modeling strategy might apply to related forensic tasks such as detecting text generated by language models.

Load-bearing premise

That cognitively motivated grammar models can be built to capture stable individual differences in authorship and that the resulting likelihood ratios validly indicate whether two texts share an author.

What would settle it

A new dataset or reference population composition where LambdaG fails to match or exceed the performance of the seven baselines, or where small reference-group changes cause large drops in accuracy.

Figures

Figures reproduced from arXiv: 2403.08462 by Andrea Nini, Lukas Graner, Oren Halvani, Shunichi Ishihara, Sophie Titze, Valerio Gherardi.

**Figure 2.** 95% Confidence Intervals for Accuracy. … uncalibrated. This means that although higher values of λG do correctly correspond to Y-cases, the scale of variation does not reflect the expectations of a perfectly calibrated system, where λG = 0 means an inconclusive result, a positive value suggests a Y-case, and a negative value suggests an N-case. When λG is turned into ΛG by fitting a logistic regression on tr…

**Figure 3.** Variation in Accuracy depending on the number of repetitions, …

**Figure 4.** The loss in Accuracy (top) and Cllr (bottom) results for cross-corpus comparison, i.e., evaluating on Base Corpus while using reference texts D_ref from Reference Corpus. Diagonal bold values denote the original Accuracy and Cllr, respectively. Darker shades denote a greater loss in performance.

**Figure 5.** The POSNoise algorithm. Details on the notation can be found in [53]. Authorship Verification constitutes a similarity detection problem in which the subject of the similarity determination is the language of the author rather than other document aspects such as the topic [53]. However, a large number of existing AV methods including [1, 23, 25, 29, 83, 84, 86, 95, 102] use features that are directly influ…
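The captions for Figures 2 and 4 refer to two ideas that a short example may help with: turning an uncalibrated score into a calibrated log-likelihood ratio by fitting a logistic regression, and scoring the result with Cllr, the log-likelihood-ratio cost. The Python sketch below illustrates both under stated assumptions. The gradient-descent fit, the toy scores and the function names `fit_calibration` and `cllr` are hypothetical and are not taken from the paper. The Cllr formula is the standard one, 0.5 · (mean of log2(1 + 1/LR) over same-author cases + mean of log2(1 + LR) over different-author cases).

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def cllr(llr_same, llr_diff):
    """Log-likelihood-ratio cost; inputs are natural-log likelihood ratios."""
    pen_same = sum(softplus(-l) for l in llr_same) / len(llr_same)
    pen_diff = sum(softplus(l) for l in llr_diff) / len(llr_diff)
    return 0.5 * (pen_same + pen_diff) / math.log(2)


def fit_calibration(scores, labels, lr=0.1, epochs=2000):
    """Fit a 1-D logistic regression by gradient descent (illustrative only).

    Returns a function mapping a raw score to a calibrated natural-log LR.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    # Logistic regression yields posterior log-odds; subtracting the prior
    # log-odds of the training set leaves the log-likelihood ratio.
    prior = sum(labels) / n
    prior_log_odds = math.log(prior / (1.0 - prior))
    return lambda s: a * s + b - prior_log_odds


# Toy raw scores: label 1 = same author (Y-case), 0 = different author (N-case).
same_scores = [2.0, 1.5, 3.0, 0.5, -0.2]
diff_scores = [-1.0, -2.5, 0.3, -0.8, -1.7]
scores = same_scores + diff_scores
labels = [1] * len(same_scores) + [0] * len(diff_scores)

to_llr = fit_calibration(scores, labels)
cllr_value = cllr([to_llr(s) for s in same_scores],
                  [to_llr(s) for s in diff_scores])
print(round(cllr_value, 3))
```

A system that always outputs a likelihood ratio of 1 (log-LR of 0) gets a Cllr of exactly 1, so values below 1 indicate that the calibrated scores carry useful information. In practice the calibration would be fitted on held-out data rather than on the cases being scored, as is done here only for brevity.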

Original abstract

Authorship Verification (AV) is a key area of research in digital text forensics, which addresses the fundamental question of whether two texts were written by the same person. Numerous computational approaches have been proposed over the last two decades in an attempt to address this challenge. However, existing AV methods often suffer from high complexity, low explainability and especially from a lack of clear scientific justification. We propose a simpler method based on modeling the grammar of an author following Cognitive Linguistics principles. These models are used to calculate $\lambda_G$ (LambdaG): the ratio of the likelihoods of a document given the candidate's grammar versus given a reference population's grammar. Our empirical evaluation, conducted on twelve datasets and compared against seven baseline methods, demonstrates that LambdaG achieves superior performance, including against several neural network-based AV methods. LambdaG is also robust to small variations in the composition of the reference population and provides interpretable visualizations, enhancing its explainability. We argue that its effectiveness is due to the method's compatibility with Cognitive Linguistics theories predicting that a person's grammar is a behavioral biometric.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LambdaG is a grammar-based AV method with broad empirical tests but thin methodological detail on model construction and statistical testing.

LambdaG uses cognitive grammar models to compute a likelihood ratio for authorship verification and reports better results than several neural baselines across twelve datasets. The formulation itself is new in this context, and the paper does a reasonable job showing robustness to reference population changes plus some visualizations for interpretability. That gives it a clearer scientific story than many black-box AV papers. The empirical scope is also wider than usual for this kind of work.

The soft spots are exactly where the abstract leaves things thin. There is almost no information on how the grammar models are built in practice, what features they actually use, or whether any statistical tests were run on the performance differences. Without those details it is hard to judge whether the claimed superiority is stable or whether the cognitive-linguistics motivation is doing real explanatory work rather than just post-hoc framing. The central assumption that individual grammar differences act as a reliable behavioral biometric still needs more direct evidence.

This paper is aimed at people in digital forensics and computational linguistics who want simpler, theory-linked alternatives to neural methods. A reader already working on linguistic features for attribution would get the most out of it. It deserves a serious referee because the idea is coherent and the test set is large enough to be worth checking, even though the current version would almost certainly need major revisions on methods and evaluation before publication.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes LambdaG, a method for authorship verification that constructs cognitively motivated grammar models for individual authors and computes the likelihood ratio λ_G of a document under the candidate grammar versus a reference population grammar. It reports that this approach outperforms seven baselines (including neural AV methods) across twelve datasets, is robust to small changes in the reference population, and provides interpretable visualizations, attributing effectiveness to compatibility with Cognitive Linguistics theories that treat grammar as a behavioral biometric.

Significance. If the reported empirical results hold under scrutiny, the work supplies a simpler, more explainable alternative to neural methods in digital text forensics while grounding the approach in established cognitive theories. The multi-dataset evaluation and explicit likelihood-ratio formulation are strengths that could support falsifiable follow-up work; the robustness claim to reference-population composition is also a concrete, testable contribution.

major comments (1)

[Experimental Evaluation] Experimental section: the central claim of superior performance is load-bearing, yet the manuscript provides insufficient detail on the precise train/test splits, statistical significance testing (e.g., paired t-tests or McNemar), and controls for genre or length confounds across the twelve datasets; without these the superiority result cannot be fully assessed.

minor comments (2)

[Method] Notation for λ_G and the reference-population grammar should be defined once in a dedicated subsection rather than introduced piecemeal.
[Results] Figure captions for the grammar visualizations should explicitly state the units on each axis and the exact subset of data used.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the experimental evaluation. We address the single major comment below and will revise the manuscript to incorporate the requested details.

Referee: [Experimental Evaluation] Experimental section: the central claim of superior performance is load-bearing, yet the manuscript provides insufficient detail on the precise train/test splits, statistical significance testing (e.g., paired t-tests or McNemar), and controls for genre or length confounds across the twelve datasets; without these the superiority result cannot be fully assessed.

Authors: We agree that the experimental section would benefit from greater explicitness to support reproducibility and allow full assessment of the performance claims. In the revised manuscript we will add a dedicated subsection that specifies the exact train/test splits (including any cross-validation folds or hold-out ratios) for each of the twelve datasets. We will also report the results of statistical significance tests, including paired t-tests on performance metrics across repeated runs and McNemar’s test for pairwise comparisons against each baseline. Finally, we will include additional analyses that control for text length and genre confounds, such as performance stratified by length bins and by dataset genre where the data permit. These revisions will directly address the concerns raised.

revision: yes
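For readers unfamiliar with the McNemar's test named in this exchange, the sketch below shows one common form, the exact two-sided version based on the binomial distribution. It is an illustration only: the function name `mcnemar_exact` and the toy correctness vectors are hypothetical, and nothing here reflects an analysis actually run in the paper. The test looks only at the verification problems on which two methods disagree.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired per-problem correctness flags.

    Returns (b, c, p_value), where b counts problems only method A got right
    and c counts problems only method B got right.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if not a and o)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the methods never disagree
    # Under the null hypothesis each disagreement favours A or B with p = 0.5.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2.0 * tail)


# Toy example: ten verification problems, True = correct decision.
method_a = [True] * 8 + [False] * 2
method_b = [True] * 3 + [False] * 5 + [True] * 2
print(mcnemar_exact(method_a, method_b))  # (5, 2, 0.453125)
```

Because the test conditions on the disagreements, two methods with very different accuracies can still yield a large p-value when the number of discordant problems is small, which is one reason dataset size matters when comparing AV methods.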

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained and empirically evaluated

The paper defines LambdaG explicitly as a likelihood ratio between a candidate grammar model and a reference population grammar, motivated by Cognitive Linguistics principles. Performance is assessed via direct empirical comparison on twelve datasets against seven external baselines (including neural methods), with no reduction of the central result to a fitted parameter renamed as prediction, self-citation chain, or definitional equivalence. The compatibility argument with biometric theories is presented as post-hoc interpretation rather than a load-bearing premise that forces the outcome. No quoted equations or steps exhibit the enumerated circular patterns.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Abstract provides limited information; the method assumes grammar models can be built and likelihoods computed, but specific free parameters and axioms cannot be fully enumerated without the full text.

free parameters (1)

Grammar model parameters
The grammar models are likely fitted to author texts, but the details cannot be determined from the abstract.

axioms (1)

domain assumption: A person's grammar is unique and can be modeled probabilistically based on Cognitive Linguistics principles.
Central to building the author grammar and population grammar for the likelihood ratio.

pith-pipeline@v0.9.0 · 5740 in / 1177 out tokens · 32510 ms · 2026-05-24T02:56:44.801745+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

We define the most basic grammatical unit of analysis as a function token t ∈ TL ... Grammar Model G as a statistical model that generates a probability distribution over sentences ... n-gram models ... Kneser-Ney smoothing ... $\lambda_G(t_k \mid t_{<k}) = \frac{1}{r} \sum_{j=1}^{r} \log \frac{P(t_k \mid t_{<k}; G_A)}{P(t_k \mid t_{<k}; G_j)}$
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Principle of Linguistic Individuality ... grammar is a behavioral biometric

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

138 extracted references · 138 canonical work pages · 1 internal anchor

[1] A. O. Agbeyangi, O. Abegunde, and S. I. Eludiora. Authorship Verification of Yorùbá Blog Posts using Character N-grams. In 2020 International Conference in Mathematics, Computer Engineering and Computer Science (ICMCECS), pages 1–6, March 2020
[2] Colin Aitken, Charles E. H. Berger, John S. Buckleton, Christophe Champod, James Curran, A. P. Dawid, Ian W. Evett, Peter Gill, Joaquin Gonzalez-Rodriguez, Graham Jackson, Ate Kloosterman, Tina Lovelock, David Lucy, Pierre Margot, Louise McKenna, Didier Meuwly, Cedric Neumann, Niamh Nic Daeid, Anders Nordgaard, Roberto Puch-Solis, Birgitta Rasmusson, Mike...
[3] Colin Aitken, Franco Taroni, and Silvia Bozza. Statistics and the Evaluation of Evidence for Forensic Scientists. Wiley, Chichester, 01 2021. DOI: 10.1002/9781119245438
[4] Mahmoud A. Al-Khatib and Juman K. Al-qaoud. Authorship Verification of Opinion Articles in Online Newspapers Using the Idiolect of Author: A Comparative Study. Information, Communication & Society, 24(11):1603–1621, 2021
[5] Amirah Almutairi, BooJoong Kang, and Nawfal Al Hashimy. BiBERT-AV: Enhancing Authorship Verification Through Siamese Networks with Pre-trained BERT and Bi-LSTM. In Guojun Wang, Haozhe Wang, Geyong Min, Nektarios Georgalas, and Weizhi Meng, editors, Ubiquitous Security - Third International Conference, UbiSec 2023, Exeter, UK, November 1-3, 2023, Revised ...
[6] K. A. Apoorva and S. Sangeetha. Deep Neural Network and Model-based Clustering Technique for Forensic Electronic Mail Author Attribution. SN Applied Sciences, 3(3):348, February 2021
[7] The Apricity. The Apricity - A European Cultural Community. https://theapricity.com, 2018
[8] Shlomo Engelson Argamon. Computational Forensic Authorship Analysis: Promises and Pitfalls. Language and Law / Linguagem e Direito, 5(2):7–37, 2018
[9] American Statistical Association. American Statistical Association Position on Statistical Statements for Forensic Evidence. Technical report, American Statistical Association (ASA), 2019
[10] Abinew Ali Ayele, Nikolay Babakov, Janek Bevendorff, Xavier Bonet Casals, Berta Chulvi, Daryna Dementieva, Ashaf Elnagar, Dayne Freitag, Maik Fröbe, Damir Korenčić, Maximilian Mayerl, Daniil Moskovskiy, Animesh Mukherjee, Alexander Panchenko, Martin Potthast, Francisco Rangel, Naquee Rizwan, Paolo Rosso, Florian Schneider, Alisa Smirnova, Efstathios Stama...
[11] Harald Baayen, Hans Van Halteren, and Fiona Tweedie. Outside the Cave of Shadows: Using Syntactic Annotation To Enhance Authorship Attribution. Literary and Linguistic Computing, 11(3):121–132, 1996
[12] Douglas Bagnall. Author Identification Using Multi-headed Recurrent Neural Networks. In Cappellato et al. [28]
[13] Clay Beckner, Nick C Ellis, Richard Blythe, John Holland, Joan Bybee, Jinyun Ke, Morten H Christiansen, Diane Larsen-Freeman, William Croft, and Tom Schoenemann. Language Is a Complex Adaptive System: Position Paper. Language Learning, 59:1–26, 2009
[14] Janek Bevendorff, Matthias Hagen, Benno Stein, and Martin Potthast. Bias Analysis and Mitigation in the Evaluation of Authorship Verification. In Anna Korhonen, David R. Traum, and Lluís Màrquez, editors, Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long...
[15] Janek Bevendorff, Benno Stein, Matthias Hagen, and Martin Potthast. Generalizing Unmasking for Short Texts. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 654–659, Minneapolis, Minnesota, June 2019. Association for Co...
[16] Janek Bevendorff, Matti Wiegmann, Jussi Karlgren, Luise Dürlich, Evangelia Gogoulou, Aarne Talman, Efstathios Stamatatos, Martin Potthast, and Benno Stein. Overview of the “Voight-Kampff” Generative AI Authorship Verification Task at PAN and ELOQUENT 2024. In Guglielmo Faggioli, Nicola Ferro, Petra Galuščáková, and Alba García Seco Herrera, editors, Worki...
[17] José Nilo G. Binongo. Who Wrote the 15th Book of Oz? An Application of Multivariate Analysis to Authorship Attribution. CHANCE, 16(2):9–17, 2003
[18] Sebastian Bischoff, Niklas Deckers, Marcel Schliebs, Ben Thies, Matthias Hagen, Efstathios Stamatatos, Benno Stein, and Martin Potthast. The Importance of Suppressing Domain Style in Authorship Analysis. CoRR, abs/2005.14714, 2020
[19] Benedikt T. Boenninghoff, Steffen Hessler, Dorothea Kolossa, and Robert M. Nickel. Explainable Authorship Verification in Social Media via Attention-based Similarity Learning. In Chaitanya K. Baru, Jun Huan, Latifur Khan, Xiaohua Hu, Ronay Ak, Yuanyuan Tian, Roger S. Barga, Carlo Zaniolo, Kisung Lee, and Yanfang (Fanny) Ye, editors, 2019 IEEE International...
[20] Benedikt T. Boenninghoff, Robert M. Nickel, Steffen Zeiler, and Dorothea Kolossa. Similarity Learning for Authorship Verification in Social Media. In IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019, pages 2457–2461. IEEE, 2019
[21] Annabel Bolck and Amalia Stamouli. Likelihood Ratios for Categorical Evidence: Comparison of LR Models Applied to Gunshot Residue Data. Law, Probability and Risk, 16(2-3):71–90, 09 2017
[22] Annabel Bolck, Céline Weyermann, Laurence Dujourdy, Pierre Esseiva, and Jorrit Van Den Berg. Different Likelihood Ratio Approaches To Evaluate the Strength of Evidence of MDMA Tablet Comparisons. Forensic Science International, 191(1-3):42–51, 10 2009
[23] Marcelo Luiz Brocardo, Issa Traoré, Sherif Saad, and Isaac Woungang. Authorship Verification for Short Messages Using Stylometry. In International Conference on Computer, Information and Telecommunication Systems, CITS 2013, Athens, Greece, May 7-8, 2013, pages 1–6. IEEE, 2013
[24] Marcelo Luiz Brocardo, Issa Traore, and Isaac Woungang. Authorship Verification of E-Mail and Tweet Messages Applied for Continuous Authentication. Journal of Computer and System Sciences, 81(8):1429–1440, 2015
[25] Marcelo Luiz Brocardo, Issa Traoré, Isaac Woungang, and Mohammad S. Obaidat. Authorship Verification using Deep Belief Network Systems. Int. J. Communication Systems, 30(12), 2017
[26] Niko Brümmer and Johan du Preez. Application-Independent Evaluation of Speaker Detection. Computer Speech & Language, 20(2):230–275, 2006. Odyssey 2004: The speaker and Language Recognition Workshop
[27] Joan Bybee. Language, Usage and Cognition. Cambridge University Press, Cambridge, UK, 2010
[28] Linda Cappellato, Nicola Ferro, Gareth J. F. Jones, and Eric San Juan, editors. Working Notes for CLEF 2015 Conference, Toulouse, France, September 8–11, 2015, volume 1391 of CEUR Workshop Proceedings. CEUR-WS.org, 2015
[29] Daniel Castro Castro, Yaritza Adame Arcia, María Pelaez Brioso, and Rafael Muñoz Guillena. Authorship Verification, Average Similarity Analysis. In Proceedings of the International Conference Recent Advances in Natural Language Processing, pages 84–90. INCOMA Ltd. Shoumen, BULGARIA, 2015
[30] D. Catoggio, J. Bunford, D. Taylor, G. Wevers, K. Ballantyne, and R. Morgan. An Introductory Guide to Evaluative Reporting in Forensic Science. Australian Journal of Forensic Sciences, 51(sup1):S247–S251, February 2019
[31] C.E. Chaski. Empirical Evaluations of Language-Based Author Identification Techniques. Forensic Linguistics, 8(1):1–65, 2001
[32] Stanley F. Chen and Joshua Goodman. An Empirical Study of Smoothing Techniques for Language Modeling. In Aravind K. Joshi and Martha Palmer, editors, 34th Annual Meeting of the Association for Computational Linguistics, 24-27 June 1996, University of California, Santa Cruz, California, USA, Proceedings, pages 310–318. Morgan Kaufmann Publishers / ACL, 1996
[33] Xiaoling Chen, Peng Hao, R. Chandramouli, and K. P. Subbalakshmi. Authorship Similarity Detection from Email Messages. In Proceedings of the 7th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM’11, pages 375–386, Berlin, Heidelberg,
[34] Morten H. Christiansen and Nick Chater. The Now-or-Never Bottleneck: A Fundamental Constraint on Language. Behavioral and Brain Sciences, 39:e62, 04 2016. Publisher: Cambridge University Press
[35] Components. All the News 2.0 — 2.7 Million News Articles and Essays from 27 American Publications. https://components.one/datasets/all-the-news-2-news-articles-dataset, 2017
[36] Linda J. Davis, Christopher P. Saunders, Amanda Hepler, and JoAnn Buscaglia. Using Subsampling To Estimate the Strength of Handwriting Evidence via Score-Based Likelihood Ratios. Forensic Science International, 216(1-3):146–157, 03 2012
[37] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805, 2018
[38] Steven H. H. Ding, Benjamin C. M. Fung, and Mourad Debbabi. A Visualizable Evidence-Driven Approach for Authorship Attribution. ACM Trans. Inf. Syst. Secur., 17(3), March 2015
[39] Dagmar Divjak. Frequency in Language: Memory, Attention and Learning. Cambridge University Press, Cambridge, UK, 2019
[40] Ewa Dąbrowska. Language as a Phenomenon of the Third Kind. Cognitive Linguistics, 31(2):213–229, 2020
[41] A. Drygajlo, M. Jessen, S. Gfrörer, I. Wagner, J. Vermeulen, T. Niemi, and Verlag für Polizeiwissenschaft. Methodological Guidelines for Best Practice in Forensic Semiautomatic and Automatic Speaker Recognition. Verlag für Polizeiwissenschaft, 2015
[42] Daniel Enriquez, Gage Christensen, Hayden Donovan, Jared Lam, Noah Wong, Sergiu Dascalu, David Feil-Seifer, and Emily Hand. Authorship Verification for Hired Plagiarism Detection. In Proceedings of the 9th International Conference on Applied Computing & Information Technology, ACIT ’22, page 19–24, New York, NY, USA, 2023. Association for Computing Machinery
[43] European Network of Forensic Science Institutes. ENFSI guideline for evaluative reporting in forensic science, 2015. Version 3.0
[44] I. W. Evett, G. Jackson, J.A. Lambert, and S. McCrossan. The Impact of the Principles of Evidence Interpretation on the Structure and Content of Statements. Science & Justice, 40(4):233–239, 10 2000
[45] Pamela Forner, Roberto Navigli, Dan Tufis, and Nicola Ferro, editors. Working Notes for CLEF 2013 Conference, Valencia, Spain, September 23–26, 2013, volume 1179 of CEUR Workshop Proceedings. CEUR-WS.org, 2014
[46] Olga Fourkioti, Symeon Symeonidis, and Avi Arampatzis. Language Models and Fusion for Authorship Attribution. Information Processing & Management, 56(6):102061, 11 2019
[47] Ingo Frommholz, Haider M. Al-Khateeb, Martin Potthast, Zinnar Ghasem, Mitul Shukla, and Emma Short. On Textual Analysis and Machine Learning for Cyberstalking Detection. Datenbank-Spektrum, 16(2):127–135, 2016
[48] Fernand Gobet, Peter C.R. Lane, Steve Croker, Peter C-H. Cheng, Gary Jones, Iain Oliver, and Julian M Pine. Chunking Mechanisms in Human Learning. Trends in Cognitive Sciences, 5(6):236–243, 2001
[49] Tim Gollub, Martin Potthast, Anna Beyer, Matthias Busse, Francisco Rangel, Paolo Rosso, Efstathios Stamatatos, and Benno Stein. Recent Trends in Digital Text Forensics and its Evaluation. In Pamela Forner, Henning Müller, Roberto Paredes, Paolo Rosso, and Benno Stein, editors, Information Access Evaluation. Multilinguality, Multimodality, and Visualiza...
[50] Robert Gorman. Universal Dependencies and Author Attribution of Short Texts with Syntax Alone. Digital Humanities Quarterly, 16(2), 2022
[51] Oren Halvani. Practice-Oriented Authorship Verification. PhD thesis, Technical University of Darmstadt, Germany, 2021
[52] Oren Halvani. TextUnitLib: A Python Library That Allows Easy Extraction of a Variety of Text Units Within Texts. https://github.com/Halvani/TextUnitLib, 2024
[53] Oren Halvani and Lukas Graner. POSNoise: An Effective Countermeasure Against Topic Biases in Authorship Analysis. In Proceedings of the 16th International Conference on Availability, Reliability and Security, ARES ’21, New York, NY, USA, 2021. Association for Computing Machinery
[54] Oren Halvani, Lukas Graner, and Roey Regev. Cross-Domain Authorship Verification Based on Topic Agnostic Features. In Linda Cappellato, Carsten Eickhoff, Nicola Ferro, and Aurélie Névéol, editors, Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, September 22-25, 2020, volume 2696 of CEUR Workshop Proceedings. C...
[55]

Oren Halvani, Lukas Graner, and Roey Regev. TAVeer: An Interpretable Topic-Agnostic Authorship Verification Method. In Melanie Volkamer and Christian Wressnegger, editors, ARES 2020: The 15th International Conference on Availability, Reliability and Security, Virtual Event, Ireland, August 25-28, 2020, pages 41:1–41:10. ACM, 2020.

[56]

Oren Halvani, Lukas Graner, and Inna Vogel. Authorship Verification in the Absence of Explicit Features and Thresholds. In Gabriella Pasi, Benjamin Piwowarski, Leif Azzopardi, and Allan Hanbury, editors, Advances in Information Retrieval, pages 454–465. Springer International Publishing, 2018.

[57]

Oren Halvani, Christian Winter, and Lukas Graner. On the Usefulness of Compression Models for Authorship Verification. In Proceedings of the 12th International Conference on Availability, Reliability and Security, ARES ’17, pages 54:1–54:10, New York, NY, USA, 2017. ACM.

[58]

Oren Halvani, Christian Winter, and Lukas Graner. Assessing the Applicability of Authorship Verification Methods. In Proceedings of the 14th International Conference on Availability, Reliability and Security, ARES 2019, Canterbury, UK, August 26-29, 2019, pages 38:1–38:10. ACM, 2019.

[59]

Rodrigo Augusto Igawa, Alex Marino Gonçalves de Almeida, Bruno Bogaz Zarpelão, and Sylvio Barbon. Recognition of Compromised Accounts on Twitter. In Sean W. M. Siqueira and Sérgio T. Carvalho, editors, Proceedings of the annual conference on Brazilian Symposium on Information Systems, Information Systems: A Computer Socio-Technical Perspective, SBSI 2015,...

[60]

Farkhund Iqbal, Liaquat A. Khan, Benjamin C. M. Fung, and Mourad Debbabi. E-mail Authorship Verification for Forensic Investigation. In Sung Y. Shin, Sascha Ossowski, Michael Schumacher, Mathew J. Palakal, and Chih-Cheng Hung, editors, Proceedings of the 2010 ACM Symposium on Applied Computing (SAC), Sierre, Switzerland, March 22-26, 2010, pages 1591–1598....

[61]

Shunichi Ishihara. A Forensic Authorship Classification in SMS Messages: A Likelihood Ratio Based Approach Using N-gram. In Diego Molla and David Martinez, editors, Proceedings of the Australasian Language Technology Association Workshop 2011, pages 47–56, Canberra, Australia, December 2011.

[62]

Shunichi Ishihara. Score-Based Likelihood Ratios for Linguistic Text Evidence With a Bag-of-Words Model. Forensic Science International, 327:110980, 2021. Publisher: Elsevier.

[63]

Shunichi Ishihara. Weight of Authorship Evidence With Multiple Categories of Stylometric Features: A Multinomial-Based Discrete Model. Science & Justice, 63(2):181–199, March 2023.

[64]

Shunichi Ishihara and Michael Carne. Likelihood Ratio Estimation for Authorship Text Evidence: An Empirical Comparison of Score- and Feature-Based Methods. Forensic Science International, 334:111268, May 2022.

[65]

Shunichi Ishihara, Sonia Kulkarni, Michael Carne, Sabine Ehrhardt, and Andrea Nini. Validation in Forensic Text Comparison: Issues and Opportunities. Languages, 9(2):47, February 2024. Number: 2 Publisher: Multidisciplinary Digital Publishing Institute.

[66]

Shunichi Ishihara, Satoru Tsuge, Mitsuyuki Inaba, and Wataru Zaitsu. Estimating the Strength of Authorship Evidence With a Deep-Learning-Based Approach. In Pradeesh Parameswaran, Jennifer Biggs, and David Powers, editors, Proceedings of the The 20th Annual Workshop of the Australasian Language Technology Association, pages 183–187, Adelaide, Australia, Dece...

[67]

Sylvio Barbon Junior, Rodrigo Augusto Igawa, and Bruno Bogaz Zarpelão. Authorship Verification Applied to Detection of Compromised Accounts on Online Social Networks – A Continuous Approach. Multim. Tools Appl., 76(3):3213–3233, 2017.

[68]

Patrick Juola and Efstathios Stamatatos. Overview of the Author Identification Task at PAN 2013. In Forner et al. [45].

[69]

Mike Kestemont. Function Words in Authorship Attribution. From Black Magic to Theory? In Anna Feldman, Anna Kazantseva, and Stan Szpakowicz, editors, Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL), pages 59–66, Gothenburg, Sweden, April 2014. Association for Computational Linguistics.

[70]

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Martin Potthast, and Benno Stein. Overview of the Cross-Domain Authorship Verification Task at PAN 2020. In Linda Cappellato, Carsten Eickhoff, Nicola Ferro, and Aurélie Névéol, editors, Working Notes of CLEF 2020 - Conference and Labs of the Evaluat...

[71]

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Benno Stein, and Martin Potthast. Overview of the Cross-Domain Authorship Verification Task at PAN 2021. In Guglielmo Faggioli, Nicola Ferro, Alexis Joly, Maria Maistro, and Florina Piroi, editors, Working Notes Papers of the CLEF 2021 Evaluation Lab...

[72]

Mike Kestemont, Justin Anthony Stover, Moshe Koppel, Folgert Karsdorp, and Walter Daelemans. Authenticating the Writings of Julius Caesar. Expert Syst. Appl., 63:86–96, 2016.

[73]

Mahmoud Khonji and Youssef Iraqi. A Slightly-Modified GI-Based Author-Verifier with Lots of Features (ASGALF). In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014, volume 1180 of CEUR Workshop Proceedings, pages 977–983. CEUR-WS.org, 2014.

[74]

Mahmoud Khonji, Youssef Iraqi, and Loubna Mekouar. Improved Score Aggregation for Authorship Verification. Knowledge and Information Systems, 12 2022.

[75]

Bryan Klimt and Yiming Yang. The Enron Corpus: A New Dataset for Email Classification Research. In Jean-François Boulicaut, Floriana Esposito, Fosca Giannotti, and Dino Pedreschi, editors, Machine Learning: ECML 2004, pages 217–226, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg.

[76]

R. Kneser and H. Ney. Improved Backing-off for M-gram Language Modeling. In 1995 International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 181–184 vol.1, 1995.

[77]

Mirco Kocher and Jacques Savoy. UniNE at CLEF 2015 Author Identification: Notebook for PAN at CLEF 2015. In CLEF (Working Notes), volume 1391 of CEUR Workshop Proceedings. CEUR-WS.org, 2015.

[78]

Mirco Kocher and Jacques Savoy. A Simple and Efficient Algorithm for Authorship Verification. Journal of the Association for Information Science and Technology, 68(1):259–269, 2017.

[79]

Moshe Koppel and Jonathan Schler. Authorship Verification as a One-Class Classification Problem. In Carla E. Brodley, editor, Machine Learning, Proceedings of the Twenty-first International Conference (ICML 2004), Banff, Alberta, Canada, July 4-8, 2004, volume 69 of ACM International Conference Proceeding Series. ACM, 2004.

[80]

Moshe Koppel, Jonathan Schler, and Shlomo Argamon. Authorship Attribution in the Wild. Language Resources and Evaluation, 45(1):83–94, 2011.

[49] [49]

Recent Trends in Digital Text Forensics and its Evaluation

Tim Gollub, Martin Potthast, Anna Beyer, Matthias Busse, Francisco Rangel, Paolo Rosso, Efs- tathios Stamatatos, and Benno Stein. Recent Trends in Digital Text Forensics and its Evaluation. In Pamela Forner, Henning Müller, Roberto Paredes, Paolo Rosso, and Benno Stein, editors,Infor- mation Access Evaluation. Multilinguality, Multimodality, and Visualiza...

work page 2013

[50] [50]

Universal Dependencies and Author Attribution of Short Texts with Syntax Alone

Robert Gorman. Universal Dependencies and Author Attribution of Short Texts with Syntax Alone. Digital Humanities Quarterly, 16(2), 2022

work page 2022

[51] [51]

Practice-Oriented Authorship Verification

Oren Halvani. Practice-Oriented Authorship Verification. PhD thesis, Technical University of Darmstadt, Germany, 2021

work page 2021

[52] [52]

TextUnitLib: A Python Library That Allows Easy Extraction of a Variety of Text Units Within Texts.https://github.com/Halvani/TextUnitLib, 2024

Oren Halvani. TextUnitLib: A Python Library That Allows Easy Extraction of a Variety of Text Units Within Texts.https://github.com/Halvani/TextUnitLib, 2024

work page 2024

[53] [53]

POSNoise: An Effective Countermeasure Against Topic Biases in Authorship Analysis

Oren Halvani and Lukas Graner. POSNoise: An Effective Countermeasure Against Topic Biases in Authorship Analysis. InProceedings of the 16th International Conference on Availability, Reliability and Security, ARES ’21, New York, NY, USA, 2021. Association for Computing Machinery

work page 2021

[54] [54]

Cross-Domain Authorship Verification Based on Topic Agnostic Features

Oren Halvani, Lukas Graner, and Roey Regev. Cross-Domain Authorship Verification Based on Topic Agnostic Features. In Linda Cappellato, Carsten Eickhoff, Nicola Ferro, and Aurélie Névéol, editors,Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum, Thessaloniki, Greece, September 22-25, 2020, volume 2696 ofCEUR Workshop Proceedings. C...

work page 2020

[55] [55]

TAVeer: An Interpretable Topic-Agnostic Author- ship Verification Method

Oren Halvani, Lukas Graner, and Roey Regev. TAVeer: An Interpretable Topic-Agnostic Author- ship Verification Method. In Melanie Volkamer and Christian Wressnegger, editors,ARES 2020: The 15th International Conference on Availability, Reliability and Security, Virtual Event, Ireland, August 25-28, 2020, pages 41:1–41:10. ACM, 2020

work page 2020

[56] [56]

Authorship Verification in the Absence of Explicit Features and Thresholds

Oren Halvani, Lukas Graner, and Inna Vogel. Authorship Verification in the Absence of Explicit Features and Thresholds. In Gabriella Pasi, Benjamin Piwowarski, Leif Azzopardi, and Allan Han- bury, editors,Advances in Information Retrieval, pages 454–465. Springer International Publishing, 2018

work page 2018

[57] [57]

On the Usefulness of Compression Models for Authorship Verification

Oren Halvani, Christian Winter, and Lukas Graner. On the Usefulness of Compression Models for Authorship Verification. In Proceedings of the 12th International Conference on Availability, Reliability and Security, ARES ’17, pages 54:1–54:10, New York, NY, USA, 2017. ACM

work page 2017

[58] [58]

Assessing the Applicability of Authorship VerificationMethods

Oren Halvani, Christian Winter, and Lukas Graner. Assessing the Applicability of Authorship VerificationMethods. InProceedings of the 14th International Conference on Availability, Reliability and Security, ARES 2019, Canterbury, UK, August 26-29, 2019, pages 38:1–38:10. ACM, 2019

work page 2019

[59] [59]

Recognition of Compromised Accounts on Twitter

Rodrigo Augusto Igawa, Alex Marino Gonçalves de Almeida, Bruno Bogaz Zarpelão, and Sylvio Barbon. Recognition of Compromised Accounts on Twitter. In Sean W. M. Siqueira and Sérgio T. Carvalho, editors, Proceedings of the annual conference on Brazilian Symposium on Information Systems, Information Systems: A Computer Socio-Technical Perspective, SBSI 2015,...

work page 2015

[60] [60]

Khan, Benjamin C

Farkhund Iqbal, Liaquat A. Khan, Benjamin C. M. Fung, and Mourad Debbabi. E-mail Authorship Verification for Forensic Investigation. In Sung Y. Shin, Sascha Ossowski, Michael Schumacher, Mathew J. Palakal, and Chih-Cheng Hung, editors,Proceedings of the 2010 ACM Symposium on Applied Computing (SAC), Sierre, Switzerland, March 22-26, 2010, pages 1591–1598....

work page 2010

[61] [61]

A Forensic Authorship Classification in SMS Messages: A Likelihood Ratio Based Approach Using N-gram

Shunichi Ishihara. A Forensic Authorship Classification in SMS Messages: A Likelihood Ratio Based Approach Using N-gram. In Diego Molla and David Martinez, editors,Proceedings of the Australasian Language Technology Association Workshop 2011, pages 47–56, Canberra, Australia, December 2011

work page 2011

[62] [62]

Score-Based LikelihoodRatios forLinguisticTextEvidence With aBag-of-Words Model

ShunichiIshihara. Score-Based LikelihoodRatios forLinguisticTextEvidence With aBag-of-Words Model. Forensic Science International, 327:110980, 2021. Publisher: Elsevier

work page 2021

[63] [63]

Weight of Authorship Evidence With Multiple Categories of Stylometric Fea- tures: A Multinomial-Based Discrete Model.Science & Justice, 63(2):181–199, March 2023

Shunichi Ishihara. Weight of Authorship Evidence With Multiple Categories of Stylometric Fea- tures: A Multinomial-Based Discrete Model.Science & Justice, 63(2):181–199, March 2023

work page 2023

[64] [64]

Likelihood Ratio Estimation for Authorship Text Evidence: An Empirical Comparison of Score- and Feature-Based Methods.Forensic Science International, 334:111268, May 2022

Shunichi Ishihara and Michael Carne. Likelihood Ratio Estimation for Authorship Text Evidence: An Empirical Comparison of Score- and Feature-Based Methods.Forensic Science International, 334:111268, May 2022

work page 2022

[65] [65]

Validation in Forensic Text Comparison: Issues and Opportunities.Languages, 9(2):47, February 2024

Shunichi Ishihara, Sonia Kulkarni, Michael Carne, Sabine Ehrhardt, and Andrea Nini. Validation in Forensic Text Comparison: Issues and Opportunities.Languages, 9(2):47, February 2024. Number: 2 Publisher: Multidisciplinary Digital Publishing Institute

work page 2024

[66] [66]

Estimating the Strength of Authorship Evidence With a Deep-Learning-Based Approach

Shunichi Ishihara, Satoru Tsuge, Mitsuyuki Inaba, and Wataru Zaitsu. Estimating the Strength of Authorship Evidence With a Deep-Learning-Based Approach. In Pradeesh Parameswaran, Jennifer Biggs, and David Powers, editors,Proceedings of the The 20th Annual Workshop of the Australasian Language Technology Association, pages183–187, Adelaide, Australia, Dece...

work page

[67] [67]

Authorship Verifica- tion Applied to Detection of Compromised Accounts on Online Social Networks – A Continuous Approach

Sylvio Barbon Junior, Rodrigo Augusto Igawa, and Bruno Bogaz Zarpelão. Authorship Verifica- tion Applied to Detection of Compromised Accounts on Online Social Networks – A Continuous Approach. Multim. Tools Appl., 76(3):3213–3233, 2017

work page 2017

[68] [68]

Overview of the Author Identification Task at PAN 2013

Patrick Juola and Efstathios Stamatatos. Overview of the Author Identification Task at PAN 2013. In Forner et al. [45]

work page 2013

[69]

Mike Kestemont. Function Words in Authorship Attribution. From Black Magic to Theory? In Anna Feldman, Anna Kazantseva, and Stan Szpakowicz, editors, Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL), pages 59–66, Gothenburg, Sweden, April 2014. Association for Computational Linguistics

[70]

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Martin Potthast, and Benno Stein. Overview of the Cross-Domain Authorship Verification Task at PAN 2020. In Linda Cappellato, Carsten Eickhoff, Nicola Ferro, and Aurélie Névéol, editors, Working Notes of CLEF 2020 - Conference and Labs of the Evaluat...

[71]

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Benno Stein, and Martin Potthast. Overview of the Cross-Domain Authorship Verification Task at PAN 2021. In Guglielmo Faggioli, Nicola Ferro, Alexis Joly, Maria Maistro, and Florina Piroi, editors, Working Notes Papers of the CLEF 2021 Evaluation Lab...

[72]

Mike Kestemont, Justin Anthony Stover, Moshe Koppel, Folgert Karsdorp, and Walter Daelemans. Authenticating the Writings of Julius Caesar. Expert Syst. Appl., 63:86–96, 2016

[73]

Mahmoud Khonji and Youssef Iraqi. A Slightly-Modified GI-Based Author-Verifier with Lots of Features (ASGALF). In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes for CLEF 2014 Conference, Sheffield, UK, September 15-18, 2014, volume 1180 of CEUR Workshop Proceedings, pages 977–983. CEUR-WS.org, 2014

[74]

Mahmoud Khonji, Youssef Iraqi, and Loubna Mekouar. Improved Score Aggregation for Authorship Verification. Knowledge and Information Systems, December 2022

[75]

Bryan Klimt and Yiming Yang. The Enron Corpus: A New Dataset for Email Classification Research. In Jean-François Boulicaut, Floriana Esposito, Fosca Giannotti, and Dino Pedreschi, editors, Machine Learning: ECML 2004, pages 217–226, Berlin, Heidelberg, 2004. Springer Berlin Heidelberg

[76]

R. Kneser and H. Ney. Improved Backing-off for M-gram Language Modeling. In 1995 International Conference on Acoustics, Speech, and Signal Processing, volume 1, pages 181–184, 1995

[77]

Mirco Kocher and Jacques Savoy. UniNE at CLEF 2015 Author Identification: Notebook for PAN at CLEF 2015. In CLEF (Working Notes), volume 1391 of CEUR Workshop Proceedings. CEUR-WS.org, 2015

[78]

Mirco Kocher and Jacques Savoy. A Simple and Efficient Algorithm for Authorship Verification. Journal of the Association for Information Science and Technology, 68(1):259–269, 2017

[79]

Moshe Koppel and Jonathan Schler. Authorship Verification as a One-Class Classification Problem. In Carla E. Brodley, editor, Machine Learning, Proceedings of the Twenty-first International Conference (ICML 2004), Banff, Alberta, Canada, July 4-8, 2004, volume 69 of ACM International Conference Proceeding Series. ACM, 2004

[80]

Moshe Koppel, Jonathan Schler, and Shlomo Argamon. Authorship Attribution in the Wild. Language Resources and Evaluation, 45(1):83–94, 2011