arxiv: 2604.12442 · v1 · submitted 2026-04-14 · 💻 cs.CL

GLeMM: A large-scale multilingual dataset for morphological research

Pith reviewed 2026-05-10 15:06 UTC · model grok-4.3

classification 💻 cs.CL
keywords derivational morphology · multilingual dataset · Wiktionary · word formation · morphological annotation · European languages · computational morphology

The pith

GLeMM supplies a large automated multilingual dataset of derivational morphology drawn from Wiktionary to support data-driven analysis of word formation across languages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GLeMM as a new resource for investigating how form and meaning interact when new words are created. It is constructed by applying one identical automated process to Wiktionary for seven languages, with annotations for morphological features on every entry and semantic descriptions on a subset. This scale and consistency allow studies of derivational morphology to move past small hand-selected examples toward replicable, generalizable results. A sympathetic reader would care because prior work on these questions has often relied on small datasets or intuition, which constrains what can be firmly established.

Core claim

GLeMM is a derivational resource of large size with coverage of seven European languages, a fully automated design that is identical across languages, automatic annotation of morphological features on each entry, and encoding of semantic descriptions for a significant subset. Created from Wiktionary articles, it enables researchers to address questions such as the role of form and meaning in word-formation and to develop and test computational methods that identify the structures of derivational morphology.

What carries the argument

The automated extraction and annotation pipeline applied identically to Wiktionary articles, which generates entries carrying morphological annotations and partial semantic descriptions.

If this is right

  • Researchers can now examine the role of form and meaning in word-formation with large-scale, consistent data instead of limited observations.
  • Computational methods for identifying derivational morphology structures can be developed and tested experimentally on the same resource.
  • Morphological studies can be replicated and generalized across German, English, Spanish, French, Italian, Polish, and Russian.
  • Data-driven description in morphology becomes feasible beyond intuition-based approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dataset could serve as training material for machine learning systems that process word formation in new texts.
  • Direct comparisons of annotated patterns across the seven languages might highlight both shared and language-specific tendencies in derivation.
  • Applying the same pipeline to additional languages would extend the scope for testing claims about universal aspects of morphology.

Load-bearing premise

The automated extraction and annotation pipeline applied identically to Wiktionary articles produces accurate and consistent morphological information across all seven languages without significant language-specific errors or coverage gaps.

What would settle it

A manual verification of randomly sampled entries from each of the seven languages that finds frequent inaccuracies or inconsistencies in the morphological annotations would show the resource cannot support reliable research.
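
A minimal sketch of how such a check could be set up, assuming entries are available as dictionaries with a `lang` field (a hypothetical schema, not GLeMM's actual format). It draws the same number of random entries per language and puts a 95% Wilson score interval around the error rate a human checker would report. The function names and the sample size are illustrative choices.

```python
import math
import random
from collections import defaultdict


def stratified_sample(entries, per_language, seed=0):
    """Draw up to `per_language` random entries from each language.

    `entries` is a list of dicts with a "lang" key (hypothetical schema).
    """
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    by_lang = defaultdict(list)
    for entry in entries:
        by_lang[entry["lang"]].append(entry)
    return {
        lang: rng.sample(items, min(per_language, len(items)))
        for lang, items in by_lang.items()
    }


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for the proportion errors / n (z=1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# Toy data standing in for real entries.
toy_entries = [{"lang": lang, "id": i} for lang in ("de", "fr") for i in range(500)]
sample = stratified_sample(toy_entries, per_language=200)
low, high = wilson_interval(errors=20, n=200)
```

With 20 errors found in 200 checked entries, the interval spans several percentage points on either side of 10%, which gives a sense of how precisely a sample of that size can pin down per-language error rates.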

Original abstract

In derivational morphology, what mechanisms govern the variation in form-meaning relations between words? The answers to this type of question are typically based on intuition and on observations drawn from limited data, even when a wide range of languages is considered. Many of these studies are difficult to replicate and generalize. To address this issue, we present GLeMM, a new derivational resource designed for experimentation and data-driven description in morphology. GLeMM is characterized by (i) its large size, (ii) its extensive coverage (currently amounting to seven European languages, i.e., German, English, Spanish, French, Italian, Polish, Russian), (iii) its fully automated design, identical across all languages, (iv) the automatic annotation of morphological features on each entry, as well as (v) the encoding of semantic descriptions for a significant subset of these entries. It enables researchers to address difficult questions, such as the role of form and meaning in word-formation, and to develop and experimentally test computational methods that identify the structures of derivational morphology. The article describes how GLeMM is created using Wiktionary articles and presents various case studies illustrating possible applications of the resource.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript presents GLeMM, a large-scale multilingual dataset for derivational morphology covering seven European languages (German, English, Spanish, French, Italian, Polish, Russian). It is constructed via a fully automated, identical pipeline from Wiktionary articles, with automatic annotation of morphological features on each entry and semantic descriptions for a significant subset. The paper describes the creation process and includes case studies to illustrate applications for studying form-meaning relations in word-formation and for developing/testing computational methods in derivational morphology.

Significance. If the automated pipeline produces accurate and consistent annotations, GLeMM would be a valuable resource: its scale, uniform cross-lingual design, and coverage of derivational data address a gap in replicable, data-driven morphological research. The automated construction and inclusion of semantic encodings are strengths that could support large-scale experiments on form-meaning mappings and method evaluation.

major comments (3)
  1. [§3 (Construction Pipeline)] Construction section (described in the abstract and §3): The central claim that the identical automated extraction and annotation pipeline yields reliable morphological information across all seven languages lacks any quantitative validation. No precision, recall, error rates, coverage statistics, or gold-standard comparisons are reported, despite acknowledged differences in Wiktionary article structure and quality by language. This directly undermines the paper's weakest assumption, namely that the resource reliably supports the stated research questions.
  2. [§4 (Annotations)] Annotation and semantic encoding (abstract and §4): While morphological features are automatically annotated and semantic descriptions are provided for a subset, the paper provides no details on validation of these annotations (e.g., inter-annotator agreement or per-language accuracy metrics). This is load-bearing for claims about enabling form-meaning analyses.
  3. [§6 (Case Studies)] Case studies (§6): The applications for addressing questions on word-formation and testing computational methods are illustrated but without any empirical assessment of dataset quality or utility in those tasks, such as baseline experiments or error analysis on the extracted data.
minor comments (3)
  1. [Introduction] The abstract and introduction would benefit from explicit comparison to existing morphological resources (e.g., UniMorph or derivational databases) to clarify novelty in scale and automation.
  2. [§4] Notation for morphological features and semantic encodings should be defined more clearly, perhaps with an example table entry for each language.
  3. [§6] The paper mentions 'various case studies' but the text would be improved by a summary table linking each study to specific dataset properties used.
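
The inter-annotator agreement mentioned in major comment 2 is commonly reported as Cohen's kappa. The sketch below computes it for two label sequences over the same entries, for instance the pipeline's output and a human annotator's labels. The label values are invented for illustration, and nothing here reflects how the authors actually validate their annotations.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four entries (not taken from the dataset).
pipeline = ["suffix", "suffix", "prefix", "prefix"]
human = ["suffix", "prefix", "prefix", "prefix"]
kappa = cohen_kappa(pipeline, human)  # 0.5 for this toy example
```

Reporting kappa alongside raw accuracy matters when one label dominates, since high raw agreement can then arise largely by chance.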

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and positive evaluation of GLeMM's potential significance. We address each major comment below and describe the revisions we will make to the manuscript.

Point-by-point responses
  1. Referee: [§3 (Construction Pipeline)] Construction section (described in the abstract and §3): The central claim that the identical automated extraction and annotation pipeline yields reliable morphological information across all seven languages lacks any quantitative validation. No precision, recall, error rates, coverage statistics, or gold-standard comparisons are reported, despite acknowledged differences in Wiktionary article structure and quality by language. This directly undermines the paper's weakest assumption, namely that the resource reliably supports the stated research questions.

    Authors: We agree that quantitative validation metrics would strengthen the presentation of the pipeline's reliability. The manuscript's primary focus is on documenting the uniform, fully automated construction process that enables cross-lingual comparability. In the revision, we will add to §3 coverage statistics (entries and features per language), precision/recall estimates from manual inspection of random samples (200 entries per language), and a discussion of how the pipeline handles Wiktionary structural differences. These additions will provide concrete support for the resource's usability. revision: yes

  2. Referee: [§4 (Annotations)] Annotation and semantic encoding (abstract and §4): While morphological features are automatically annotated and semantic descriptions are provided for a subset, the paper provides no details on validation of these annotations (e.g., inter-annotator agreement or per-language accuracy metrics). This is load-bearing for claims about enabling form-meaning analyses.

    Authors: We acknowledge that the current version lacks explicit validation details for the automatic annotations. We will expand §4 to describe the annotation heuristics and rules in greater detail, report per-language accuracy figures obtained by comparing a held-out sample against manual gold standards, and clarify the extraction and coverage of semantic descriptions. This will better substantiate the dataset's value for form-meaning research. revision: yes

  3. Referee: [§6 (Case Studies)] Case studies (§6): The applications for addressing questions on word-formation and testing computational methods are illustrated but without any empirical assessment of dataset quality or utility in those tasks, such as baseline experiments or error analysis on the extracted data.

    Authors: The case studies are illustrative of potential uses rather than exhaustive evaluations. We agree that adding empirical elements would improve the section. In the revision, we will incorporate a baseline experiment in one case study (e.g., a rule-based derivational relation identifier evaluated on GLeMM data) together with performance metrics and error analysis. This will demonstrate practical utility while remaining within the scope of a dataset paper. revision: yes
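
To make the proposed baseline in response 3 concrete, here is a hedged sketch of what a rule-based derivational relation identifier and its evaluation might look like. The suffix rules, word pairs, and gold set are all invented toy data; the paper's actual rules, entry format, and evaluation protocol are not described in the material above.

```python
# Hypothetical rewrite rules: (ending removed from base, ending added to form derived).
RULES = [("", "ness"), ("", "er"), ("e", "ation"), ("y", "iness")]


def predicts_derivation(base, derived):
    """Return True if some rule maps `base` exactly onto `derived`."""
    for base_end, derived_end in RULES:
        if base.endswith(base_end):
            stem = base[: len(base) - len(base_end)]
            if derived == stem + derived_end:
                return True
    return False


def precision_recall(candidate_pairs, gold_pairs):
    """Score the rule-based predictions against a set of gold derivational pairs."""
    predicted = {pair for pair in candidate_pairs if predicts_derivation(*pair)}
    true_positives = len(predicted & gold_pairs)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall


# Toy evaluation: "corn" -> "corner" is a deliberate false positive,
# and "decide" -> "decision" is a gold pair that the rules miss.
candidates = [
    ("kind", "kindness"),
    ("teach", "teacher"),
    ("corn", "corner"),
    ("happy", "happiness"),
    ("decide", "decision"),
]
gold = {
    ("kind", "kindness"),
    ("teach", "teacher"),
    ("happy", "happiness"),
    ("decide", "decision"),
}
precision, recall = precision_recall(candidates, gold)
```

The false positive and the missed pair illustrate the two error types that an error analysis of such a baseline would need to separate: formal coincidence without a derivational link, and derivation involving stem allomorphy that simple concatenation rules do not capture.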

Circularity Check

0 steps flagged

No circularity: dataset construction is self-contained

full rationale

The paper describes the automated extraction of GLeMM from Wiktionary articles via a uniform pipeline, with automatic morphological annotation and partial semantic encoding. No equations, fitted parameters, predictions, or derivations appear in the abstract or described content. Claims about enabling research on form-meaning relations follow directly from the stated size, coverage, and identical processing across languages, without any reduction to self-defined quantities or load-bearing self-citations. This matches the expected non-circular outcome for a resource paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the quality and consistency of Wiktionary source data plus the correctness of the automated extraction pipeline; no free parameters or invented entities are introduced.

axioms (1)
  • domain assumption: Wiktionary articles contain sufficiently accurate and structured morphological information that can be extracted automatically and uniformly across languages
    The entire resource is built from Wiktionary using an identical automated design.

pith-pipeline@v0.9.0 · 5544 in / 1220 out tokens · 34335 ms · 2026-05-10T15:06:55.946022+00:00 · methodology

