pith. machine review for the scientific record.

arxiv: 2604.21365 · v1 · submitted 2026-04-23 · 💻 cs.LG · cs.AI · cs.CL · cs.SE


mcdok at SemEval-2026 Task 13: Finetuning LLMs for Detection of Machine-Generated Code


Pith reviewed 2026-05-09 22:39 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CL · cs.SE

keywords detection · code · machine-generated · systems · task · various · semeval-2026 · submitted

The pith

Fine-tuning LLMs by adapting the mdok approach produces competitive results on binary detection, source attribution, and hybrid/adversarial code identification in SemEval-2026 Task 13.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The competition task asks systems to decide whether a code snippet was written by a human or a machine, which machine model family generated it, and whether it is a hybrid of both or has been adversarially altered to hide its origin. The team started from their earlier mdok system for machine-generated text and swapped in base language models better suited to code understanding. They submitted several variants and found that these adapted systems ranked competitively across the subtasks, although they did not reach the very top scores.
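The three subtask framings above amount to classification over different label spaces. A minimal sketch of that structure (the class names here are illustrative, not the official task schema):

```python
# Label spaces for the three subtasks as described above. Exact label
# strings are hypothetical; the official task data defines its own.
SUBTASK_LABELS = {
    "A_binary": ["human", "machine"],
    "B_attribution": ["human", "family_1", "family_2"],  # generator LLM family
    "C_hybrid": ["human", "machine", "hybrid", "adversarial"],
}

def is_valid_label(subtask: str, label: str) -> bool:
    """Check that a predicted label belongs to the subtask's label space."""
    return label in SUBTASK_LABELS[subtask]
```

This makes explicit why one fine-tuning recipe can serve all three subtasks: only the classification head's label set changes.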

Core claim

The results indicate that the submitted systems are competitive in all three subtasks.

Load-bearing premise

That simply exploring various base models more suitable for code understanding will produce effective detection systems without major architectural changes or detailed ablation studies.
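Operationally, that premise reduces to running one fixed fine-tuning recipe over several candidate checkpoints and keeping the best. A sketch of that search, with hypothetical checkpoint names and a stub standing in for the actual mdok-style fine-tune/evaluate step:

```python
# Hypothetical sketch: the fine-tuning pipeline is held fixed and only the
# base checkpoint varies. Names and scores are illustrative, not reported.
def finetune_and_score(base_checkpoint: str) -> float:
    """Stub for 'fine-tune with the fixed recipe, return dev Macro F1'."""
    # The real system would load the checkpoint, fine-tune on task data,
    # and evaluate; here we return fake deterministic scores.
    fake_scores = {"code-model-a": 0.71, "code-model-b": 0.78, "text-model": 0.64}
    return fake_scores[base_checkpoint]

candidates = ["code-model-a", "code-model-b", "text-model"]
best = max(candidates, key=finetune_and_score)
```

The premise is exactly that this outer loop, with no architectural changes inside `finetune_and_score`, is enough to produce competitive detectors.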

Figures

Figures reproduced from arXiv: 2604.21365 by Adam Skurla, Dominik Macko, Jakub Simko.

Figure 1. mcdok systems overview. …(belonging to the LLM family that has been seen in training). Subtask C represents hybrid code detection, where the goal is to distinguish between 4 classes: human-written, machine-generated, hybrid (i.e., partially written or completed by an LLM), and adversarial (i.e., generated to mimic humans). In all subtasks, our system originated in a modification of the existing system…
Figure 2. Per-language performance of mcdok system. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png]
Figure 3. Per-family performance of mcdok system in… [PITH_FULL_IMAGE:figures/full_fig_p003_3.png]
Figure 4. Per-language performance (Macro F1) of mcdok system. [PITH_FULL_IMAGE:figures/full_fig_p004_4.png]
Figure 5. Confusion matrix of mcdok system in subtask… [PITH_FULL_IMAGE:figures/full_fig_p004_5.png]
Figure 6. Per-language performance (Macro F1) of mcdok system. [PITH_FULL_IMAGE:figures/full_fig_p005_6.png]
Figure 7. Confusion matrix of mcdok system in subtask… [PITH_FULL_IMAGE:figures/full_fig_p005_7.png]
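The figures report Macro F1, which averages per-class F1 scores without weighting by class frequency, so rare classes (e.g., adversarial code) count as much as common ones. A stdlib sketch of the metric (not the task's official scorer):

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    classes = sorted(set(y_true))
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```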
read the original abstract

Multi-domain detection of machine-generated code snippets in various programming languages is a challenging task. SemEval-2026 Task 13 approaches this challenge from various angles: as a binary detection problem as well as attribution of the source. Specifically, its subtasks also cover generator LLM family detection, as well as hybrid code co-generated by humans and machines, or adversarially modified code hiding its origin. Our submitted systems adjusted the existing mdok approach (focused on machine-generated text detection) to these specific kinds of problems by exploring various base models more suitable for code understanding. The results indicate that the submitted systems are competitive in all three subtasks. However, the margins from the top-performing systems are significant, and thus further improvements are possible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript describes the mcdok submission to SemEval-2026 Task 13 on multi-domain detection of machine-generated code. It adapts the existing mdok approach (originally for text) by fine-tuning LLMs more suitable for code understanding, addressing binary detection, source attribution, generator LLM family detection, and hybrid/adversarially modified code. The central claim is that the submitted systems are competitive in all three subtasks, though with significant margins to the leaders and room for further improvement.

Significance. If the competitiveness claim holds with supporting metrics, the work would show that base-model exploration during fine-tuning can produce effective detectors for machine-generated code without requiring major architectural innovations. This would be a useful data point for the shared task and for practitioners seeking lightweight adaptations of text-detection methods to code. The significance remains limited by the absence of any quantitative evidence or external validation.

major comments (2)
  1. Abstract: The claim that 'the results indicate that the submitted systems are competitive in all three subtasks' is unsupported because the manuscript supplies no quantitative scores (e.g., F1, accuracy), baselines, leaderboard deltas, or error analysis for any subtask.
  2. Abstract: The statement that 'the margins from the top-performing systems are significant' is made without any numerical values or per-subtask breakdown, rendering the competitiveness assessment unverifiable and the call for 'further improvements' unsubstantiated.
minor comments (1)
  1. Abstract: The method description ('exploring various base models more suitable for code understanding') is too high-level; listing the specific models tested and any selection criteria would improve clarity.

Circularity Check

0 steps flagged

No circularity: empirical reporting on shared-task data with no derivation chain

full rationale

The paper is a short system-description submission to a shared task. It describes a straightforward adaptation of an existing mdok-style fine-tuning pipeline by swapping base models for code, then states that the resulting systems 'are competitive in all three subtasks' on the basis of their official task scores. No equations, first-principles derivations, or 'predictions' are offered; the competitiveness claim is simply an empirical summary of leaderboard placement. No self-citation is used to justify any mathematical step, no fitted parameter is relabeled as a prediction, and no uniqueness theorem or ansatz is smuggled in. The evaluation is performed on the task's own test distribution, which is the standard, non-circular protocol for such papers. The derivation chain is therefore empty and self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work consists of empirical fine-tuning of pre-trained LLMs; no free parameters, axioms, or invented entities are introduced beyond standard supervised learning practices.

pith-pipeline@v0.9.0 · 5444 in / 966 out tokens · 29275 ms · 2026-05-09T22:39:34.627923+00:00 · methodology

discussion (0)

