Self-training restructures language by amplifying surface markers and collapsing deep syntax according to structural depth rather than frequency, as evidenced by correlations across multiple models and a human fine-tuning control.
Preprint
3 Pith papers cite this work.
fields: cs.CL (3)
years: 2026 (3)
representative citing papers
AI-generated text detectors achieve high benchmark accuracy by exploiting unstable dataset-specific linguistic features, as evidenced by cross-domain degradation and differing SHAP explanations across corpora.
Genre and model exert stronger influence on writing style than human/LLM source or decoding strategy in a broad comparison of lexicogrammatical features.
citing papers explorer
- Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies
  Self-training restructures language by amplifying surface markers and collapsing deep syntax according to structural depth rather than frequency, as evidenced by correlations across multiple models and a human fine-tuning control.
- Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy
  AI-generated text detectors achieve high benchmark accuracy by exploiting unstable dataset-specific linguistic features, as evidenced by cross-domain degradation and differing SHAP explanations across corpora.
- Interpretable Stylistic Variation in Human and LLM Writing Across Genres, Models, and Decoding Strategies
  Genre and model exert stronger influence on writing style than human/LLM source or decoding strategy in a broad comparison of lexicogrammatical features.