Recognition: no theorem link
UTS at PsyDefDetect: Multi-Agent Councils and Absence-Based Reasoning for Defense Mechanism Classification
Pith reviewed 2026-05-13 05:55 UTC · model grok-4.3
The pith
Defense mechanisms are classified by what they omit in affect and cognition using a council of agents that rate evidence strength instead of voting.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Defense mechanisms are identified by what is absent rather than present: missing affect, blocked cognition, or denied reality. This absence is encoded as an affect-cognition integration spectrum in clinical rules. These rules are applied inside a multi-phase deliberative council of Gemini 2.5 agents in which class-specific advocates rate the strength of evidence for each mechanism instead of voting, yielding effective classification in dialogues.
What carries the argument
The multi-phase deliberative council of class-specific advocates that rate evidence strength on an affect-cognition integration spectrum rather than voting
Load-bearing premise
The hand-crafted affect-cognition integration rules accurately capture the definitions of defense mechanisms without introducing systematic bias, and the 16 overrides chosen by the builder-critic-regression system generalize beyond the test set used.
What would settle it
Applying the full council system to a new collection of emotional support dialogues and measuring whether the minority-class predictions align with independent clinical labels at the same rate as on the original data would confirm or refute the central claim.
Figures
read the original abstract
This paper describes our system for classifying psychological defense mechanisms in emotional support dialogues using the Defense Mechanism Rating Scales (DMRS), placing second (F1 0.406) among 64 teams. A central insight is that defense mechanisms are defined by what is absent: missing affect, blocked cognition, denied reality. We encode this as an affect-cognition integration spectrum in prompt-level clinical rules, which account for the largest single gain (+11.4pp F1). Our architecture is a multi-phase deliberative council of Gemini 2.5 agents where class-specific advocates rate evidence strength rather than voting, achieving F1 0.382 with no fine-tuning - a top-5 result on its own. We find, however, that the council is confidently wrong about minority classes: 59-80% of stable minority predictions are incorrect, driven by a systematic "L7 attractor" in which emotional content defaults to the majority class. A targeted override ensemble from three fine-tuned Qwen3.5 models applies 16 overrides (+2.4pp), selected by a structured multi-agent system (builder, critic, regression guard) that produced a larger F1 gain in one iteration than 8 prior attempts combined.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a system for classifying psychological defense mechanisms in emotional support dialogues using the DMRS, achieving second place (F1 0.406) among 64 teams. It encodes defense mechanisms via absence-based reasoning in an affect-cognition integration spectrum of prompt-level clinical rules (+11.4pp F1 gain), deploys a multi-phase deliberative council of Gemini 2.5 agents where class-specific advocates rate evidence strength rather than vote (F1 0.382 with no fine-tuning, top-5 result), and applies 16 targeted overrides from three fine-tuned Qwen3.5 models (+2.4pp) selected by a builder-critic-regression guard multi-agent system. The work identifies a systematic L7 attractor causing 59-80% error on minority classes.
Significance. If the gains generalize, the work shows that absence-based clinical rules and evidence-rating deliberative councils can deliver competitive performance on imbalanced, nuanced classification without fine-tuning the primary model. Explicit component-wise gains and the identification of the L7 attractor provide concrete, falsifiable insights for multi-agent design in domain-specific tasks. The no-fine-tuning council result is a clear strength.
major comments (2)
- [Methods section on override ensemble and results decomposition] The description of the 16 overrides (selected via the builder-critic-regression guard system) does not specify whether the iterative selection and regression guard operated on a strict held-out development set or had access to test-set labels/performance during tuning. This is load-bearing for the central claim because the +2.4pp gain is presented as a key contribution on top of the already-competitive council, yet the skeptic note and abstract together indicate post-hoc tuning on competition data; without explicit confirmation of independence from the test metric, the reported generalization is not secured.
- [Results and analysis (including abstract)] No error bars, statistical significance tests, or full ablation tables are provided for the component contributions or minority-class performance. This undermines confidence in the +11.4pp and +2.4pp deltas and the claim that the overrides fix the 59-80% minority error rate, especially given the noted L7 attractor.
minor comments (1)
- [Analysis of council behavior] The term 'L7 attractor' is introduced without a precise definition or example in the main text; a short illustrative dialogue snippet would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the value of our no-fine-tuning council result and the L7 attractor analysis. We address each major comment below with clarifications and commit to targeted revisions that strengthen methodological transparency and statistical support without altering the core claims.
read point-by-point responses
-
Referee: The description of the 16 overrides (selected via the builder-critic-regression guard system) does not specify whether the iterative selection and regression guard operated on a strict held-out development set or had access to test-set labels/performance during tuning. This is load-bearing for the central claim because the +2.4pp gain is presented as a key contribution on top of the already-competitive council, yet the skeptic note and abstract together indicate post-hoc tuning on competition data; without explicit confirmation of independence from the test metric, the reported generalization is not secured.
Authors: We acknowledge the importance of this distinction for securing the generalization claim. The final selection of the 16 overrides and the regression guard operated exclusively on a held-out development set constructed from the training data; test labels were never accessed during iterative tuning or guard application. The skeptic note refers only to preliminary exploratory runs that were discarded before locking the override set. We will revise the Methods section to include an explicit data-split diagram, pseudocode for the builder-critic-regression pipeline, and a clear statement confirming test-set independence. This revision will be accompanied by the exact development-set size and selection criteria. revision: yes
-
Referee: No error bars, statistical significance tests, or full ablation tables are provided for the component contributions or minority-class performance. This undermines confidence in the +11.4pp and +2.4pp deltas and the claim that the overrides fix the 59-80% minority error rate, especially given the noted L7 attractor.
Authors: We agree that the lack of error bars and formal significance testing reduces confidence in the reported deltas. In the revised manuscript we will add: bootstrap-derived 95% confidence intervals (1,000 resamples) for all F1 scores; McNemar’s tests for the significance of the +11.4pp (rules) and +2.4pp (overrides) gains; a complete ablation table with incremental contributions and separate minority-class F1 columns; and expanded quantitative analysis of the L7 attractor, including its error rates before and after overrides across multiple runs. These additions will directly support the component-wise claims and the minority-class correction narrative. revision: yes
Circularity Check
No significant circularity; derivation remains self-contained against held-out data
full rationale
The paper's central claims rest on a multi-agent council architecture and hand-crafted clinical rules derived from DMRS definitions of absence, with final performance reported on held-out competition data. Override selection via the builder-critic-regression guard is presented as an internal system step that improved F1, but the provided text does not contain equations, self-definitions, or descriptions showing that test labels were used during selection or that the gain reduces to a fit by construction. No self-citation chains, uniqueness theorems, or ansatz smuggling appear in the abstract or described sections. The result is externally grounded by the competition evaluation, satisfying the criteria for a non-circular finding.
Axiom & Free-Parameter Ledger
free parameters (2)
- Affect-cognition integration spectrum rules
- 16 targeted overrides
axioms (1)
- domain assumption Defense mechanisms are defined by absences (missing affect, blocked cognition, denied reality).
Reference graph
Works this paper leans on
-
[1]
Na, Hongbin and Wang, Zimu and Chen, Zhaoming and Hua, Yining and Gao, Rena and Yang, Kailai and Chen, Ling and Wang, Wei and Ji, Shaoxiong and Torous, John and Ananiadou, Sophia , booktitle=. Overview of the. 2026 , address=
work page 2026
-
[2]
Findings of the Association for Computational Linguistics: ACL 2026 , month=jul, year=
You Never Know a Person, You Only Know Their Defenses: Detecting Levels of Psychological Defense Mechanisms in Supportive Conversations , author=. Findings of the Association for Computational Linguistics: ACL 2026 , month=jul, year=
work page 2026
-
[3]
A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions
A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions , author=. Findings of the Association for Computational Linguistics: ACL 2025 , month=jul, year=. doi:10.18653/v1/2025.findings-acl.385 , pages=
-
[4]
Perry, J. Christopher , institution=. Defense. 1990 , edition=
work page 1990
-
[5]
Self-Consistency Improves Chain of Thought Reasoning in Language Models , author=. Proceedings of ICLR , year=
-
[6]
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Improving Factuality and Reasoning in Language Models through Multiagent Debate , author=. arXiv preprint arXiv:2305.14325 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[7]
Can Generalist Foundation Models Outcompete Special-Purpose Tuning?
Nori, Harsha and Lee, Yin Tat and Zhang, Sheng and Carignan, Dean and Edgar, Richard and Fusi, Nicolo and King, Nicholas and Larson, Jonathan and Li, Yuanzhi and Liu, Weishung and others , journal=. Can Generalist Foundation Models Outcompete Special-Purpose Tuning?
-
[8]
Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shanen and Wang, Lu and Chen, Weizhu , journal=
-
[9]
Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke , journal=
-
[10]
Gemini: A Family of Highly Capable Multimodal Models
Gemini: A Family of Highly Capable Multimodal Models , author=. arXiv preprint arXiv:2312.11805 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[11]
2026 , howpublished=
work page 2026
- [12]
- [13]
-
[14]
Scikit-learn: Machine Learning in
Pedregosa, Fabian and Varoquaux, Ga. Scikit-learn: Machine Learning in. Journal of Machine Learning Research , volume=
-
[15]
Unsloth: Fast Language Model Fine-tuning , author=. 2024 , howpublished=
work page 2024
-
[16]
von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan , year=
-
[17]
Energy and Policy Considerations for Deep Learning in
Strubell, Emma and Ganesh, Ananya and McCallum, Andrew , booktitle=. Energy and Policy Considerations for Deep Learning in
- [18]
-
[19]
Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo. Proceedings of EACL , pages=
-
[20]
Lee, Chankyu and Roy, Rajarshi and Xu, Menber and Raiman, Jonathan and Shoeybi, Mohammad and Catanzaro, Bryan and Han, Wei , journal=
-
[21]
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Text Embeddings by Weakly-Supervised Contrastive Pre-training , author=. arXiv preprint arXiv:2212.03533 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[22]
Irving, Geoffrey and Christiano, Paul and Amodei, Dario , journal=
-
[23]
Training Verifiers to Solve Math Word Problems
Training Verifiers to Solve Math Word Problems , author=. arXiv preprint arXiv:2110.14168 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[24]
Proceedings of NeurIPS , year=
Self-Refine: Iterative Refinement with Self-Feedback , author=. Proceedings of NeurIPS , year=
-
[25]
Retrieval-Augmented Generation for Knowledge-Intensive
Lewis, Patrick and Perez, Ethan and Piktus, Aleksandra and Petroni, Fabio and Karpukhin, Vladimir and Goyal, Naman and K. Retrieval-Augmented Generation for Knowledge-Intensive. Advances in Neural Information Processing Systems , volume=
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.