arxiv: 2605.09769 · v2 · submitted 2026-05-10 · 💻 cs.AI

Recognition: no theorem link

UTS at PsyDefDetect: Multi-Agent Councils and Absence-Based Reasoning for Defense Mechanism Classification

Dima Galat , Marian-Andrei Rizoiu

Authors on Pith no claims yet

Pith reviewed 2026-05-13 05:55 UTC · model grok-4.3

classification 💻 cs.AI

keywords defense mechanism classificationmulti-agent councilsabsence-based reasoningpsychological defense mechanismsaffect-cognition spectrumemotional support dialoguesevidence rating agents

0 comments

The pith

Defense mechanisms are classified by what they omit in affect and cognition using a council of agents that rate evidence strength instead of voting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that defense mechanisms in emotional support dialogues are defined by absences such as missing affect, blocked cognition, or denied reality. Encoding these absences through prompt-level rules on an affect-cognition integration spectrum forms the core of the classification approach and produces the largest single gain. A multi-phase council of agents, with class-specific advocates evaluating evidence strength rather than casting votes, applies these rules to reach competitive results without any fine-tuning of the base models. The system also reveals systematic errors on minority classes that are later addressed through targeted overrides selected by a builder-critic-regression process.

Core claim

Defense mechanisms are identified by what is absent rather than present: missing affect, blocked cognition, or denied reality. This absence is encoded as an affect-cognition integration spectrum in clinical rules. These rules are applied inside a multi-phase deliberative council of Gemini 2.5 agents in which class-specific advocates rate the strength of evidence for each mechanism instead of voting, yielding effective classification in dialogues.

What carries the argument

The multi-phase deliberative council of class-specific advocates that rate evidence strength on an affect-cognition integration spectrum rather than voting

Load-bearing premise

The hand-crafted affect-cognition integration rules accurately capture the definitions of defense mechanisms without introducing systematic bias, and the 16 overrides chosen by the builder-critic-regression system generalize beyond the test set used.

What would settle it

Applying the full council system to a new collection of emotional support dialogues and measuring whether the minority-class predictions align with independent clinical labels at the same rate as on the original data would confirm or refute the central claim.

Figures

Figures reproduced from arXiv: 2605.09769 by Dima Galat, Marian-Andrei Rizoiu.

**Figure 2.** Figure 2: Override ensemble. The council’s predictions [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Both multi-agent systems follow a propose– [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

read the original abstract

This paper describes our system for classifying psychological defense mechanisms in emotional support dialogues using the Defense Mechanism Rating Scales (DMRS), placing second (F1 0.406) among 64 teams. A central insight is that defense mechanisms are defined by what is absent: missing affect, blocked cognition, denied reality. We encode this as an affect-cognition integration spectrum in prompt-level clinical rules, which account for the largest single gain (+11.4pp F1). Our architecture is a multi-phase deliberative council of Gemini 2.5 agents where class-specific advocates rate evidence strength rather than voting, achieving F1 0.382 with no fine-tuning - a top-5 result on its own. We find, however, that the council is confidently wrong about minority classes: 59-80% of stable minority predictions are incorrect, driven by a systematic "L7 attractor" in which emotional content defaults to the majority class. A targeted override ensemble from three fine-tuned Qwen3.5 models applies 16 overrides (+2.4pp), selected by a structured multi-agent system (builder, critic, regression guard) that produced a larger F1 gain in one iteration than 8 prior attempts combined.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a practical multi-agent LLM entry for DMRS classification that turns absence-based clinical rules into prompts and reports clear F1 gains on competition data, but the 16 overrides look like the weakest link for generalization.

read the letter

The core of this work is a multi-agent council of Gemini 2.5 agents that classifies defense mechanisms by rating evidence strength per class rather than voting, plus a set of hand-crafted rules that encode DMRS definitions around missing affect or blocked cognition. The rules alone add 11.4 points of F1, and the council by itself reaches 0.382 without fine-tuning, which placed it in the top five on the shared task. They also document a concrete failure mode where the system collapses to the majority class on emotional content, calling it the L7 attractor, and they try to patch it with 16 targeted overrides from fine-tuned Qwen models selected by a builder-critic-regression guard. That patch adds another 2.4 points and gets them to second place overall at 0.406 F1. The absence framing and the shift to evidence-strength ratings are the freshest parts; they turn a clinical definition directly into prompt logic in a way that feels more structured than typical multi-agent setups. The numbers are reported on held-out competition data, which gives some external grounding, and the authors are upfront about the minority-class problems. The soft spots sit mostly around the overrides. The selection process uses a multi-agent guard that could have seen test-set signals during iteration, and there are no error bars, significance tests, or clear hold-out validation for those 16 rules. The paper already shows 59-80% error on minorities, so any claim that the overrides fix the issue needs tighter controls to be convincing. Scope is narrow—one dataset, no broader theory on defense mechanisms or general reasoning. This is useful for people running psychological NLP shared tasks or testing multi-agent prompting on imbalanced classification. It is honest engineering with measurable deltas and acknowledged limits, so it deserves a serious referee even if the overrides need more scrutiny before the gains can be trusted outside this specific test set.

Referee Report

2 major / 1 minor

Summary. The paper presents a system for classifying psychological defense mechanisms in emotional support dialogues using the DMRS, achieving second place (F1 0.406) among 64 teams. It encodes defense mechanisms via absence-based reasoning in an affect-cognition integration spectrum of prompt-level clinical rules (+11.4pp F1 gain), deploys a multi-phase deliberative council of Gemini 2.5 agents where class-specific advocates rate evidence strength rather than vote (F1 0.382 with no fine-tuning, top-5 result), and applies 16 targeted overrides from three fine-tuned Qwen3.5 models (+2.4pp) selected by a builder-critic-regression guard multi-agent system. The work identifies a systematic L7 attractor causing 59-80% error on minority classes.

Significance. If the gains generalize, the work shows that absence-based clinical rules and evidence-rating deliberative councils can deliver competitive performance on imbalanced, nuanced classification without fine-tuning the primary model. Explicit component-wise gains and the identification of the L7 attractor provide concrete, falsifiable insights for multi-agent design in domain-specific tasks. The no-fine-tuning council result is a clear strength.

major comments (2)

[Methods section on override ensemble and results decomposition] The description of the 16 overrides (selected via the builder-critic-regression guard system) does not specify whether the iterative selection and regression guard operated on a strict held-out development set or had access to test-set labels/performance during tuning. This is load-bearing for the central claim because the +2.4pp gain is presented as a key contribution on top of the already-competitive council, yet the skeptic note and abstract together indicate post-hoc tuning on competition data; without explicit confirmation of independence from the test metric, the reported generalization is not secured.
[Results and analysis (including abstract)] No error bars, statistical significance tests, or full ablation tables are provided for the component contributions or minority-class performance. This undermines confidence in the +11.4pp and +2.4pp deltas and the claim that the overrides fix the 59-80% minority error rate, especially given the noted L7 attractor.

minor comments (1)

[Analysis of council behavior] The term 'L7 attractor' is introduced without a precise definition or example in the main text; a short illustrative dialogue snippet would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the value of our no-fine-tuning council result and the L7 attractor analysis. We address each major comment below with clarifications and commit to targeted revisions that strengthen methodological transparency and statistical support without altering the core claims.

read point-by-point responses

Referee: The description of the 16 overrides (selected via the builder-critic-regression guard system) does not specify whether the iterative selection and regression guard operated on a strict held-out development set or had access to test-set labels/performance during tuning. This is load-bearing for the central claim because the +2.4pp gain is presented as a key contribution on top of the already-competitive council, yet the skeptic note and abstract together indicate post-hoc tuning on competition data; without explicit confirmation of independence from the test metric, the reported generalization is not secured.

Authors: We acknowledge the importance of this distinction for securing the generalization claim. The final selection of the 16 overrides and the regression guard operated exclusively on a held-out development set constructed from the training data; test labels were never accessed during iterative tuning or guard application. The skeptic note refers only to preliminary exploratory runs that were discarded before locking the override set. We will revise the Methods section to include an explicit data-split diagram, pseudocode for the builder-critic-regression pipeline, and a clear statement confirming test-set independence. This revision will be accompanied by the exact development-set size and selection criteria. revision: yes
Referee: No error bars, statistical significance tests, or full ablation tables are provided for the component contributions or minority-class performance. This undermines confidence in the +11.4pp and +2.4pp deltas and the claim that the overrides fix the 59-80% minority error rate, especially given the noted L7 attractor.

Authors: We agree that the lack of error bars and formal significance testing reduces confidence in the reported deltas. In the revised manuscript we will add: bootstrap-derived 95% confidence intervals (1,000 resamples) for all F1 scores; McNemar’s tests for the significance of the +11.4pp (rules) and +2.4pp (overrides) gains; a complete ablation table with incremental contributions and separate minority-class F1 columns; and expanded quantitative analysis of the L7 attractor, including its error rates before and after overrides across multiple runs. These additions will directly support the component-wise claims and the minority-class correction narrative. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained against held-out data

full rationale

The paper's central claims rest on a multi-agent council architecture and hand-crafted clinical rules derived from DMRS definitions of absence, with final performance reported on held-out competition data. Override selection via the builder-critic-regression guard is presented as an internal system step that improved F1, but the provided text does not contain equations, self-definitions, or descriptions showing that test labels were used during selection or that the gain reduces to a fit by construction. No self-citation chains, uniqueness theorems, or ansatz smuggling appear in the abstract or described sections. The result is externally grounded by the competition evaluation, satisfying the criteria for a non-circular finding.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central performance claims rest on hand-crafted prompt rules and a small number of post-hoc overrides rather than end-to-end learning; these constitute free parameters whose values were chosen to maximize the reported F1.

free parameters (2)

Affect-cognition integration spectrum rules
Prompt-level clinical rules encoding absence-based definitions; responsible for the largest reported gain of +11.4pp F1.
16 targeted overrides
Selected corrections from fine-tuned Qwen models applied to minority-class errors.

axioms (1)

domain assumption Defense mechanisms are defined by absences (missing affect, blocked cognition, denied reality).
Used as the foundation for constructing the prompt-level clinical rules.

pith-pipeline@v0.9.0 · 5520 in / 1416 out tokens · 36757 ms · 2026-05-13T05:55:34.508190+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 4 internal anchors

[1]

Overview of the

Na, Hongbin and Wang, Zimu and Chen, Zhaoming and Hua, Yining and Gao, Rena and Yang, Kailai and Chen, Ling and Wang, Wei and Ji, Shaoxiong and Torous, John and Ananiadou, Sophia , booktitle=. Overview of the. 2026 , address=

work page 2026
[2]

Findings of the Association for Computational Linguistics: ACL 2026 , month=jul, year=

You Never Know a Person, You Only Know Their Defenses: Detecting Levels of Psychological Defense Mechanisms in Supportive Conversations , author=. Findings of the Association for Computational Linguistics: ACL 2026 , month=jul, year=

work page 2026
[3]

A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions

A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions , author=. Findings of the Association for Computational Linguistics: ACL 2025 , month=jul, year=. doi:10.18653/v1/2025.findings-acl.385 , pages=

work page doi:10.18653/v1/2025.findings-acl.385 2025
[4]

Christopher , institution=

Perry, J. Christopher , institution=. Defense. 1990 , edition=

work page 1990
[5]

Proceedings of ICLR , year=

Self-Consistency Improves Chain of Thought Reasoning in Language Models , author=. Proceedings of ICLR , year=

work page
[6]

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Improving Factuality and Reasoning in Language Models through Multiagent Debate , author=. arXiv preprint arXiv:2305.14325 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[7]

Can Generalist Foundation Models Outcompete Special-Purpose Tuning?

Nori, Harsha and Lee, Yin Tat and Zhang, Sheng and Carignan, Dean and Edgar, Richard and Fusi, Nicolo and King, Nicholas and Larson, Jonathan and Li, Yuanzhi and Liu, Weishung and others , journal=. Can Generalist Foundation Models Outcompete Special-Purpose Tuning?

work page
[8]

Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shanen and Wang, Lu and Chen, Weizhu , journal=

work page
[9]

Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke , journal=

work page
[10]

Gemini: A Family of Highly Capable Multimodal Models

Gemini: A Family of Highly Capable Multimodal Models , author=. arXiv preprint arXiv:2312.11805 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[11]

2026 , howpublished=

work page 2026
[12]

OpenAI Technical Reports , year=

The. OpenAI Technical Reports , year=

work page
[13]

2026 , howpublished=

Claude Opus 4.6 , author=. 2026 , howpublished=

work page 2026
[14]

Scikit-learn: Machine Learning in

Pedregosa, Fabian and Varoquaux, Ga. Scikit-learn: Machine Learning in. Journal of Machine Learning Research , volume=

work page
[15]

2024 , howpublished=

Unsloth: Fast Language Model Fine-tuning , author=. 2024 , howpublished=

work page 2024
[16]

von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan , year=

work page
[17]

Energy and Policy Considerations for Deep Learning in

Strubell, Emma and Ganesh, Ananya and McCallum, Andrew , booktitle=. Energy and Policy Considerations for Deep Learning in

work page
[18]

The use of

Carbonell, Jaime and Goldstein, Jade , booktitle=. The use of

work page
[19]

Proceedings of EACL , pages=

Muennighoff, Niklas and Tazi, Nouamane and Magne, Lo. Proceedings of EACL , pages=

work page
[20]

Lee, Chankyu and Roy, Rajarshi and Xu, Menber and Raiman, Jonathan and Shoeybi, Mohammad and Catanzaro, Bryan and Han, Wei , journal=

work page
[21]

Text Embeddings by Weakly-Supervised Contrastive Pre-training

Text Embeddings by Weakly-Supervised Contrastive Pre-training , author=. arXiv preprint arXiv:2212.03533 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[22]

Irving, Geoffrey and Christiano, Paul and Amodei, Dario , journal=

work page
[23]

Training Verifiers to Solve Math Word Problems

Training Verifiers to Solve Math Word Problems , author=. arXiv preprint arXiv:2110.14168 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[24]

Proceedings of NeurIPS , year=

Self-Refine: Iterative Refinement with Self-Feedback , author=. Proceedings of NeurIPS , year=

work page
[25]

Retrieval-Augmented Generation for Knowledge-Intensive

Lewis, Patrick and Perez, Ethan and Piktus, Aleksandra and Petroni, Fabio and Karpukhin, Vladimir and Goyal, Naman and K. Retrieval-Augmented Generation for Knowledge-Intensive. Advances in Neural Information Processing Systems , volume=

work page