arxiv: 2509.19742 · v4 · submitted 2025-09-24 · 💻 cs.CL · cs.AI · cs.IR

HiCoLoRA: Addressing Context-Prompt Misalignment via Hierarchical Collaborative LoRA for Zero-Shot DST

Pith reviewed 2026-05-18 14:49 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.IR
keywords zero-shot dialog state tracking · LoRA adaptation · prompt alignment · hierarchical architecture · domain clustering · task-oriented dialog · semantic misalignment

The pith

A hierarchical LoRA setup aligns changing dialog contexts with fixed prompts to support zero-shot state tracking across domains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies semantic misalignment between dynamic conversation contexts and static prompts as the core obstacle preventing models from tracking dialog states in entirely new domains. This misalignment produces inflexible layer coordination, unwanted domain interference, and loss of prior knowledge when adapting to unseen tasks. HiCoLoRA counters the issue with a two-level LoRA structure that applies heuristic grouping at lower layers and full interaction at higher layers, plus clustering to link related domains and slots, and a special initialization that retains original model weights. If the approach works, task-oriented dialog systems could add new domains without collecting and labeling fresh examples for each one. Results on standard multi-domain benchmarks indicate gains over existing adaptation techniques.

Core claim

Semantic misalignment between dynamic dialog contexts and static prompts creates inflexible cross-layer coordination, domain interference, and catastrophic forgetting in zero-shot DST. HiCoLoRA resolves this through a hierarchical LoRA architecture that performs dynamic layer-specific processing via lower-layer heuristic grouping and higher-layer full interaction, combined with Spectral Joint Domain-Slot Clustering that feeds an Adaptive Linear Fusion Mechanism, and Semantic-Enhanced SVD Initialization to preserve pre-trained knowledge, yielding state-of-the-art performance on multi-domain datasets.
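
To make the clustering ingredient concrete, here is a minimal sketch of a two-way spectral partition over a slot similarity matrix. It is not the paper's Spectral Joint Domain-Slot Clustering, whose exact formulation is not given on this page. The slot names, the similarity values, and the function name `spectral_bipartition` are all hypothetical, chosen only for illustration.

```python
import numpy as np

def spectral_bipartition(S):
    """Split items into two groups using the Fiedler vector of the
    symmetric normalized Laplacian L = I - D^{-1/2} S D^{-1/2}."""
    d = S.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                # second-smallest eigenvector
    return (fiedler > 0).astype(int)       # sign gives the partition

# Hypothetical domain-slot pairs and a hand-made similarity matrix:
# the two "area" slots are similar, as are the two "leaveat" slots.
slots = ["hotel-area", "restaurant-area", "train-leaveat", "taxi-leaveat"]
S = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
])
labels = spectral_bipartition(S)
```

Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. A real system would derive `S` from learned slot and domain representations rather than fixed numbers.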

What carries the argument

Hierarchical Collaborative LoRA (HiCoLoRA), a layered low-rank adaptation structure that routes context-prompt alignment differently at lower and higher model layers while using joint clustering to surface transferable domain-slot links.
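
For readers unfamiliar with the base technique, the following sketch shows standard LoRA only, not the hierarchical variant the paper proposes. The dimensions, the scaling value `alpha`, and the function name `lora_forward` are illustrative assumptions.

```python
import numpy as np

def lora_forward(x, W0, A, B, alpha):
    """Standard LoRA: frozen weight W0 plus a scaled low-rank update B @ A."""
    r = A.shape[0]
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 2                 # illustrative sizes
W0 = rng.normal(size=(d_out, d_in))      # frozen pre-trained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero at start
x = rng.normal(size=(4, d_in))           # batch of 4 input vectors

y_base = x @ W0.T
y_lora = lora_forward(x, W0, A, B, alpha=16.0)
unchanged_at_init = np.allclose(y_base, y_lora)
```

With `B` initialized to zero, the adapted layer reproduces the pre-trained output exactly before any training step. That is the baseline behavior any alternative initialization scheme has to be weighed against.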

If this is right

  • Task-oriented dialog systems can incorporate new domains using only existing model weights and no new labeled data.
  • Models experience less interference when switching between multiple active domains in a single conversation.
  • Pre-trained language model knowledge is retained more reliably during adaptation for dialog tasks.
  • Cross-layer information flow improves because lower layers handle local patterns while higher layers manage global alignment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same layered adaptation plus clustering pattern could be tested on other prompt-sensitive tasks such as zero-shot information extraction or instruction following.
  • If the clustering step reliably identifies semantic neighbors, it might reduce the need for manual domain definitions in future dialog datasets.
  • Applying the method to larger base models would test whether the alignment benefit scales or whether the added mechanisms become redundant.

Load-bearing premise

Semantic misalignment between contexts and prompts is the main cause of failure in zero-shot DST, and the added hierarchical processing and clustering can correct it without creating overfitting or interference.

What would settle it

Running the same MultiWOZ and SGD test sets with the hierarchical structure or the domain-slot clustering removed and finding that performance stays the same or improves would show the components are not required.

Figures

Figures reproduced from arXiv: 2509.19742 by Bin Li, Shuyu Zhang, Xinru Wang, Yangfan He, Yanmin Zhu, Yifan Wei, Yixuan Weng, Yujie Liu.

Figure 1: Three critical challenges motivating our work: (1) Architectural rigidity hinders cross…
Figure 2: Overview of the HiCoLoRA framework, which combines: (1) UniRep-LoRA and…
Figure 3: Accuracy of HiCoLoRA with differ…
Figure 5: Example Attention Maps of the First and Last Transformer Layers in HiCoLoRA.
Figure 6: Success Case 1
Figure 7: Success Case 2
Figure 8: Failure Pattern 1: Ambiguous Slot Boundary Cases
Figure 9: Failure Pattern 2: Cross-Domain Confusion
Original abstract

Zero-shot Dialog State Tracking (zs-DST) is essential for enabling Task-Oriented Dialog Systems (TODs) to generalize to new domains without costly data annotation. A central challenge lies in the semantic misalignment between dynamic dialog contexts and static prompts, leading to inflexible cross-layer coordination, domain interference, and catastrophic forgetting. To tackle this, we propose Hierarchical Collaborative Low-Rank Adaptation (HiCoLoRA), a framework that enhances zero-shot slot inference through robust prompt alignment. It features a hierarchical LoRA architecture for dynamic layer-specific processing (combining lower-layer heuristic grouping and higher-layer full interaction), integrates Spectral Joint Domain-Slot Clustering to identify transferable associations (feeding an Adaptive Linear Fusion Mechanism), and employs Semantic-Enhanced SVD Initialization (SemSVD-Init) to preserve pre-trained knowledge. Experiments on multi-domain datasets MultiWOZ and SGD show that HiCoLoRA outperforms baselines, achieving SOTA in zs-DST. Code is available at https://github.com/carsonz/HiCoLoRA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces HiCoLoRA, a hierarchical collaborative LoRA framework for zero-shot dialog state tracking (zs-DST). It targets semantic misalignment between dynamic dialog contexts and static prompts through a hierarchical LoRA architecture (lower-layer heuristic grouping combined with higher-layer full interaction), Spectral Joint Domain-Slot Clustering to identify transferable associations, an Adaptive Linear Fusion Mechanism, and Semantic-Enhanced SVD Initialization (SemSVD-Init) to preserve pre-trained knowledge. Experiments on MultiWOZ and SGD are reported to show outperformance over baselines and SOTA results in zs-DST, with code released.

Significance. If the performance gains are shown to stem specifically from misalignment resolution rather than generic capacity increases, the work could meaningfully advance parameter-efficient adaptation methods for task-oriented dialog systems by improving cross-domain generalization and reducing forgetting. The public code release supports reproducibility and is a clear strength.

major comments (3)
  1. [§4] §4 (Experiments): The reported SOTA results on MultiWOZ and SGD lack ablation controls that match total parameter count against a standard (non-hierarchical) LoRA baseline; without this, it is impossible to determine whether gains arise from the proposed hierarchical collaboration and clustering or from simply adding more trainable parameters and mechanisms.
  2. [§3.3] §3.3 (Spectral Joint Domain-Slot Clustering): The claim that this clustering resolves domain interference and enables transferable associations is central to the misalignment-resolution narrative, yet no quantitative analysis (e.g., cluster coherence metrics or transfer-gap measurements before/after clustering) is provided to show it outperforms random or heuristic grouping.
  3. [§4.2] §4.2 (Ablation studies): The individual contributions of SemSVD-Init, Adaptive Linear Fusion, and the lower/higher-layer split are not isolated under fixed total rank; this leaves open the possibility that observed improvements are due to the cumulative effect of multiple interacting components rather than targeted prompt-context alignment.
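
The parameter-matching controls requested in the major comments come down to simple arithmetic: a rank-r LoRA adapter on a d_out × d_in weight adds r · (d_in + d_out) trainable parameters. The sketch below computes the largest uniform rank that fits a given budget. The layer shapes and the budget are hypothetical placeholders, not figures from the paper.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter: A is (rank, d_in), B is (d_out, rank)."""
    return rank * (d_in + d_out)

def matched_rank(budget, shapes):
    """Largest uniform rank whose total LoRA parameter count stays within budget."""
    per_rank = sum(d_in + d_out for d_in, d_out in shapes)
    return budget // per_rank

# Hypothetical setup: 12 layers, adapters on two 768x768 projections per layer.
shapes = [(768, 768)] * 24
budget = 600_000  # hypothetical trainable-parameter count of the method being matched

r = matched_rank(budget, shapes)
total = sum(lora_params(d_in, d_out, r) for d_in, d_out in shapes)
```

In this made-up configuration the matched baseline uses rank 16 with 589,824 trainable parameters. Any extra parameters in fusion or gating modules would also need to be counted against the budget.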
minor comments (2)
  1. [Abstract] The abstract states SOTA results without any numeric values, dataset splits, or baseline names; this should be supplemented with at least one key metric (e.g., joint goal accuracy) for immediate readability.
  2. [§3.1] Notation for the hierarchical LoRA update rule (lower-layer vs. higher-layer) is introduced without an explicit equation contrasting it to standard LoRA; adding a compact formulation would improve clarity.
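
For reference, the standard LoRA update that minor comment 2 asks the authors to contrast against can be written compactly as follows. This is the well-known baseline formulation only. The hierarchical update rule itself is not reproduced on this page, so no attempt is made to state it here.

```latex
% Standard LoRA: the pre-trained weight W_0 is frozen; only A and B are trained.
h = W_0 x + \Delta W\, x = W_0 x + \frac{\alpha}{r}\, B A\, x,
\qquad
B \in \mathbb{R}^{d_{\text{out}} \times r},\quad
A \in \mathbb{R}^{r \times d_{\text{in}}},\quad
r \ll \min(d_{\text{in}}, d_{\text{out}})
```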

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, agreeing where revisions are warranted to strengthen the evidence for our claims about misalignment resolution.

point-by-point responses
  1. Referee: [§4] §4 (Experiments): The reported SOTA results on MultiWOZ and SGD lack ablation controls that match total parameter count against a standard (non-hierarchical) LoRA baseline; without this, it is impossible to determine whether gains arise from the proposed hierarchical collaboration and clustering or from simply adding more trainable parameters and mechanisms.

    Authors: We agree this is a critical control. In the revised manuscript we will add a parameter-matched standard LoRA baseline (same total trainable parameters as HiCoLoRA) on both MultiWOZ and SGD. This will allow direct attribution of gains to the hierarchical structure and clustering rather than capacity alone. revision: yes

  2. Referee: [§3.3] §3.3 (Spectral Joint Domain-Slot Clustering): The claim that this clustering resolves domain interference and enables transferable associations is central to the misalignment-resolution narrative, yet no quantitative analysis (e.g., cluster coherence metrics or transfer-gap measurements before/after clustering) is provided to show it outperforms random or heuristic grouping.

    Authors: We acknowledge the absence of quantitative validation. We will include cluster coherence metrics (e.g., silhouette score) and transfer-gap measurements comparing spectral clustering against random and heuristic groupings in the revised §3.3 and experiments section. revision: yes

  3. Referee: [§4.2] §4.2 (Ablation studies): The individual contributions of SemSVD-Init, Adaptive Linear Fusion, and the lower/higher-layer split are not isolated under fixed total rank; this leaves open the possibility that observed improvements are due to the cumulative effect of multiple interacting components rather than targeted prompt-context alignment.

    Authors: This is a valid observation. We will revise the ablation studies in §4.2 to enforce fixed total rank across variants and isolate each component (SemSVD-Init, Adaptive Linear Fusion, layer split) individually, providing clearer evidence of their targeted contributions to alignment. revision: yes
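
The silhouette score mentioned in the response to comment 2 can be illustrated with a small pure-Python sketch. The 2-D points stand in for slot embeddings and are entirely hypothetical. The function is a plain implementation of the usual definition, (b - a) / max(a, b) averaged over points, not code from the paper's repository.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean distance); needs at least two clusters."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != l
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical embeddings forming two tight groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])      # grouping that matches the geometry
shuffled = silhouette_score(points, [0, 1, 0, 1])  # grouping that cuts across it
```

A coherent grouping scores close to 1 and a grouping that cuts across the geometry scores below 0. This is the kind of contrast a comparison against random or heuristic groupings would be expected to report.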

Circularity Check

0 steps flagged

No significant circularity in empirical architectural proposal

full rationale

The paper presents HiCoLoRA as an empirical framework combining hierarchical LoRA (lower-layer grouping plus higher-layer interaction), Spectral Joint Domain-Slot Clustering, Adaptive Linear Fusion, and SemSVD-Init to address semantic misalignment between dynamic contexts and static prompts in zero-shot DST. Claims rest on experimental outperformance on MultiWOZ and SGD benchmarks rather than any first-principles derivation or equation chain. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the described approach; the method is externally verifiable via released code and standard datasets, rendering the central claims self-contained and independently testable.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 2 invented entities

Abstract-only view limits identification of exact hyperparameters; the method rests on standard transfer-learning assumptions plus several new components whose internal parameters are not enumerated.

free parameters (2)
  • LoRA rank and layer split points
    Hierarchical architecture requires choosing which layers receive heuristic grouping versus full interaction; values not stated.
  • Clustering and fusion hyperparameters
    Spectral clustering and adaptive linear fusion typically involve tunable parameters fitted to validation data.
axioms (2)
  • domain assumption: Pre-trained models hold transferable slot and domain knowledge that can be preserved via SVD-based initialization.
    Invoked by the SemSVD-Init component to avoid catastrophic forgetting.
  • domain assumption: Semantic misalignment between context and prompt is the dominant failure mode in zs-DST.
    Stated as the central challenge the framework is designed to solve.
invented entities (2)
  • HiCoLoRA hierarchical architecture (no independent evidence)
    purpose: Dynamic layer-specific processing of dialog contexts and prompts
    New proposed structure combining lower-layer grouping and higher-layer interaction.
  • Spectral Joint Domain-Slot Clustering (no independent evidence)
    purpose: Identify transferable associations between domains and slots
    New clustering step feeding the fusion mechanism.
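
The first axiom above refers to SVD-based initialization. As a rough illustration of that general idea, the sketch below initializes the low-rank factors from the top singular directions of the pre-trained weight and freezes the residual, in the style of principal-component schemes such as PiSSA. It is not SemSVD-Init: the "semantic-enhanced" part is not described on this page, and the function name `svd_init` and the matrix sizes are assumptions.

```python
import numpy as np

def svd_init(W, r):
    """Split W into a frozen residual plus trainable factors B @ A
    built from the top-r singular triplets of W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(S[:r])
    B = U[:, :r] * root            # (d_out, r), columns scaled by sqrt(sigma)
    A = root[:, None] * Vt[:r]     # (r, d_in), rows scaled by sqrt(sigma)
    W_res = W - B @ A              # frozen residual weight
    return W_res, A, B

rng = np.random.default_rng(0)
W = rng.normal(size=(6, 8))        # stand-in for a pre-trained weight matrix
W_res, A, B = svd_init(W, r=2)
reconstructs = np.allclose(W_res + B @ A, W)
```

Because the residual plus the low-rank product equals the original weight, the model's output is unchanged at initialization. That is the sense in which such schemes are said to preserve pre-trained knowledge at the start of training.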

pith-pipeline@v0.9.0 · 5736 in / 1527 out tokens · 60279 ms · 2026-05-18T14:49:01.652048+00:00 · methodology


