Efficient Task Adaptation in Large Language Models via Selective Parameter Optimization
Pith reviewed 2026-05-10 06:18 UTC · model grok-4.3
The pith
Freezing core parameters during fine-tuning preserves general knowledge in LLMs while allowing task-specific adaptation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a parameter importance evaluation method can identify core parameters critical for general language ability and fix them during fine-tuning, updating only the non-core parameters sensitive to the target task. This, it argues, yields better domain adaptation without catastrophic forgetting, as shown on scientific, medical, and physical tasks with GPT-J and LLaMA-3.
What carries the argument
The parameter element importance evaluation method, which distinguishes parameters based on their contribution to general versus domain-specific tasks and enables selective freezing of the core set.
If this is right
- General language understanding remains intact because core parameters are not updated.
- Task performance on specific domains improves through targeted optimization of non-core parameters.
- Overall model transferability increases compared to traditional full-parameter fine-tuning.
- The method works across different model architectures like GPT-J and LLaMA-3.
Where Pith is reading between the lines
- Such selective optimization could lower the risk of models losing broad utility when specialized.
- Future work might explore automated ways to refine the importance evaluation for different tasks.
- This division highlights that not all parameters contribute equally to model capabilities.
Load-bearing premise
Model parameters can be meaningfully divided into core and non-core categories using an importance evaluation method such that freezing the core set preserves general knowledge without harming task learning.
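Operationally, the premise amounts to masking updates: gradients are applied only to the non-core set, so the frozen core weights carry general knowledge through fine-tuning unchanged. A minimal sketch, assuming plain SGD over a flat parameter list (the paper's actual optimizer and parameter layout are not specified in the text available here):

```python
def masked_sgd_step(params, grads, core_indices, lr=0.01):
    """Apply an SGD update to non-core parameters only.

    Core parameters (indices in core_indices) are left untouched,
    which is the mechanism claimed to preserve general ability.
    Plain SGD is an illustrative assumption, not the paper's optimizer.
    """
    return [
        p if i in core_indices else p - lr * g
        for i, (p, g) in enumerate(zip(params, grads))
    ]

# Toy example: parameter 0 is 'core' and stays fixed; 1 and 2 are updated.
params = [1.0, 2.0, 3.0]
new_params = masked_sgd_step(params, grads=[0.5, 0.5, 0.5],
                             core_indices={0}, lr=0.1)
```

In a real framework the same effect is usually achieved by zeroing gradients on the frozen set (e.g. via a binary mask) before the optimizer step, which keeps optimizer state consistent.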
What would settle it
A significant drop in general-language benchmark accuracy after selective fine-tuning, or domain-task performance that fails to match or exceed full fine-tuning, would refute the claim.
Original abstract
Large Language Models (LLMs) have demonstrated excellent performance in general language understanding, generation and other tasks. However, when fine-tuning for specific domain tasks, the general knowledge accumulated in the pre-training phase is often partially overwritten or forgotten due to parameter updates, which severely limits the generalization ability and transferability of LLMs. Traditional fine-tuning strategies mostly train on the entire parameter space, ignoring the heterogeneity of model parameters, that is, some parameters are extremely important for general tasks, while other parameters are more sensitive to specific tasks. To alleviate the above problems, this paper innovatively proposes a parameter element importance evaluation method, which divides parameters into "core parameters" and "non-core parameters" by distinguishing the importance of parameters for general language ability tasks and specific domain tasks, and fixes the core parameters during fine-tuning, and only fine-tunes the non-core parameters. Extensive experiments on scientific, medical and physical tasks using GPT-J and LLaMA-3 show that our method can mitigate catastrophic forgetting while enhancing the adaptability of the model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a selective parameter optimization method for adapting LLMs to domain tasks. It introduces a parameter importance evaluation to partition model parameters into 'core' (critical for general language ability) and 'non-core' (task-sensitive) sets, then freezes the core parameters while fine-tuning only the non-core ones. The central claim is that this mitigates catastrophic forgetting of pre-trained knowledge while improving adaptability, as demonstrated in experiments on scientific, medical, and physical tasks with GPT-J and LLaMA-3.
Significance. If the importance evaluation reliably identifies a non-arbitrary split that preserves general capabilities better than alternatives, the approach could enable more efficient domain adaptation than full fine-tuning or standard parameter-efficient methods, reducing compute costs and improving retention of broad knowledge in LLMs.
Major comments (3)
- [Abstract and §3] Abstract and §3 (Method): The parameter importance evaluation method used to divide parameters into core and non-core sets is not described with sufficient detail (e.g., no specification of the metric, computation procedure, or thresholds), preventing assessment of whether the partition has a causal link to forgetting or task sensitivity.
- [§4] §4 (Experiments): No ablation is reported comparing the proposed importance-based freezing against random selection of an equivalent fraction of parameters to update. Without this control, the experiments cannot isolate whether observed gains in retention and task performance arise from the specific evaluation or simply from updating fewer parameters overall.
- [§4] §4 (Experiments): The abstract and experimental description supply no quantitative results, baselines (e.g., full fine-tuning, LoRA, or other selective methods), or error analysis, so it is impossible to evaluate the magnitude of forgetting mitigation or statistical reliability of the claims.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and commit to revisions that will strengthen the clarity, controls, and reporting in the manuscript.
Point-by-point responses
-
Referee: [Abstract and §3] Abstract and §3 (Method): The parameter importance evaluation method used to divide parameters into core and non-core sets is not described with sufficient detail (e.g., no specification of the metric, computation procedure, or thresholds), preventing assessment of whether the partition has a causal link to forgetting or task sensitivity.
Authors: We agree that the current description lacks sufficient detail for full reproducibility and causal assessment. In the revised manuscript we will expand §3 with the precise importance metric (parameter sensitivity measured via gradient norms on held-out general-language data versus domain-task data), the full computation procedure (including data splits, scoring formula, and aggregation across layers), and the explicit threshold or top-k selection rule used to designate core versus non-core parameters. These additions will make the link to forgetting mitigation transparent. revision: yes
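The metric promised in this response could look like the following toy sketch: per-parameter gradient magnitudes are accumulated on held-out general-language data and on domain-task data, their ratio serves as an importance score, and a top-k rule selects the frozen core set. The scoring formula and the `core_fraction` parameter are illustrative assumptions, not the paper's published procedure:

```python
def partition_parameters(general_grads, domain_grads, core_fraction=0.3):
    """Split parameter indices into a 'core' set (high importance for
    general ability relative to the task) and a 'non-core' set.

    general_grads / domain_grads: per-parameter gradient magnitudes
    accumulated on general-language and domain-task data respectively.
    The ratio score below is an illustrative assumption.
    """
    eps = 1e-12  # guard against division by zero
    scores = [g / (d + eps) for g, d in zip(general_grads, domain_grads)]
    k = int(core_fraction * len(scores))
    # The k highest-scoring parameters form the frozen core set.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    core = set(ranked[:k])
    non_core = set(range(len(scores))) - core
    return core, non_core

# Toy example with four parameters and hand-picked gradient magnitudes:
core, non_core = partition_parameters(
    general_grads=[0.9, 0.1, 0.8, 0.05],
    domain_grads=[0.1, 0.9, 0.2, 0.7],
    core_fraction=0.5,
)
```

Here parameters 0 and 2 have large general-data gradients and small domain-data gradients, so they land in the core set; a revised §3 would need to pin down exactly this kind of scoring rule and threshold.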
-
Referee: [§4] §4 (Experiments): No ablation is reported comparing the proposed importance-based freezing against random selection of an equivalent fraction of parameters to update. Without this control, the experiments cannot isolate whether observed gains in retention and task performance arise from the specific evaluation or simply from updating fewer parameters overall.
Authors: We accept this criticism. The revised §4 will include a new ablation that freezes a randomly chosen set of parameters whose size matches the non-core set identified by our method. Results on the same scientific, medical, and physical tasks will be reported side-by-side with the importance-based runs, allowing readers to determine whether the observed retention and adaptation gains are attributable to the importance evaluation rather than to the mere reduction in updated parameters. revision: yes
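The requested control is straightforward to specify: sample a uniformly random parameter set matching the size of the importance-based core set, freeze it, and run the identical fine-tuning recipe. A minimal sketch (the seeding is an assumption added for reproducibility):

```python
import random

def random_core_mask(n_params, core_size, seed=0):
    """Control condition for the ablation: a random 'core' set of the
    same size as the importance-based one. Any gain of the proposed
    method over this baseline is then attributable to the importance
    evaluation itself rather than to updating fewer parameters."""
    rng = random.Random(seed)  # fixed seed so the control is reproducible
    return set(rng.sample(range(n_params), core_size))

mask = random_core_mask(n_params=10, core_size=3, seed=0)
```

Running this control with several seeds and reporting the spread would also indicate how sensitive retention is to which parameters happen to be frozen.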
-
Referee: [§4] §4 (Experiments): The abstract and experimental description supply no quantitative results, baselines (e.g., full fine-tuning, LoRA, or other selective methods), or error analysis, so it is impossible to evaluate the magnitude of forgetting mitigation or statistical reliability of the claims.
Authors: We acknowledge that the present version under-reports quantitative outcomes. The revised abstract will summarize key metrics (task accuracy gains and forgetting scores on general benchmarks). Section 4 will be expanded with full tables comparing our method against full fine-tuning, LoRA, and other selective baselines, together with mean performance, standard deviations across three random seeds, and statistical significance tests. This will allow direct assessment of the magnitude and reliability of the forgetting-mitigation effect. revision: yes
Circularity Check
No circularity: empirical method with independent experimental validation
Full rationale
The paper introduces an importance evaluation method to partition parameters into core (general-language) and non-core (task-sensitive) sets, then freezes the core set during fine-tuning. This is framed as an innovative proposal whose effectiveness is demonstrated via experiments on GPT-J and LLaMA-3 across scientific, medical, and physical tasks. No equations, derivations, fitted parameters presented as predictions, or self-citation chains appear in the provided text. The central claim rests on empirical outcomes rather than any self-definitional reduction or ansatz smuggled via prior work. The method's contribution is isolated by the experiments themselves, satisfying the requirement for self-contained, non-circular support.
Axiom & Free-Parameter Ledger
Invented entities (2)
- core parameters: no independent evidence
- non-core parameters: no independent evidence
Reference graph
Works this paper leans on
- [1] F. Zenke, B. Poole, and S. Ganguli, "Continual learning through synaptic intelligence," in International Conference on Machine Learning. PMLR, 2017, pp. 3987–3995.
- [2] J. Kirkpatrick, R. Pascanu, N. C. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell, "Overcoming catastrophic forgetting in neural networks," Proceedings of the National Academy of Sciences, vol. 114, no. 13, pp. 3521–3526, 2017.
- [3] J. Xiang, T. Tao, Y. Gu, T. Shu, Z. Wang, Z. Yang, and Z. Hu, "Language models meet world models: Embodied experiences enhance language models," Advances in Neural Information Processing Systems, 2023.
- [4] W. Ren, X. Li, L. Wang, T. Zhao, and W. Qin, "Analyzing and reducing catastrophic forgetting in parameter efficient tuning," arXiv preprint arXiv:2402.18865, 2024.
- [5] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, "LoRA: Low-rank adaptation of large language models," arXiv preprint arXiv:2106.09685, 2021.
- [6] Z. Wang, J. Liang, R. He, Z. Wang, and T. Tan, "LoRA-Pro: Are low-rank adapters properly optimized?" arXiv preprint arXiv:2407.18242, 2024.
- [7] K. Lv, Y. Yang, T. Liu, Q. Gao, Q. Guo, and X. Qiu, "Full parameter fine-tuning for large language models with limited resources," arXiv preprint arXiv:2306.09782, 2023.
- [8] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale et al., "Llama 2: Open foundation and fine-tuned chat models," arXiv preprint arXiv:2307.09288, 2023.
- [9] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar et al., "LLaMA: Open and efficient foundation language models," arXiv preprint arXiv:2302.13971, 2023.
- [10] Z. Li and D. Hoiem, "Learning without forgetting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2935–2947, 2017.
- [11] R. Kemker, M. McClure, A. Abitino, T. Hayes, and C. Kanan, "Measuring catastrophic forgetting in neural networks," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, 2018.
- [12] L. Gao, S. Biderman, S. Black, L. Golding, T. Hoppe, C. Foster, J. Phang, H. He, A. Thite, N. Nabeshima et al., "The Pile: An 800GB dataset of diverse text for language modeling," arXiv preprint arXiv:2101.00027, 2020.
- [13] A. Pal, L. K. Umapathi, and M. Sankarasubbu, "MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering," in Conference on Health, Inference, and Learning. PMLR, 2022, pp. 248–260.
- [14] J. Welbl, N. F. Liu, and M. Gardner, "Crowdsourcing multiple choice science questions," arXiv preprint arXiv:1707.06209, 2017.
- [15] Y. Bisk, R. Zellers, J. Gao, Y. Choi et al., "PIQA: Reasoning about physical commonsense in natural language," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, 2020, pp. 7432–7439.
- [16] B. Wang and A. Komatsuzaki, "GPT-J-6B: A 6 billion parameter autoregressive language model," https://github.com/kingoflolz/mesh-transformer-jax, May 2021.
- [17] A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan et al., "The Llama 3 herd of models," arXiv preprint arXiv:2407.21783, 2024.
- [18] D. Kalajdzievski, "A rank stabilization scaling factor for fine-tuning with LoRA," arXiv preprint arXiv:2312.03732, 2023.
- [19] R. Aljundi, F. Babiloni, M. Elhoseiny, M. Rohrbach, and T. Tuytelaars, "Memory aware synapses: Learning what (not) to forget," arXiv preprint arXiv:1711.09601, 2017.
- [20] D. Liang, F. Zhang, W. Zhang, Q. Zhang, J. Fu, M. Peng, T. Gui, and X. Huang, "Adaptive multi-attention network incorporating answer information for duplicate question detection," in Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019, pp. 95–104.
- [21] F. Shen, X. Jiang, X. He, H. Ye, C. Wang, X. Du, Z. Li, and J. Tang, "IMAGDressing-v1: Customizable virtual dressing," arXiv preprint arXiv:2407.12705, 2024.
- [22] D. Liang, F. Zhang, Q. Zhang, and X.-J. Huang, "Asynchronous deep interaction network for natural language inference," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 2692–2700.
- [23] Y. Liu, D. Liang, F. Fang, S. Wang, W. Wu, and R. Jiang, "Time-aware multiway adaptive fusion network for temporal knowledge graph question answering," in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5.
- [24] F. Shen, H. Ye, S. Liu, J. Zhang, C. Wang, X. Han, and W. Yang, "Boosting consistency in story visualization with rich-contextual conditional diffusion models," arXiv preprint arXiv:2407.02482, 2024.
- [25] C. Xue, D. Liang, S. Wang, J. Zhang, and W. Wu, "Dual path modeling for semantic matching by perceiving subtle conflicts," in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023, pp. 1–5.
- [26] R. Ma, Y. Tan, X. Zhou, X. Chen, D. Liang, S. Wang, W. Wu, T. Gui, and Q. Zhang, "Searching for optimal subword tokenization in cross-domain NER," arXiv preprint arXiv:2206.03352, 2022.
- [27] R. Zheng, R. Bao, Y. Zhou, D. Liang, S. Wang, W. Wu, T. Gui, Q. Zhang, and X. Huang, "Robust lottery tickets for pre-trained language models," arXiv preprint arXiv:2211.03013, 2022.
- [28] F. Shen, H. Ye, J. Zhang, C. Wang, X. Han, and W. Yang, "Advancing pose-guided image synthesis with progressive conditional diffusion models," arXiv preprint arXiv:2310.06313, 2023.
- [29] J. Song, D. Liang, R. Li, Y. Li, S. Wang, M. Peng, W. Wu, and Y. Yu, "Improving semantic matching through dependency-enhanced pre-trained model with adaptive fusion," in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 45–57.
- [30] S. Wang, D. Liang, J. Song, Y. Li, and W. Wu, "DABERT: Dual attention enhanced BERT for semantic matching," in Proceedings of the 29th International Conference on Computational Linguistics, 2022, pp. 1645–1654.
- [31] R. Zheng, B. Rong, Y. Zhou, D. Liang, S. Wang, W. Wu, T. Gui, Q. Zhang, and X. Huang, "Robust lottery tickets for pre-trained language models," in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 2211–2224.
- [32] Z. Fei, Q. Zhang, T. Gui, D. Liang, S. Wang, W. Wu, and X.-J. Huang, "CQG: A simple and effective controlled generation framework for multi-hop question generation," in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022, pp. 6896–6906.
- [33] J. Chen and J. Liu, "S3Prompt: Instructing the model with self-calibration, self-recall and self-aggregation to improve in-context learning," in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024, pp. 14259–14271.
- [34] Y. Liu, D. Liang, M. Li, F. Giunchiglia, X. Li, S. Wang, W. Wu, L. Huang, X. Feng, and R. Guan, "Local and global: Temporal question answering via information fusion," in IJCAI, 2023, pp. 5141–5149.
- [35] T. Gui, Q. Zhang, J. Gong, M. Peng, D. Liang, K. Ding, and X.-J. Huang, "Transferring from formal newswire domain with hypernet for Twitter POS tagging," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018, pp. 2540–2549.
- [36] Y. Liu, M. Li, D. Liang, X. Li, F. Giunchiglia, L. Huang, X. Feng, and R. Guan, "Resolving word vagueness with scenario-guided adapter for natural language inference," arXiv preprint arXiv:2405.12434, 2024.
- [37] M. Wu, Q. Qian, W. Liu, X. Wang, Z. Huang, D. Liang, L. Miao, S. Dou, C. Lv, Z. Wang et al., "Progressive mastery: Customized curriculum learning with guided prompting for mathematical reasoning," arXiv preprint arXiv:2506.04065, 2025.
- [38] C. Xue, D. Liang, P. Wang, and J. Zhang, "Question calibration and multi-hop modeling for temporal question answering," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 17, 2024, pp. 19332–19340.
- [39] X. Wu, J. Yang, L. Chai, G. Zhang, J. Liu, X. Du, D. Liang, D. Shu, X. Cheng, T. Sun et al., "TableBench: A comprehensive and complex benchmark for table question answering," arXiv preprint arXiv:2408.09174, 2024.
- [40] C. Dai, H. Shan, M. Song, and D. Liang, "HoPE: Hyperbolic rotary positional encoding for stable long-range dependency modeling in large language models," arXiv preprint arXiv:2509.05218, 2025.
- [41] Z. Gao, D. Liang, X. Wu, P. Morel, and M. Peng, "DecoRL: Decoupling reasoning chains via parallel sub-step generation and cascaded reinforcement for interpretable and scalable RLHF," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, 2026, pp. 30789–30797.
- [42] J. Li, C. Qi, R. Wang, Q. Chen, L. Xu, D. Liang, B. Simons, and S. Liang, "When safety becomes a vulnerability: Exploiting LLM alignment homogeneity for transferable blocking in RAG," arXiv preprint arXiv:2603.03919, 2026.
- [43] Z. Lin, C. Xue, D. Liang, X. Han, P. Liu, X. Wu, L. Jiang, Y. Lu, H. Shi, S. Liang, and M. Peng, "Parameter importance is not static: Evolving parameter isolation for supervised fine-tuning," 2026.
- [44] P. Liu, Z. Cui, D. Liang, and W. Ye, "Who stole your data? A method for detecting unauthorized RAG theft," arXiv preprint arXiv:2510.07728, 2025.
- [45] X. Liu, X. Guan, D. Liang, and X. Wu, "DPI: Exploiting parameter heterogeneity for interference-free fine-tuning," arXiv preprint arXiv:2601.17777, 2026.
- [46] X. Liu, D. Liang, H. Shan, P. Liu, Y. Liu, M. Wu, Y. Li, X. Wu, L. Miao, J. Shen et al., "Structural reward model: Enhancing interpretability, efficiency, and scalability in reward modeling," in Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, 2025, pp. 672–685.
- [47] Q. Qian, M. Wu, Z. Huang, W. Liu, C. Lv, X. Wang, Z. Wang, Z. Guo, Z. Xu, L. Chen et al., "Adaptive curriculum strategies: Stabilizing reinforcement learning for large language models."
- [48] R. Wang, Y. Huang, M. Li, J. Li, D. Liang, B. Simons, P. Ke, S. Liang, and K. Qin, "Rethinking LLM-driven heuristic design: Generating efficient and specialized solvers via dynamics-aware optimization," arXiv preprint arXiv:2601.20868, 2026.
- [49] Y. Wang, D. Liang, and M. Peng, "Not all parameters are created equal: Smart isolation boosts fine-tuning performance," arXiv preprint arXiv:2508.21741, 2025.
- [50] X. Wu, D. Liang, J. Yang, X. Cheng, L. Chai, T. Li, L. Yang, and Z. Li, "Breaking size barrier: Enhancing reasoning for large-size table question answering," in International Conference on Database Systems for Advanced Applications. Springer, 2025, pp. 241–256.
- [51] X. Wu, X. Xu, T. Jiang, J. Yang, D. Liang, X. Cheng, Z. Wu, L. Chai, W. Zhang, J. Liu et al., "MMTableBench: A multi-level multimodal benchmark for reasoning and layout complexity in table QA," in Proceedings of the ACM Web Conference 2026, 2026, pp. 3881–3892.
- [52] C. Xue, Y. Wang, M. Liu, D. Liang, X. Han, P. Liu, X. Wu, C. Lu, L. Jiang, Y. Lu et al., "Reason only when needed: Efficient generative reward modeling via model-internal uncertainty," arXiv preprint arXiv:2604.10072, 2026.
- [53] C. Xue, Y. Wang, M. Liu, D. Liang, X. Han, P. Liu, X. Wu, C. Lu, L. Jiang, Y. Lu et al., "Why supervised fine-tuning fails to learn: A systematic study of incomplete learning in large language models," arXiv preprint arXiv:2604.10079, 2026.
- [54] F. Shen, C. Wang, J. Gao, Q. Guo, J. Dang, J. Tang, and T.-S. Chua, "Long-term talkingface generation via motion-prior conditional diffusion model," arXiv preprint arXiv:2502.09533, 2025.
- [55] F. Shen and J. Tang, "IMAGPose: A unified conditional framework for pose-guided person generation," in Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS), 2024.
- [56] L. Li, Q. Liao, M. Lai, D. Liang, and S. Liang, "Local and global: Text matching via syntax graph calibration," in ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024, pp. 11571–11575.
- [57] X. Wu, J. Yang, T. Li, S. Zhang, Y. Du, L. Chai, D. Liang, and Z. Li, "Unleashing potential of evidence in knowledge-intensive dialogue generation," in ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2025, pp. 1–5.
- [58] B. Li, D. Liang, and Z. Zhang, "Comateformer: Combined attention transformer for semantic sentence matching," arXiv preprint arXiv:2412.07220, 2024.