Chinese Short-Form Creative Content Generation via Explanation-Oriented Multi-Objective Optimization
Pith reviewed 2026-05-17 20:46 UTC · model grok-4.3
The pith
Formalizing Chinese short-form creative tasks as joint optimization of constraints and explanation reliability produces more trustworthy personalized outputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper formalizes Chinese short-form creative natural language generation (CNLG) as a heterogeneous multi-objective optimization problem that jointly optimizes multiple personalized constraints and explanation reliability. It introduces MAGIC-HMO, a training-free multi-agent framework that performs iterative generation and verification under an explanation-oriented multi-objective strategy. Experiments on the Chinese Baby Naming benchmark are reported to show that MAGIC-HMO significantly outperforms six strong baselines across various LLM backbones.
What carries the argument
MAGIC-HMO, a training-free multi-agent framework that iterates between generating creative content and its explanation while verifying both against multiple personalized constraints.
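A minimal sketch of such a generate-then-verify loop, assuming hypothetical agent interfaces. The names `Candidate`, `generate_and_verify`, `toy_generate`, and `toy_verify` are illustrative and do not come from the paper or its repository; real agents would be LLM calls, and the paper's actual verification and scoring strategy may differ.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    text: str         # the short creative output, e.g. a name
    explanation: str  # the rationale generated alongside it


# Hypothetical agent signatures (assumptions, not the paper's API).
Generator = Callable[[List[str], List[str]], Candidate]  # (constraints, feedback)
Verifier = Callable[[Candidate, str], Tuple[bool, str]]  # (candidate, constraint)


def generate_and_verify(
    constraints: List[str],
    generate: Generator,
    verify: Verifier,
    max_rounds: int = 3,
) -> Tuple[Candidate, int]:
    """Iterate generation and verification; return the best candidate seen
    and the number of constraints it passed."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    best, best_passed = None, -1
    for _ in range(max_rounds):
        cand = generate(constraints, feedback)
        # Each constraint is checked against the output AND its explanation.
        results = [verify(cand, c) for c in constraints]
        passed = sum(1 for ok, _ in results if ok)
        if passed > best_passed:
            best, best_passed = cand, passed
        if passed == len(constraints):
            break  # every constraint verified, stop early
        # Failed checks become feedback for the next generation round.
        feedback = [msg for ok, msg in results if not ok]
    return best, best_passed


# Toy stand-ins for LLM agents, for illustration only.
def toy_generate(constraints: List[str], feedback: List[str]) -> Candidate:
    if not feedback:
        return Candidate("A", "no rationale")
    return Candidate("B", "covers " + " and ".join(constraints))


def toy_verify(cand: Candidate, constraint: str) -> Tuple[bool, str]:
    ok = constraint in cand.explanation
    return ok, "" if ok else f"explanation does not address: {constraint}"
```

In this toy run the first candidate fails both checks, the failure messages are fed back, and the second candidate passes. The point of the sketch is only that the explanation, not just the output, is what the verifier inspects.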
If this is right
- Short creative outputs can meet diverse personalized constraints more consistently when explanations are optimized alongside the text itself.
- LLM-generated explanations become usable verification cues rather than additional sources of error.
- The same iterative verification process works across different LLM backbones without requiring fine-tuning.
- The method supplies a general template for other short-form creative tasks that suffer from limited direct observability.
Where Pith is reading between the lines
- The same explanation-oriented loop could be tested on other compact languages or on short advertising copy under similar constraint sets.
- Adding an external human review step on the generated explanations might further tighten the multi-objective balance.
- The framework suggests that explanation quality can serve as a scalable proxy signal for constraint achievement in future automated creative systems.
Load-bearing premise
Iterative multi-agent generation and verification can reliably reduce hallucination, incompleteness, and ambiguity in explanations under complex personalized constraints without introducing new failure modes.
What would settle it
A head-to-head evaluation on the Chinese Baby Naming benchmark in which MAGIC-HMO fails to outperform the six baselines on combined measures of constraint satisfaction and explanation quality would falsify the central performance claim.
Original abstract
Chinese demonstrates high semantic compactness and rich metaphorical expressiveness, enabling limited text to convey dense meanings while increasing the difficulty of generation and verification, particularly in short-form creative natural language generation (CNLG). In the real world, users often require personalized, fine-grained creative constraints, making reliable verification critical to guiding optimization. According to Brunswik's Lens Model from psychology, constraints' achievement can be inferred from sufficient observable cues. Existing studies are mainly outcome-oriented, implicitly assuming that the outcome itself provides adequate cues for verification. However, this assumption breaks down in Chinese short-form CNLG (e.g., naming or advertising) with diverse personalized constraints, where extremely brief outcomes inherently offer limited information. Explanations can naturally serve as extra cues. Nevertheless, under complex constraints, LLMs' explanations may suffer from hallucination, incompleteness, or ambiguity. To address these, we novelly formalize the Chinese short-form CNLG task as a heterogeneous multi-objective optimization (HMO) issue that needs to jointly optimize multiple personalized constraints and explanation reliability. We further propose MAGIC-HMO, a training-free multi-agent framework that optimizes these objectives through iterative generation and verification under an explanation-oriented multi-objective strategy. Experiments on Chinese Baby Naming, a challenging benchmark, demonstrate that MAGIC-HMO significantly outperforms six strong baselines across various LLM backbones. Relevant data and codes are available at https://github.com/foolfun/MAGIC_HMO.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that Chinese short-form creative NLG under complex personalized constraints can be formalized as a heterogeneous multi-objective optimization (HMO) problem. It proposes MAGIC-HMO, a training-free multi-agent framework that performs iterative generation and verification of both outputs and explanations, drawing on Brunswik's Lens Model to treat explanations as additional observable cues. Experiments on the Chinese Baby Naming benchmark are reported to show that MAGIC-HMO significantly outperforms six strong baselines across multiple LLM backbones.
Significance. If the reported gains are substantiated with detailed metrics and controls, the work would offer a practical, training-free route to improving reliability in subjective creative generation tasks where outcomes are too brief to serve as self-sufficient verification signals. The explicit multi-objective framing and emphasis on explanation quality distinguish it from purely outcome-oriented prompting methods.
major comments (2)
- [Experiments] Experiments section: the central claim of significant outperformance on Chinese Baby Naming is stated without effect sizes, confidence intervals, statistical significance tests, or per-metric breakdowns (e.g., hallucination rate, completeness, or human-rated explanation quality). This absence makes it impossible to evaluate whether the iterative verification loop produces the claimed reductions in hallucination and ambiguity or merely shifts failure modes.
- [MAGIC-HMO Framework] MAGIC-HMO framework description: the iterative generation-verification procedure is presented as reliably mitigating hallucination, incompleteness, and ambiguity under personalized constraints, yet no quantitative diagnostics (per-iteration hallucination rates, inter-agent agreement, or ablation of the verification agent) are supplied. In a domain without objective ground truth, this leaves the load-bearing assumption untested.
minor comments (2)
- [Abstract] Abstract: the phrase 'significantly outperforms' is used without defining the metric or referencing the statistical procedure that supports the adverb.
- [Introduction / Problem Formulation] Notation: the distinction between the heterogeneous constraint objectives and the explanation-reliability objective is introduced, but the early sections give neither an explicit mathematical formulation nor a weighting scheme.
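The statistical reporting requested in the first major comment could look roughly like the following sketch: a paired effect size and a percentile-bootstrap confidence interval for the mean per-item score difference between two systems. The function names are hypothetical, the scores in the usage note are made up for illustration and are not results from the paper, and the choice of a percentile bootstrap is an assumption rather than the procedure the authors use.

```python
import random
import statistics
from typing import Sequence, Tuple


def paired_effect_size(a: Sequence[float], b: Sequence[float]) -> float:
    """Cohen's d for paired samples: mean difference / std of differences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return statistics.fmean(diffs) / sd


def paired_bootstrap_ci(
    a: Sequence[float],
    b: Sequence[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the mean of the paired differences a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Resample the per-item differences with replacement and record each mean.
    means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_resamples)
    )
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return means[lo_idx], means[hi_idx]
```

Usage, with invented per-item scores for two systems evaluated on the same items: `paired_bootstrap_ci([0.8, 0.7, 0.9, 0.85, 0.75], [0.6, 0.65, 0.7, 0.8, 0.7])` returns a `(low, high)` pair; an interval that excludes zero is the kind of evidence the adverb "significantly" would need.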
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, indicating the specific revisions planned for the next version of the manuscript.
Point-by-point responses
- Referee: [Experiments] Experiments section: the central claim of significant outperformance on Chinese Baby Naming is stated without effect sizes, confidence intervals, statistical significance tests, or per-metric breakdowns (e.g., hallucination rate, completeness, or human-rated explanation quality). This absence makes it impossible to evaluate whether the iterative verification loop produces the claimed reductions in hallucination and ambiguity or merely shifts failure modes.
  Authors: We agree that the current presentation of results would be strengthened by additional statistical details and metric breakdowns. In the revised manuscript we will report effect sizes, 95% confidence intervals, and statistical significance tests (paired t-tests or Wilcoxon signed-rank tests as appropriate) for all main comparisons across LLM backbones. We will also add explicit per-metric tables covering hallucination rate, completeness, and human-rated explanation quality. These analyses, drawn from the existing evaluation data already collected for the Chinese Baby Naming benchmark, show consistent reductions in hallucination and ambiguity rather than simple failure-mode shifts; the revised tables and figures will make this evidence directly accessible to readers.
  revision: yes
- Referee: [MAGIC-HMO Framework] MAGIC-HMO framework description: the iterative generation-verification procedure is presented as reliably mitigating hallucination, incompleteness, and ambiguity under personalized constraints, yet no quantitative diagnostics (per-iteration hallucination rates, inter-agent agreement, or ablation of the verification agent) are supplied. In a domain without objective ground truth, this leaves the load-bearing assumption untested.
  Authors: We accept that quantitative diagnostics are needed to support the claims about the iterative loop. The revised manuscript will include per-iteration hallucination-rate curves, inter-agent agreement statistics (Cohen’s kappa on verification decisions), and a dedicated ablation that removes the verification agent while keeping all other components fixed. Although the creative-naming domain lacks objective ground truth, the benchmark relies on expert human judgments; we will expand the description of these judgments and their reliability in the revision. These additions will directly test the contribution of the verification step and the explanation-oriented multi-objective strategy.
  revision: yes
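The inter-agent agreement statistic named in the second response, Cohen's kappa, can be computed as in the sketch below. The function name `cohens_kappa` and the pass/fail labels are illustrative assumptions; the paper's own label set and agreement protocol are not specified here.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(r1: Sequence[Hashable], r2: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal label frequencies.
    """
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(r1)
    observed = sum(1 for x, y in zip(r1, r2) if x == y) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Counter returns 0 for labels a rater never used.
    expected = sum(c1[label] * c2[label] for label in c1.keys() | c2.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

For example, with hypothetical verification decisions `["pass", "pass", "fail", "fail"]` and `["pass", "fail", "fail", "fail"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.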
Circularity Check
No significant circularity; empirical framework evaluated externally
The paper formalizes Chinese short-form CNLG as a heterogeneous multi-objective optimization problem drawing on Brunswik's Lens Model and proposes the training-free MAGIC-HMO multi-agent framework using iterative generation and verification. All reported gains are measured via experiments on the external Chinese Baby Naming benchmark against six independent baselines across LLM backbones. No equations, fitted parameters, or self-citations reduce the claimed outperformance to quantities defined by the authors' own inputs or prior work; the derivation chain remains self-contained with success determined by external comparison rather than construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM-generated explanations can serve as observable cues for constraint achievement, per Brunswik's Lens Model
- ad hoc to paper: Iterative multi-agent generation and verification reduces hallucination and ambiguity without new failure modes
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: We novelly formalize the Chinese short-form CNLG task as a heterogeneous multi-objective optimization (HMOO) issue that needs to jointly optimize multiple personalized constraints and explanation reliability... MAGIC-HMO, a training-free multi-agent framework that optimizes these objectives through iterative generation and verification
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: θ_imp = W_imp · S_imp ... dynamic iteration process... ψ_imp = δ if j_imp < t_w else α·log(j_imp + t_w)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
OpenAI, “Hello gpt-4o,” https://openai.com/index/hello-gpt-4o/, 2024, accessed: 2025-01-29
work page 2024
-
[2]
——, “Introducing openai o1,” https://openai.com/o1/, 2024, accessed: 2024-10-28
work page 2024
-
[3]
Gemini: A Family of Highly Capable Multimodal Models
Gemini, R. Anil, S. Borgeaud, J.-B. Alayrac, J. Yu, R. Soricut, J. Schalkwyk, A. M. Dai, A. Hauth, K. Millican, D. Silver, M. Johnson, I. Antonoglou, J. Schrittwieser, A. Glaese, J. Chen, E. Pitler, T. Lillicrap, A. Lazaridou, O. Firat, J. Molloyet al., “Gemini: A family of highly capable multimodal models,” 2024. [Online]. Available: https://arxiv.org/ab...
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[4]
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
D. Guo, D. Yang, H. Zhang, J. Song, R. Zhang, R. Xu, Q. Zhu, S. Ma, P. Wang, X. Biet al., “Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning,”arXiv preprint arXiv:2501.12948, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[5]
MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark
Y . Wang, X. Ma, G. Zhang, Y . Ni, A. Chandra, S. Guo, W. Ren, A. Arulraj, X. He, Z. Jiang, T. Li, M. Ku, K. Wang, A. Zhuang, R. Fan, X. Yue, and W. Chen, “MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark,” Nov. 2024, arXiv:2406.01574 [cs]. [Online]. Available: http://arxiv.org/abs/2406.01574
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[6]
Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi,
X. Yue, Y . Ni, K. Zhang, T. Zheng, R. Liu, G. Zhang, S. Stevens, D. Jiang, W. Ren, Y . Sunet al., “Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 9556–9567
work page 2024
-
[7]
MathPrompter: Mathematical reasoning using large language models,
S. Imani, L. Du, and H. Shrivastava, “MathPrompter: Mathematical reasoning using large language models,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), S. Sitaram, B. Beigman Klebanov, and J. D. Williams, Eds. Toronto, Canada: Association for Computational Linguistics, Jul. 2023, pp. 3...
work page 2023
-
[8]
Hi-tom: A benchmark for evaluating higher-order theory of mind reasoning in large language models,
Y . He, Y . Wu, Y . Jia, R. Mihalcea, Y . Chen, and N. Deng, “Hi-tom: A benchmark for evaluating higher-order theory of mind reasoning in large language models,”arXiv preprint arXiv:2310.16755, 2023
-
[9]
Understanding social reasoning in language models with language models,
K. Gandhi, J.-P. Fraenken, T. Gerstenberg, and N. Goodman, “Understanding social reasoning in language models with language models,” inAdvances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, Eds., vol. 36. Curran Associates, Inc., 2023, pp. 13 518–13 529. [Online]. Available: https://proceedi...
work page 2023
-
[10]
X. Sun, K. Shi, H. Tang, D. Wang, G. Xu, and Q. Li, “Educating lan- guage models as promoters: Multi-aspect instruction alignment with self- augmentation,”IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 8, pp. 4564–4577, 2025
work page 2025
-
[11]
Controllable text generation for open-domain creativity and fairness,
N. Peng, “Controllable text generation for open-domain creativity and fairness,”arXiv preprint arXiv:2209.12099, 2022
-
[12]
Creative natural language generation,
T. Chakrabarty, V . Padmakumar, H. He, and N. Peng, “Creative natural language generation,” inProceedings of the 2023 Conference on Empir- ical Methods in Natural Language Processing: Tutorial Abstracts, 2023, pp. 34–40
work page 2023
-
[13]
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. S. Keskar, B. McCann, L. R. Varshney, C. Xiong, and R. Socher, “Ctrl: A conditional transformer language model for controllable gen- eration,”arXiv preprint arXiv:1909.05858, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1909
-
[14]
Controllable natural language generation with contrastive prefixes,
J. Qian, L. Dong, Y . Shen, F. Wei, and W. Chen, “Controllable natural language generation with contrastive prefixes,” inFindings of the Association for Computational Linguistics: ACL 2022, S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 2912–2924. [Online]. Available: https://acla...
work page 2022
-
[15]
A survey on evaluation of large language models,
Y . Chang, X. Wang, J. Wang, Y . Wu, L. Yang, K. Zhu, H. Chen, X. Yi, C. Wang, Y . Wanget al., “A survey on evaluation of large language models,”ACM Transactions on Intelligent Systems and Technology, vol. 15, no. 3, pp. 1–45, 2024
work page 2024
-
[16]
” it felt like having a second mind
Q. Wan, S. Hu, Y . Zhang, P. Wang, B. Wen, and Z. Lu, “” it felt like having a second mind”: Investigating human-ai co-creativity in prewriting with large language models,”Proceedings of the ACM on Human-Computer Interaction, vol. 8, no. CSCW1, pp. 1–26, 2024
work page 2024
-
[17]
T. Chakrabarty, V . Padmakumar, F. Brahman, and S. Muresan, “Cre- ativity support in the age of large language models: An empirical study involving emerging writers,”arXiv preprint arXiv:2309.12570, 2023
-
[18]
Jiuge: A human-machine collaborative chinese classical poetry generation system,
G. Zhipeng, X. Yi, M. Sun, W. Li, C. Yang, J. Liang, H. Chen, Y . Zhang, and R. Li, “Jiuge: A human-machine collaborative chinese classical poetry generation system,” inProceedings of the 57th annual meeting of the association for computational linguistics: system demonstrations, 2019, pp. 25–30
work page 2019
-
[19]
Charpoet: A chinese classical poetry generation system based on token-free llm,
C. Yu, L. Zang, J. Wang, C. Zhuang, and J. Gu, “Charpoet: A chinese classical poetry generation system based on token-free llm,” inProceed- ings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2024, pp. 315–325
work page 2024
-
[20]
Poetry in rags: Modern greek inter- war poetry generation using rag and contrastive training,
S. Chatzikyriakidis and A. Natsina, “Poetry in rags: Modern greek inter- war poetry generation using rag and contrastive training,” inProceedings of the 5th International Conference on Natural Language Processing for Digital Humanities, 2025, pp. 257–264
work page 2025
-
[21]
Collabstory: Multi-llm collaborative story generation and authorship analysis,
S. Venkatraman, N. I. Tripto, and D. Lee, “Collabstory: Multi-llm collaborative story generation and authorship analysis,”arXiv preprint arXiv:2406.12665, 2024
-
[22]
Seed- story: Multimodal long story generation with large language model,
S. Yang, Y . Ge, Y . Li, Y . Chen, Y . Ge, Y . Shan, and Y . Chen, “Seed- story: Multimodal long story generation with large language model,” arXiv preprint arXiv:2407.08683, 2024
-
[23]
Summary of a haystack: A challenge to long-context llms and rag systems,
P. Laban, A. R. Fabbri, C. Xiong, and C.-S. Wu, “Summary of a haystack: A challenge to long-context llms and rag systems,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024, pp. 9885–9903
work page 2024
-
[24]
H. Jin, Y . Zhang, D. Meng, J. Wang, and J. Tan, “A comprehensive sur- vey on process-oriented automatic text summarization with exploration of llm-based methods,”arXiv preprint arXiv:2403.02901, 2024
-
[25]
Unified multi-scenario summarization evaluation and explanation,
S. Shang, Z. Yao, H. Fu, C. Tao, X. Chen, F. Wang, Y . Wang, Z. Ren, and S. Gao, “Unified multi-scenario summarization evaluation and explanation,”IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 2, pp. 991–1003, 2025
work page 2025
-
[26]
Enhancing coherence and diversity in multi-class slogan generation systems,
P. N. Ahmad, Y . Liu, I. Ullah, and M. Shabaz, “Enhancing coherence and diversity in multi-class slogan generation systems,”ACM Transactions on Asian and Low-Resource Language Information Processing, vol. 23, no. 8, pp. 1–24, 2024
work page 2024
-
[27]
Deep poetry: A chinese classical poetry generation system,
Y . Liu, D. Liu, and J. Lv, “Deep poetry: A chinese classical poetry generation system,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 09, 2020, pp. 13 626–13 627
work page 2020
-
[28]
Chae: Fine-grained con- trollable story generation with characters, actions and emotions,
X. Wang, H. Jiang, Z. Wei, and S. Zhou, “Chae: Fine-grained con- trollable story generation with characters, actions and emotions,” in Proceedings of the 29th International Conference on Computational Linguistics, 2022, pp. 6426–6435
work page 2022
-
[29]
Weaver: Foundation models for creative writing,
T. Wang, J. Chen, Q. Jia, S. Wang, R. Fang, H. Wang, Z. Gao, C. Xie, C. Xu, J. Daiet al., “Weaver: Foundation models for creative writing,” arXiv preprint arXiv:2401.17268, 2024
-
[30]
L.-C. Lu, S.-J. Chen, T.-M. Pai, C.-H. Yu, H.-y. Lee, and S.-H. Sun, “Llm discussion: Enhancing the creativity of large language models via discussion framework and role-play,”arXiv preprint arXiv:2405.06373, 2024
-
[31]
Controlled text gen- eration as continuous optimization with multiple constraints,
S. Kumar, E. Malmi, A. Severyn, and Y . Tsvetkov, “Controlled text gen- eration as continuous optimization with multiple constraints,”Advances in Neural Information Processing Systems, vol. 34, pp. 14 542–14 554, 2021
work page 2021
-
[32]
Position: A roadmap to pluralistic alignment,
T. Sorensen, J. Moore, J. Fisher, M. L. Gordon, N. Mireshghallah, C. M. Rytting, A. Ye, L. Jiang, X. Lu, N. Dziriet al., “Position: A roadmap to pluralistic alignment,” inForty-first International Conference on Machine Learning, 2024
work page 2024
-
[33]
Suri: Multi-constraint instruction following in long-form text generation,
C. M. Pham, S. Sun, and M. Iyyer, “Suri: Multi-constraint instruction following in long-form text generation,” inFindings of the Association for Computational Linguistics: EMNLP 2024, Y . Al- Onaizan, M. Bansal, and Y .-N. Chen, Eds. Miami, Florida, USA: Association for Computational Linguistics, Nov. 2024, pp. 1722–1753. [Online]. Available: https://acla...
work page 2024
-
[34]
Followbench: A multi-level fine- grained constraints following benchmark for large language models,
Y . Jiang, Y . Wang, X. Zeng, W. Zhong, L. Li, F. Mi, L. Shang, X. Jiang, Q. Liu, and W. Wang, “Followbench: A multi-level fine- grained constraints following benchmark for large language models,” arXiv preprint arXiv:2310.20410, 2023
-
[35]
T. B. Ward, “What’s old about new ideas,”The creative cognition approach, pp. 157–178, 1995
work page 1995
-
[36]
R. A. Finke, T. B. Ward, and S. M. Smith,Creative cognition: Theory, research, and applications. MIT press, 1996
work page 1996
-
[37]
Implicit motives and basic psychological needs,
J. Sch ¨uler, N. Baumann, A. Chasiotis, M. Bender, and I. Baum, “Implicit motives and basic psychological needs,”Journal of personality, vol. 87, no. 1, pp. 37–55, 2019
work page 2019
-
[38]
On the creativity of large language models,
G. Franceschelli and M. Musolesi, “On the creativity of large language models,”AI & SOCIETY, pp. 1–11, 2024
work page 2024
-
[39]
Art or artifice? large language models and the false promise of creativity,
T. Chakrabarty, P. Laban, D. Agarwal, S. Muresan, and C.-S. Wu, “Art or artifice? large language models and the false promise of creativity,” JOURNAL OF LATEX CLASS FILES. 12 inProceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–34
work page 2024
-
[40]
Generative ai lacks the human creativity to achieve scientific discovery from scratch,
A. W. Ding and S. Li, “Generative ai lacks the human creativity to achieve scientific discovery from scratch,”Scientific Reports, vol. 15, no. 1, p. 9587, 2025
work page 2025
-
[41]
Mixpoet: Diverse poetry generation via learning controllable mixed latent space,
X. Yi, R. Li, C. Yang, W. Li, and M. Sun, “Mixpoet: Diverse poetry generation via learning controllable mixed latent space,” inProceedings of the AAAI conference on artificial intelligence, vol. 34, no. 05, 2020, pp. 9450–9457
work page 2020
-
[42]
Evaluating creative short story generation in humans and large language models,
M. Ismayilzada, C. Stevenson, and L. van der Plas, “Evaluating creative short story generation in humans and large language models,”arXiv preprint arXiv:2411.02316, 2024
-
[43]
G. Marco, L. Rello, and J. Gonzalo, “Small language models can out- perform humans in short creative writing: A study comparing slms with humans and llms,” inProceedings of the 31st International Conference on Computational Linguistics, 2025, pp. 6552–6570
work page 2025
-
[44]
H. Tan, Z. Guo, Z. Shi, L. Xu, Z. Liu, Y . Feng, X. Li, Y . Wang, L. Shang, Q. Liu, and L. Song, “ProxyQA: An alternative framework for evaluating long-form text generation with large language models,” inProceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), L.-W. Ku, A. Martins, and V . Srikumar, ...
work page 2024
-
[45]
Reasoning-enhanced self-training for long-form personalized text generation,
A. Salemi, C. Li, M. Zhang, Q. Mei, W. Kong, T. Chen, Z. Li, M. Ben- dersky, and H. Zamani, “Reasoning-enhanced self-training for long-form personalized text generation,”arXiv preprint arXiv:2501.04167, 2025
-
[46]
A distributional approach to controlled text generation,
M. Khalifa, H. Elsahar, and M. Dymetman, “A distributional approach to controlled text generation,” inInternational Conference on Learning Representations, 2020
work page 2020
-
[47]
A distributional lens for multi-aspect controllable text generation,
Y . Gu, X. Feng, S. Ma, L. Zhang, H. Gong, and B. Qin, “A distributional lens for multi-aspect controllable text generation,” inProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022, pp. 1023–1043
work page 2022
-
[48]
An extensible plug-and-play method for multi-aspect controllable text generation,
X. Huang, Z. Liu, P. Li, T. Li, M. Sun, and Y . Liu, “An extensible plug-and-play method for multi-aspect controllable text generation,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pp. 15 233– 15 256
work page 2023
-
[49]
Maclasa: Multi-aspect controllable text generation via efficient sampling from compact latent space,
H. Ding, L. Pang, Z. Wei, H. Shen, X. Cheng, and T.-S. Chua, “Maclasa: Multi-aspect controllable text generation via efficient sampling from compact latent space,” inFindings of the Association for Computational Linguistics: EMNLP 2023, 2023, pp. 4424–4436
work page 2023
-
[50]
Controllable text generation via probability density estimation in the latent space,
Y . Gu, X. Feng, S. Ma, L. Zhang, H. Gong, W. Zhong, and B. Qin, “Controllable text generation via probability density estimation in the latent space,” inProceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), A. Rogers, J. Boyd-Graber, and N. Okazaki, Eds. Toronto, Canada: Association for Computati...
work page 2023
-
[51]
Tara: Token-level attribute relation adaptation for multi-attribute controllable text gener- ation,
Y . Cao, J. Zhao, R. Zhang, H. Zou, and W. Mao, “Tara: Token-level attribute relation adaptation for multi-attribute controllable text gener- ation,” inFindings of the Association for Computational Linguistics: EMNLP 2024, 2024, pp. 12 570–12 579
work page 2024
-
[52]
A review of multi-objective optimization: Methods and its applications,
N. Gunantara, “A review of multi-objective optimization: Methods and its applications,”Cogent Engineering, vol. 5, no. 1, p. 1502242, 2018
work page 2018
-
[53]
Mix and match: Learning-free controllable text generationusing energy language models,
F. Mireshghallah, K. Goyal, and T. Berg-Kirkpatrick, “Mix and match: Learning-free controllable text generationusing energy language models,” inProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), S. Muresan, P. Nakov, and A. Villavicencio, Eds. Dublin, Ireland: Association for Computational Ling...
work page 2022
-
[54]
Cold decoding: energy- based constrained text generation with langevin dynamics,
L. Qin, S. Welleck, D. Khashabi, and Y . Choi, “Cold decoding: energy- based constrained text generation with langevin dynamics,” inPro- ceedings of the 36th International Conference on Neural Information Processing Systems, 2022, pp. 9538–9551
work page 2022
-
[55]
Prefix-tuning: Optimizing continuous prompts for generation,
X. L. Li and P. Liang, “Prefix-tuning: Optimizing continuous prompts for generation,” inProceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), C. Zong, F. Xia, W. Li, and R. Navigli, Eds. Online: Association for Computationa...
work page 2021
-
[56]
Suri: Multi-constraint in- struction following for long-form text generation,
C. M. Pham, S. Sun, and M. Iyyer, “Suri: Multi-constraint in- struction following for long-form text generation,”arXiv preprint arXiv:2406.19371, 2024
-
[57]
Benchmarking complex instruction-following with multiple constraints composition,
B. Wen, P. Ke, X. Gu, L. Wu, H. Huang, J. Zhou, W. Li, B. Hu, W. Gao, J. Xuet al., “Benchmarking complex instruction-following with multiple constraints composition,”Advances in Neural Information Processing Systems, vol. 37, pp. 137 610–137 645, 2024
work page 2024
-
[58]
Q. He, J. Zeng, Q. He, J. Liang, and Y . Xiao, “From complex to simple: Enhancing multi-constraint complex instruction following ability of large language models,”arXiv preprint arXiv:2404.15846, 2024
-
[59]
J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkatet al., “Gpt-4 technical report,”arXiv preprint arXiv:2303.08774, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[60]
CAMEL: Communicative agents for
G. Li, H. A. A. K. Hammoud, H. Itani, D. Khizbullin, and B. Ghanem, “CAMEL: Communicative agents for ”mind” exploration of large language model society,” inThirty-seventh Conference on Neural Information Processing Systems, 2023. [Online]. Available: https://openreview.net/forum?id=3IyL2XWDkG
work page 2023
-
[61]
Autogen: Enabling next-gen LLM applications via multi-agent conversation,
Q. Wu, G. Bansal, J. Zhang, Y . Wu, B. Li, E. Zhu, L. Jiang, X. Zhang, S. Zhang, J. Liu, A. H. Awadallah, R. W. White, D. Burger, and C. Wang, “Autogen: Enabling next-gen LLM applications via multi-agent conversation,” 2024. [Online]. Available: https://openreview.net/forum?id=tEAF9LBdgu
work page 2024
-
[62]
MetaGPT: Meta programming for a multi-agent collaborative framework,
S. Hong, M. Zhuge, J. Chen, X. Zheng, Y . Cheng, J. Wang, C. Zhang, Z. Wang, S. K. S. Yau, Z. Lin, L. Zhou, C. Ran, L. Xiao, C. Wu, and J. Schmidhuber, “MetaGPT: Meta programming for a multi-agent collaborative framework,” inThe Twelfth International Conference on Learning Representations, 2024. [Online]. Available: https://openreview.net/forum?id=VtmBAGCN7o
work page 2024
-
[63]
S. Min, X. Lyu, A. Holtzman, M. Artetxe, M. Lewis, H. Hajishirzi, and L. Zettlemoyer, “Rethinking the role of demonstrations: What makes in-context learning work?” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Y. Goldberg, Z. Kozareva, and Y. Zhang, Eds. Abu Dhabi, United Arab Emirates: Association for Computa...
-
[64]
S. Yao, J. Zhao, D. Yu, I. Shafran, K. R. Narasimhan, and Y. Cao, “ReAct: Synergizing reasoning and acting in language models,” in NeurIPS 2022 Foundation Models for Decision Making Workshop, 2022
-
[65]
N. Shinn, F. Cassano, A. Gopinath, K. Narasimhan, and S. Yao, “Reflexion: Language agents with verbal reinforcement learning,” in Proceedings of the 37th International Conference on Neural Information Processing Systems, 2023, pp. 8634–8652
-
[66]
R. Aksitov, S. Miryoosefi, Z. Li, D. Li, S. Babayan, K. Kopparapu, Z. Fisher, R. Guo, S. Prakash, P. Srinivasan et al., “ReST meets ReAct: Self-improvement for multi-step reasoning LLM agent,” in ICLR 2024 Workshop on Large Language Model (LLM) Agents, 2024
-
[67]
P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel et al., “Retrieval-augmented generation for knowledge-intensive NLP tasks,” Advances in Neural Information Processing Systems, vol. 33, pp. 9459–9474, 2020
-
[68]
M. M. Abdollah Pour, A. Pesaranghader, E. Cohen, and S. Sanner, “Gaussian process optimization for adaptable multi-objective text generation using linearly-weighted language models,” in Findings of the Association for Computational Linguistics: NAACL 2024, K. Duh, H. Gomez, and S. Bethard, Eds. Mexico City, Mexico: Association for Computational Linguistics...
- [69]
-
[70]
Y. Du, S. Li, A. Torralba, J. B. Tenenbaum, and I. Mordatch, “Improving factuality and reasoning in language models through multiagent debate,” in Proceedings of the 41st International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, R. Salakhutdinov, Z. Kolter, K. Heller, A. Weller, N. Oliver, J. Scarlett, and F. Berkenkam...
-
[71]
J. S. Park, J. O’Brien, C. J. Cai, M. R. Morris, P. Liang, and M. S. Bernstein, “Generative agents: Interactive simulacra of human behavior,” in Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 2023, pp. 1–22
-
[72]
Y. Xu, S. Wang, P. Li, F. Luo, X. Wang, W. Liu, and Y. Liu, “Exploring large language models for communication games: An empirical study on Werewolf,” arXiv preprint arXiv:2309.04658, 2023
-
[73]
Z. Wang, S. Mao, W. Wu, T. Ge, F. Wei, and H. Ji, “Unleashing the emergent cognitive synergy in large language models: A task-solving agent through multi-persona self-collaboration,” in Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technolog...
-
[74]
S. Chen, Y. Liu, W. Han, W. Zhang, and T. Liu, “A survey on LLM-based multi-agent system: Recent advances and new frontiers in application,”
-
[75]
[Online]. Available: https://arxiv.org/abs/2412.17481
-
[76]
T. Kojima, S. S. Gu, M. Reid, Y. Matsuo, and Y. Iwasawa, “Large language models are zero-shot reasoners,” Advances in Neural Information Processing Systems, vol. 35, pp. 22199–22213, 2022
-
[77]
C. Yang, X. Wang, Y. Lu, H. Liu, Q. V. Le, D. Zhou, and X. Chen, “Large language models as optimizers,” arXiv preprint arXiv:2309.03409, 2023
-
[78]
T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language models are few-shot learners,” Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901, 2020
-
[79]
R. Jagerman, H. Zhuang, Z. Qin, X. Wang, and M. Bendersky, “Query expansion by prompting large language models,” arXiv preprint arXiv:2305.03653, 2023
-
[80]
C.-M. Chan, W. Chen, Y. Su, J. Yu, W. Xue, S. Zhang, J. Fu, and Z. Liu, “ChatEval: Towards better LLM-based evaluators through multi-agent debate,” arXiv preprint arXiv:2308.07201, 2023