Controllable text generation for large language models: A survey. arXiv preprint arXiv:2408.12599

Xun Liang et al. · 2024 · arXiv 2408.12599

9 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving

cs.CL · 2026-04-23 · unverdicted · novelty 7.0

OptiVerse is a new benchmark spanning neglected optimization domains that shows LLMs suffer sharp accuracy drops on hard problems due to modeling and logic errors, with a Dual-View Auditor Agent proposed to improve performance.

From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents

cs.CL · 2026-04-21 · unverdicted · novelty 7.0

Memora benchmark and FAMA metric show that LLMs and memory agents frequently reuse invalid memories and struggle to reconcile evolving information in long-term interactions.

BiST: A Gold Standard Bangla-English Bilingual Corpus for Sentence Structure and Tense Classification with Inter-Annotator Agreement

cs.CL · 2026-04-06 · unverdicted · novelty 7.0

BiST is a curated Bangla-English corpus of 30,534 sentences annotated for syntactic structure and tense, achieving Fleiss' kappa scores of 0.82 and 0.88.

Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment

cs.LG · 2026-04-27 · unverdicted · novelty 6.0

Meta-Aligner introduces a meta-learner network that produces dynamic preference weights to enable bidirectional optimization between preferences and LLM policy responses for multi-objective alignment.

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

cs.CL · 2026-04-22 · unverdicted · novelty 6.0

DCM-Agent improves LLM performance on multi-paradigm optimization problems by 11-21% via dual-cluster memory construction and dynamic inference guidance.

Universally Empowering Zeroth-Order Optimization via Adaptive Layer-wise Sampling

cs.LG · 2026-04-20 · unverdicted · novelty 6.0

AdaLeZO uses a non-stationary multi-armed bandit to adaptively allocate perturbation budget across layers in zeroth-order optimization and applies inverse probability weighting to reduce variance while preserving unbiased gradients, delivering 1.7x-3.0x wall-clock speedup on LLaMA and OPT models.

Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs

cs.CL · 2026-05-08 · conditional · novelty 5.0 · 2 refs

EngGPT2MoE-16B-A3B matches or exceeds other Italian open-source LLMs on most international benchmarks while remaining competitive on ITALIC, though it trails some top international models.

An Empirical Study of Perceptions of General LLMs and Multimodal LLMs on Hugging Face

cs.SE · 2026-04-07 · unverdicted · novelty 4.0

Hugging Face discussions show that access barriers, output quality, and setup complexity are the main user concerns for both general and multimodal LLMs.

Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

cs.CL · 2026-05-13 · conditional · novelty 2.0 · 2 refs

DExperts reaches 100% safety on explicit toxicity benchmarks but only 98.5% on implicit hate speech from ToxiGen while imposing a 10x latency increase on GPT-2.

citing papers explorer

Showing 9 of 9 citing papers.

OptiVerse: A Comprehensive Benchmark towards Optimization Problem Solving

cs.CL · 2026-04-23 · unverdicted · none · ref 110

From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents

cs.CL · 2026-04-21 · unverdicted · none · ref 6

BiST: A Gold Standard Bangla-English Bilingual Corpus for Sentence Structure and Tense Classification with Inter-Annotator Agreement

cs.CL · 2026-04-06 · unverdicted · none · ref 2

Meta-Aligner: Bidirectional Preference-Policy Optimization for Multi-Objective LLMs Alignment

cs.LG · 2026-04-27 · unverdicted · none · ref 9

Dual-Cluster Memory Agent: Resolving Multi-Paradigm Ambiguity in Optimization Problem Solving

cs.CL · 2026-04-22 · unverdicted · none · ref 94

Universally Empowering Zeroth-Order Optimization via Adaptive Layer-wise Sampling

cs.LG · 2026-04-20 · unverdicted · none · ref 53

Benchmarking EngGPT2-16B-A3B against Comparable Italian and International Open-source LLMs

cs.CL · 2026-05-08 · conditional · none · ref 6 · 2 links

An Empirical Study of Perceptions of General LLMs and Multimodal LLMs on Hugging Face

cs.SE · 2026-04-07 · unverdicted · none · ref 52

Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study

cs.CL · 2026-05-13 · conditional · none · ref 23 · 2 links
