
LLM tropes: Revealing fine-grained values and opinions in large language models

URL https://aclanthology · 2025 · DOI 10.18653/v1/2024.findings-emnlp

17 Pith papers cite this work, alongside 3 external citations (Crossref). Polarity classification is still being indexed.



citation-role summary

background 2 · dataset 1

citation-polarity summary

background 2 · use dataset 1

representative citing papers

On Stable Long-Form Generation: Benchmarking and Mitigating Length Volatility

cs.CL · 2026-05-02 · unverdicted · novelty 7.0

VOLTBench quantifies length volatility in LLM long-form generation; GLoBo, a logits-boosting decoder, increases mean length by 148% and cuts volatility by 69% while preserving quality.

What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features

cs.CL · 2026-04-06 · unverdicted · novelty 7.0

Effective multilingual reasoning in large models relies on language-specific patterns in reasoning features rather than uniform English-like traces.

MobiFlow: Real-World Mobile Agent Benchmarking through Trajectory Fusion

cs.AI · 2026-02-28 · unverdicted · novelty 7.0

MobiFlow is a new evaluation framework for mobile agents using trajectory fusion on 240 tasks across 20 third-party apps, achieving higher alignment with human judgments than prior benchmarks.

Bayesian Preference Learning for Test-Time Steerable Reward Models

cs.LG · 2026-02-09 · unverdicted · novelty 7.0

ICRM casts reward modeling as amortized variational inference over a latent preference probability with a Beta prior, enabling test-time adaptation to unseen preferences and improving benchmark performance.

Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

LLMs exhibit pseudo-deliberation, with consistent value-action misalignment in generated dialogues despite reasoning, as measured by the new VALDI framework across 4941 scenarios.

ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding

cs.LG · 2026-04-16 · unverdicted · novelty 6.0

ConfLayers dynamically skips LLM layers based on confidence scores to create adaptive draft models for self-speculative decoding, reporting up to 1.4x speedup over standard generation.

GRM: Utility-Aware Jailbreak Attacks on Audio LLMs via Gradient-Ratio Masking

cs.SD · 2026-04-10 · unverdicted · novelty 6.0

GRM ranks Mel bands by attack contribution versus utility sensitivity, perturbs a subset, and learns a universal perturbation to reach 88.46% average jailbreak success rate with improved attack-utility trade-off on four audio LLMs.

Learning to Stay Safe: Adaptive Regularization Against Safety Degradation during Fine-Tuning

cs.CL · 2026-02-19 · unverdicted · novelty 6.0

Adaptive regularization guided by training-time safety risk signals from judges or activations prevents safety degradation in fine-tuned language models while preserving utility.

Enhancing Table Reasoning with Deterministic Table-State Rewards

cs.AI · 2026-01-30 · unverdicted · novelty 6.0

RE-TAB uses a deterministic LCS-based table-state reward for stepwise guidance and test-time scaling, raising LLM table-reasoning accuracy by 26.7 pp on average across six backbones and three benchmarks.

Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values

cs.CY · 2025-05-30 · unverdicted · novelty 6.0

LLMs deviate from human moral preferences in kidney allocation scenarios and rarely express indecision, though low-rank fine-tuning with few examples can improve both consistency and uncertainty calibration.

Dynamic Scaled Gradient Descent for Stable Fine-Tuning for Classifications

cs.LG · 2026-04-30 · unverdicted · novelty 5.0

Dynamic scaled gradient descent prevents fine-tuning collapse by dynamically down-weighting gradients of correct examples, yielding lower performance variance and higher accuracy than standard methods on classification benchmarks.

A Reproducibility Study of LLM-Based Query Reformulation

cs.IR · 2026-04-30 · unverdicted · novelty 5.0

A unified evaluation finds LLM query reformulation gains are strongly conditioned on retrieval paradigm, do not consistently transfer to neural retrievers, and are not uniformly improved by larger LLMs.

CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation

cs.MA · 2026-04-10 · unverdicted · novelty 5.0

LLM agents in an opposing-incentive NYC simulation develop limited selective trust and deception through KTO policy updates but stay 70% susceptible to adversarial persuasion.

A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction

cs.CL · 2026-03-16 · conditional · novelty 5.0

Heuristic demonstration selection methods outperform embedding-based methods for practical LLM-based next POI prediction on three real-world datasets.

AI Evaluation Should Require Standardized Item-Level Data Releases

cs.AI · 2026-02-27 · conditional · novelty 5.0 · 2 refs

AI benchmark evaluations require standardized item-level data releases as core infrastructure to support validity assessment, demonstrated via the OpenEval archive of 10M responses across 155k items.

Evaluating the Impact of Verbal Multiword Expressions on Machine Translation

cs.CL · 2025-08-24 · conditional · novelty 3.0

Verbal multiword expressions reduce machine translation quality, with the degradation attributable to the expressions themselves rather than general sentence difficulty.

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation

cs.CL · 2025-04-02 · unverdicted · novelty 3.0

A literature survey that organizes prompting, fine-tuning, preference optimization, and context-aware techniques for LLM-based machine translation with emphasis on low-resource languages.

citing papers explorer

Showing 17 of 17 citing papers.

On Stable Long-Form Generation: Benchmarking and Mitigating Length Volatility · cs.CL · 2026-05-02 · unverdicted · none · ref 1
What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features · cs.CL · 2026-04-06 · unverdicted · none · ref 1
MobiFlow: Real-World Mobile Agent Benchmarking through Trajectory Fusion · cs.AI · 2026-02-28 · unverdicted · none · ref 1
Bayesian Preference Learning for Test-Time Steerable Reward Models · cs.LG · 2026-02-09 · unverdicted · none · ref 16
Pseudo-Deliberation in Language Models: When Reasoning Fails to Align Values and Actions · cs.CL · 2026-05-11 · unverdicted · none · ref 11
ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding · cs.LG · 2026-04-16 · unverdicted · none · ref 6
GRM: Utility-Aware Jailbreak Attacks on Audio LLMs via Gradient-Ratio Masking · cs.SD · 2026-04-10 · unverdicted · none · ref 13
Learning to Stay Safe: Adaptive Regularization Against Safety Degradation during Fine-Tuning · cs.CL · 2026-02-19 · unverdicted · none · ref 6
Enhancing Table Reasoning with Deterministic Table-State Rewards · cs.AI · 2026-01-30 · unverdicted · none · ref 19
Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values · cs.CY · 2025-05-30 · unverdicted · none · ref 66
Dynamic Scaled Gradient Descent for Stable Fine-Tuning for Classifications · cs.LG · 2026-04-30 · unverdicted · none · ref 2
A Reproducibility Study of LLM-Based Query Reformulation · cs.IR · 2026-04-30 · unverdicted · none · ref 39
CONSCIENTIA: Can LLM Agents Learn to Strategize? Emergent Deception and Trust in a Multi-Agent NYC Simulation · cs.MA · 2026-04-10 · unverdicted · none · ref 1
A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction · cs.CL · 2026-03-16 · conditional · none · ref 7
AI Evaluation Should Require Standardized Item-Level Data Releases · cs.AI · 2026-02-27 · conditional · none · ref 20 · 2 links
Evaluating the Impact of Verbal Multiword Expressions on Machine Translation · cs.CL · 2025-08-24 · conditional · none · ref 5
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation · cs.CL · 2025-04-02 · unverdicted · none · ref 40
