RMCT matches the rate of target behaviors like bias-following across input perturbations to reduce sycophancy in LLMs while preserving verbalization of bias cues.
A categorical archive of chatgpt failures
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 3polarities
background 3representative citing papers
U-Define improves user control in LLM planning by letting people define hard rules and soft preferences in natural language with matching verification methods, raising usefulness and satisfaction scores.
Prompt framing significantly shifts LLM choices toward risk-averse options in a threshold voting task even when the prompts are logically equivalent.
LLM code generation lacks syntactic robustness on math-formula prompts, but formula-reduction pre-processing raises it from 54.05% to 74.42%.
TrustLLM defines eight trustworthiness principles, creates a six-dimension benchmark, and evaluates 16 LLMs showing proprietary models generally lead but some open-source ones are close while over-calibration can hurt utility.
Survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight sub-categories on popular models, and finds that more aligned models generally score higher but with varying effectiveness.
ChatGPT often generates code vulnerable to attacks even when prompted to produce secure code.
LLM graders achieve substantial human agreement on math and science MCAS items but vary on ELA, performing best as sources of formative narrative feedback rather than summative numerical scores.
AI and NLP applied to educational artifacts within the Instructional Core Framework can identify advantages for teacher coaching, student support, and personalized learning.
citing papers explorer
No citing papers match the current filters.