[RPG+21] Ron Ross, Victoria Pillitteri, Richard Graubart, Deborah Bodeau, and Rosalie McQuaid
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 1
citation-polarity summary
polarities: background 1
representative citing papers
LLMs can detect usability content in user reviews with F-scores comparable to humans, though performance depends strongly on prompt design.
citing papers explorer
- Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
  A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
- Who's Who? LLM-assisted Software Traceability with Architecture Entity Recognition
  LLM approaches ExArch and ArTEMiS reach F1 scores of 0.86 and 0.81 for architecture entity recognition and traceability, matching or approaching baselines that require manual models.
- Automated Repair of Requirements for Cyber-Physical Systems in Simulink Requirements Tables
  A framework repairs CPS requirements in Simulink by leveraging system execution data and is evaluated as effective on six real-world case studies covering 12 requirements.
- Explainable Fall Detection for Elderly Monitoring via Temporally Stable SHAP in Skeleton-Based Human Activity Recognition
  T-SHAP stabilizes SHAP attributions temporally for LSTM fall detection, achieving 94.3% accuracy and improved faithfulness on NTU RGB+D dataset.
- A Scalable Game-Theoretic Approach for Selecting Security Controls from Standardized Catalogues
  A zero-sum game model with algebraic dependency filtering selects budget-constrained security control sets from standardized catalogues and is demonstrated on a fictional military system using ITSG-33.