Mixed citations

Warren, Lu Cheng, Haidar M

Man Luo, Christopher J · 2024 · arXiv 2323.2024

Mixed citation behavior. Most common role is background (62%).

20 Pith papers citing it

Background 62% of classified citations

read on arXiv browse 20 citing papers

citation-role summary

background 6 extension 1 method 1

citation-polarity summary

background 5 extend 1 support 1 use method 1

representative citing papers

PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV

cs.DC · 2026-04-15 · unverdicted · novelty 7.0

PackSELL packs delta-encoded indices and values into single words with tunable bit allocation, delivering up to 1.63x faster FP16 SpMV and FP32-accurate performance exceeding FP16 cuSPARSE while reducing memory traffic.

Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis

cs.CL · 2026-03-20 · conditional · novelty 7.0

Seven clinician-informed safety criteria enable LLM-as-a-Judge to reach substantial agreement with human consensus (Cohen's κ up to 0.75) on evaluating LLM responses to users demonstrating psychosis.

Revisiting Forest Proximities via Sparse Leaf-Incidence Kernels

cs.LG · 2026-01-06 · conditional · novelty 7.0

Forest proximities admit an exact sparse factorization via separable weighted leaf-collision kernels that reduces computation to sparse linear algebra over leaf collisions.

AlphaEvolve: A coding agent for scientific and algorithmic discovery

cs.AI · 2025-06-16 · unverdicted · novelty 7.0

AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, plus optimizations for Google data centers and hardware.

PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries

cs.IR · 2026-05-18 · unverdicted · novelty 6.0

PIPER retrieves and ranks tabular datasets by profiling their content and using LLM-generated queries for dense vector search, outperforming metadata baselines and TableQA methods in low-metadata settings.

Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

Macro uses Direct Preference Optimization on composite-scored preference pairs to improve validity of multilingual self-generated counterfactual explanations by 12.55% on average without degrading minimality.

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.

NEURON: A Neuro-symbolic System for Grounded Clinical Explainability

cs.AI · 2026-05-02 · unverdicted · novelty 6.0

NEURON raises AUC from 0.74-0.77 to 0.84-0.88 on MIMIC-IV heart-failure mortality prediction while lifting human-aligned explanation scores from 0.50 to 0.85 by grounding SHAP values in SNOMED CT and patient notes via RAG-LLM.

Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text

cs.IR · 2026-04-28 · unverdicted · novelty 6.0

Methods for constructing Hypergraphs of Text are proposed with a new effort ratio metric where TF-IDF baselines match LLM methods in experiments.

A Catalog of Data Errors

cs.DB · 2026-04-10 · unverdicted · novelty 6.0

A new catalog classifying 35 data error types into missing, incorrect, and redundant categories for tabular data, with definitions and examples to improve data quality management.

MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems

cs.AI · 2026-04-09 · unverdicted · novelty 6.0

MONETA is the first multimodal benchmark for industry classification using text and geographic sources, with MLLM baselines at 62-74% accuracy and up to 22.8% gains from multi-turn context enrichment and explanations.

Counterfactual Modeling with Fine-Tuned LLMs for Health Intervention Design and Sensor Data Augmentation

cs.LG · 2026-01-21 · conditional · novelty 6.0

Fine-tuned LLMs produce plausible counterfactuals for health interventions and recover 20% F1 via data augmentation in label-scarce sensor datasets.

InsideOut: Measuring and Mitigating Insider-Outsider Bias in Interview Script Generation

cs.CL · 2025-09-25 · conditional · novelty 6.0

The paper introduces the InsideOut benchmark to quantify insider-outsider bias in LLM-generated interview scripts across 10 cultures and shows that multi-agent mitigation frameworks substantially reduce the bias on metrics like Cultural Alignment Gap.

AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.

Context-Mediated Domain Adaptation in Multi-Agent Sensemaking Systems

cs.HC · 2026-03-25 · unverdicted · novelty 5.0

Context-mediated domain adaptation treats user modifications to AI artifacts as implicit domain specifications that reshape LLM-powered multi-agent reasoning, demonstrated via the Seedentia system which extracted 46 domain knowledge entries from expert edits.

Opportunities and Risks of Generative AI through the Health Information Journey

cs.CY · 2026-05-21 · unverdicted · novelty 4.0

Authors propose a four-stage framework to analyze opportunities and risks of generative AI across the health information journey from public sources to clinical care.

Modality vs. Morphology: A Framework for Time Series Classification for Biological Signals

cs.LG · 2026-05-18 · unverdicted · novelty 4.0

A review synthesizes evidence from EEG, EMG, ECG, PPG and ocular signals to argue that waveform morphology, rather than modality or model class, primarily determines TSC performance and interpretability.

Assessment of RAG and Fine-Tuning for Industrial Question-Answering-Applications

cs.CL · 2026-05-10 · unverdicted · novelty 4.0

RAG is more effective and cost-efficient than fine-tuning for industrial QA adaptation on automotive datasets.

Towards Enabling An Artificial Self-Construction Software Life-cycle via Autopoietic Architectures

cs.SE · 2026-04-15 · unverdicted · novelty 4.0

Proposes autopoietic architectures for self-constructing software as a fundamental shift in the SDLC, leveraging foundation models for autonomous evolution and maintenance.

LLM Harms: A Taxonomy and Discussion

cs.CY · 2025-12-05

citing papers explorer

Showing 20 of 20 citing papers.

PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV cs.DC · 2026-04-15 · unverdicted · none · ref 40
PackSELL packs delta-encoded indices and values into single words with tunable bit allocation, delivering up to 1.63x faster FP16 SpMV and FP32-accurate performance exceeding FP16 cuSPARSE while reducing memory traffic.
Using LLM-as-a-Judge/Jury to Advance Scalable, Clinically-Validated Safety Evaluations of Model Responses to Users Demonstrating Psychosis cs.CL · 2026-03-20 · conditional · none · ref 17
Seven clinician-informed safety criteria enable LLM-as-a-Judge to reach substantial agreement with human consensus (Cohen's κ up to 0.75) on evaluating LLM responses to users demonstrating psychosis.
Revisiting Forest Proximities via Sparse Leaf-Incidence Kernels cs.LG · 2026-01-06 · conditional · none · ref 5
Forest proximities admit an exact sparse factorization via separable weighted leaf-collision kernels that reduces computation to sparse linear algebra over leaf collisions.
AlphaEvolve: A coding agent for scientific and algorithmic discovery cs.AI · 2025-06-16 · unverdicted · none · ref 38
AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, plus optimizations for Google data centers and hardware.
PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries cs.IR · 2026-05-18 · unverdicted · none · ref 17
PIPER retrieves and ranks tabular datasets by profiling their content and using LLM-generated queries for dense vector search, outperforming metadata baselines and TableQA methods in low-metadata settings.
Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization cs.CL · 2026-05-12 · unverdicted · none · ref 5
Macro uses Direct Preference Optimization on composite-scored preference pairs to improve validity of multilingual self-generated counterfactual explanations by 12.55% on average without degrading minimality.
Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization cs.SE · 2026-05-08 · unverdicted · none · ref 46
SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.
NEURON: A Neuro-symbolic System for Grounded Clinical Explainability cs.AI · 2026-05-02 · unverdicted · none · ref 104
NEURON raises AUC from 0.74-0.77 to 0.84-0.88 on MIMIC-IV heart-failure mortality prediction while lifting human-aligned explanation scores from 0.50 to 0.85 by grounding SHAP values in SNOMED CT and patient notes via RAG-LLM.
Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text cs.IR · 2026-04-28 · unverdicted · none · ref 5
Methods for constructing Hypergraphs of Text are proposed with a new effort ratio metric where TF-IDF baselines match LLM methods in experiments.
A Catalog of Data Errors cs.DB · 2026-04-10 · unverdicted · none · ref 125
A new catalog classifying 35 data error types into missing, incorrect, and redundant categories for tabular data, with definitions and examples to improve data quality management.
MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems cs.AI · 2026-04-09 · unverdicted · none · ref 42
MONETA is the first multimodal benchmark for industry classification using text and geographic sources, with MLLM baselines at 62-74% accuracy and up to 22.8% gains from multi-turn context enrichment and explanations.
Counterfactual Modeling with Fine-Tuned LLMs for Health Intervention Design and Sensor Data Augmentation cs.LG · 2026-01-21 · conditional · none · ref 10
Fine-tuned LLMs produce plausible counterfactuals for health interventions and recover 20% F1 via data augmentation in label-scarce sensor datasets.
InsideOut: Measuring and Mitigating Insider-Outsider Bias in Interview Script Generation cs.CL · 2025-09-25 · conditional · none · ref 2
The paper introduces the InsideOut benchmark to quantify insider-outsider bias in LLM-generated interview scripts across 10 cultures and shows that multi-agent mitigation frameworks substantially reduce the bias on metrics like Cultural Alignment Gap.
AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks cs.AI · 2026-05-11 · unverdicted · none · ref 56
Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.
Context-Mediated Domain Adaptation in Multi-Agent Sensemaking Systems cs.HC · 2026-03-25 · unverdicted · none · ref 50
Context-mediated domain adaptation treats user modifications to AI artifacts as implicit domain specifications that reshape LLM-powered multi-agent reasoning, demonstrated via the Seedentia system which extracted 46 domain knowledge entries from expert edits.
Opportunities and Risks of Generative AI through the Health Information Journey cs.CY · 2026-05-21 · unverdicted · none · ref 92
Authors propose a four-stage framework to analyze opportunities and risks of generative AI across the health information journey from public sources to clinical care.
Modality vs. Morphology: A Framework for Time Series Classification for Biological Signals cs.LG · 2026-05-18 · unverdicted · none · ref 79
A review synthesizes evidence from EEG, EMG, ECG, PPG and ocular signals to argue that waveform morphology, rather than modality or model class, primarily determines TSC performance and interpretability.
Assessment of RAG and Fine-Tuning for Industrial Question-Answering-Applications cs.CL · 2026-05-10 · unverdicted · none · ref 2
RAG is more effective and cost-efficient than fine-tuning for industrial QA adaptation on automotive datasets.
Towards Enabling An Artificial Self-Construction Software Life-cycle via Autopoietic Architectures cs.SE · 2026-04-15 · unverdicted · none · ref 43
Proposes autopoietic architectures for self-constructing software as a fundamental shift in the SDLC, leveraging foundation models for autonomous evolution and maintenance.
LLM Harms: A Taxonomy and Discussion cs.CY · 2025-12-05 · unreviewed · ref 125

Warren, Lu Cheng, Haidar M

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer