Canonical reference

Title resolution pending

Association for Computational Linguistics

Canonical reference. 73% of citing Pith papers cite this work as background.

56 Pith papers citing it

Background 73% of classified citations

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 7 · baseline: 2 · dataset: 2

citation-polarity summary

background: 8 · baseline: 2 · use dataset: 1

representative citing papers

Security in LLM-as-a-Judge: A Comprehensive SoK

cs.CR · 2026-03-31 · accept · novelty 8.0

The first SoK on LLM-as-a-Judge security organizes attacks targeting judges, attacks using judges, defenses leveraging judges, and security-domain applications while flagging vulnerabilities.

AgroTools: A Benchmark for Tool-Augmented Multimodal Agents in Agriculture

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

AgroTools is a new benchmark for tool-augmented multimodal agents in agriculture featuring 539 QA pairs, 1,097 images, five task families, and 14 tools, with evaluations showing major limitations in current models' tool planning and execution.

CAREBench: Evaluating LLMs' Emotion Understanding by Assessing Cognitive Appraisal Reasoning

cs.AI · 2026-05-16 · unverdicted · novelty 7.0

CAREBench provides the first benchmark with full inferential chain annotations for appraisal reasoning and emotion understanding in LLMs, showing that stronger models still fall short on reasoning steps and capturing subjective human differences.

Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals

cs.CL · 2026-05-12 · unverdicted · novelty 7.0

Geometric consistency in embedding space predicts LLM-human disagreement on ordinal difficulty ratings better than probability baselines in CEFR sentence assessment.

XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity

cs.CL · 2026-05-07 · unverdicted · novelty 7.0

XL-SafetyBench is a new cross-cultural benchmark showing frontier LLMs decouple jailbreak robustness from cultural sensitivity while local models trade off attack success against neutral-safe rates in a near-linear pattern indicating generation failure rather than alignment.

Dependency-Aware Privacy for Multi-turn Agents

cs.CR · 2026-05-04 · unverdicted · novelty 7.0

RootGuard delivers turn-invariant privacy for multi-turn agents by noising root private attributes once and applying deterministic post-processing to all derived releases.

TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data

cs.AI · 2026-04-30 · unverdicted · novelty 7.0

TADI shows that domain-specialized tools orchestrated by an LLM over dual structured and semantic databases can convert heterogeneous wellsite data into evidence-grounded drilling intelligence, with tool design mattering more than model scale.

Mixture of Experts Framework in Machine Learning Interatomic Potentials for Atomistic Simulations

physics.comp-ph · 2026-04-28 · unverdicted · novelty 7.0

A co-trained multifidelity mixture-of-experts MLIP partitions simulations into high- and low-capacity regions, maintains exact energy conservation and bulk modulus alignment, and runs more than twice as fast as a single high-fidelity model on a Pt+CO system.

Training-Free Semantic Multi-Object Tracking with Vision-Language Models

cs.CV · 2026-04-15 · conditional · novelty 7.0

TF-SMOT composes pretrained vision-language models into a training-free pipeline that reaches state-of-the-art tracking and improved summary quality on the BenSMOT benchmark.

A-MBER: Affective Memory Benchmark for Emotion Recognition

cs.AI · 2026-04-08 · unverdicted · novelty 7.0

A-MBER is a new benchmark for evaluating AI models on using interaction history to recognize and explain a user's present affective state across judgment, retrieval, and explanation tasks.

Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models

cs.LG · 2026-03-03 · unverdicted · novelty 7.0

GraphSSR introduces an adaptive SSR pipeline with SSR-SFT data synthesis and SSR-RL (Authenticity-Reinforced and Denoising-Reinforced stages) to overcome one-size-fits-all subgraph noise in zero-shot LLM graph reasoning.

Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning

cs.AI · 2026-01-14 · unverdicted · novelty 7.0

Omni-R1 unifies multimodal reasoning by generating intermediate images during the process in an SFT-plus-RL framework, with an Omni-R1-Zero variant that matches or exceeds it using only text data.

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

cs.CV · 2024-06-13 · conditional · novelty 7.0

MuirBench is a new benchmark showing that top multimodal LLMs struggle with robust multi-image understanding, with GPT-4o at 68% and open-source models below 33% accuracy.

Complete-muE: Optimal Hyperparameter Transfer and Scaling for MoE Models

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

Complete-muE combines active-width μP and activated-expert scaling to transfer hyperparameters across dense FFN, dense MoE, and sparse MoE while covering changes in experts, capacity, width, depth, batch size, and duration.

PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

PEEK maintains a constant-sized context map via a programmable cache policy to give LLM agents persistent orientation knowledge about recurring external contexts, yielding 6-34% gains and lower cost than prior prompt-learning methods.

PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

PAI-2 improves factual correctness in LLM answers by 4% on average across benchmarks using adaptive graph traversal and planning, with 6% gains from traversal algorithms and 18% from enabled planning.

DiM³: Bridging Multilingual and Multimodal Models via Direction- and Magnitude-Aware Merging

cs.CL · 2026-05-13 · conditional · novelty 6.0 · 2 refs

DiM3 is a direction- and magnitude-aware merging method that composes heterogeneous multilingual and multimodal updates in LLM backbones, outperforming baselines on 57-language benchmarks while retaining multimodal performance.

When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

Layer-wise Laplacian energy of visual attention reveals hallucination emergence in MLLMs and enables LaSCD, a closed-form logit remapping strategy that mitigates hallucinations while preserving general performance.

Uncovering Intra-expert Activation Sparsity for Efficient Mixture-of-Expert Model Execution

cs.LG · 2026-05-09 · conditional · novelty 6.0

Pre-trained MoE models exhibit up to 90% intra-expert activation sparsity that enables up to 2.5x faster MoE layer execution when exploited in the vLLM inference system.

On Privacy Leakage in Tabular Diffusion Models: Influential Factors, Attacker Knowledge, and Metrics

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Tabular diffusion models leak membership information via attacks even with partial attacker knowledge, and common heuristic privacy metrics like distance-to-closest-record are unreliable.

One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue

cs.CL · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

TurnGate identifies the critical turn in multi-turn dialogues where a response would complete hidden malicious intent, outperforming baselines on the new MTID dataset while keeping over-refusal low.

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

DTap is a new red-teaming platform for AI agents that uses autonomous exploration across realistic simulations to discover vulnerabilities and creates a verifiable benchmark dataset.

UAV as Urban Construction Change Monitor: A New Benchmark and Change Captioning Model

cs.CV · 2026-05-06 · unverdicted · novelty 6.0

PTNet is a prototype-guided task-adaptive model that jointly performs change detection and captioning on bi-temporal UAV imagery by modeling structured change semantics, outperforming prior methods on the new UCCD urban construction benchmark and WHU-CDC.

Generating Statistical Charts with Validation-Driven LLM Workflows

cs.LG · 2026-05-01 · unverdicted · novelty 6.0

A validation-driven LLM workflow generates 1,500 charts from 74 UCI datasets with 30,003 aligned QA pairs, revealing that current multimodal models handle chart syntax well but struggle with value extraction and reasoning.

citing papers explorer

Showing 50 of 56 citing papers.

Security in LLM-as-a-Judge: A Comprehensive SoK cs.CR · 2026-03-31 · accept · none · ref 37
The first SoK on LLM-as-a-Judge security organizes attacks targeting judges, attacks using judges, defenses leveraging judges, and security-domain applications while flagging vulnerabilities.
AgroTools: A Benchmark for Tool-Augmented Multimodal Agents in Agriculture cs.CV · 2026-05-21 · unverdicted · none · ref 24
AgroTools is a new benchmark for tool-augmented multimodal agents in agriculture featuring 539 QA pairs, 1,097 images, five task families, and 14 tools, with evaluations showing major limitations in current models' tool planning and execution.
CAREBench: Evaluating LLMs' Emotion Understanding by Assessing Cognitive Appraisal Reasoning cs.AI · 2026-05-16 · unverdicted · none · ref 17
CAREBench provides the first benchmark with full inferential chain annotations for appraisal reasoning and emotion understanding in LLMs, showing that stronger models still fall short on reasoning steps and capturing subjective human differences.
Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals cs.CL · 2026-05-12 · unverdicted · none · ref 10
Geometric consistency in embedding space predicts LLM-human disagreement on ordinal difficulty ratings better than probability baselines in CEFR sentence assessment.
XL-SafetyBench: A Country-Grounded Cross-Cultural Benchmark for LLM Safety and Cultural Sensitivity cs.CL · 2026-05-07 · unverdicted · none · ref 9
XL-SafetyBench is a new cross-cultural benchmark showing frontier LLMs decouple jailbreak robustness from cultural sensitivity while local models trade off attack success against neutral-safe rates in a near-linear pattern indicating generation failure rather than alignment.
Dependency-Aware Privacy for Multi-turn Agents cs.CR · 2026-05-04 · unverdicted · none · ref 21
RootGuard delivers turn-invariant privacy for multi-turn agents by noising root private attributes once and applying deterministic post-processing to all derived releases.
TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data cs.AI · 2026-04-30 · unverdicted · none · ref 12
TADI shows that domain-specialized tools orchestrated by an LLM over dual structured and semantic databases can convert heterogeneous wellsite data into evidence-grounded drilling intelligence, with tool design mattering more than model scale.
Mixture of Experts Framework in Machine Learning Interatomic Potentials for Atomistic Simulations physics.comp-ph · 2026-04-28 · unverdicted · none · ref 8
A co-trained multifidelity mixture-of-experts MLIP partitions simulations into high- and low-capacity regions, maintains exact energy conservation and bulk modulus alignment, and runs more than twice as fast as a single high-fidelity model on a Pt+CO system.
Training-Free Semantic Multi-Object Tracking with Vision-Language Models cs.CV · 2026-04-15 · conditional · none · ref 19
TF-SMOT composes pretrained vision-language models into a training-free pipeline that reaches state-of-the-art tracking and improved summary quality on the BenSMOT benchmark.
A-MBER: Affective Memory Benchmark for Emotion Recognition cs.AI · 2026-04-08 · unverdicted · none · ref 5
A-MBER is a new benchmark for evaluating AI models on using interaction history to recognize and explain a user's present affective state across judgment, retrieval, and explanation tasks.
Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models cs.LG · 2026-03-03 · unverdicted · none · ref 28
GraphSSR introduces an adaptive SSR pipeline with SSR-SFT data synthesis and SSR-RL (Authenticity-Reinforced and Denoising-Reinforced stages) to overcome one-size-fits-all subgraph noise in zero-shot LLM graph reasoning.
Omni-R1: Towards the Unified Generative Paradigm for Multimodal Reasoning cs.AI · 2026-01-14 · unverdicted · none · ref 29
Omni-R1 unifies multimodal reasoning by generating intermediate images during the process in an SFT-plus-RL framework, with an Omni-R1-Zero variant that matches or exceeds it using only text data.
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding cs.CV · 2024-06-13 · conditional · none · ref 47
MuirBench is a new benchmark showing that top multimodal LLMs struggle with robust multi-image understanding, with GPT-4o at 68% and open-source models below 33% accuracy.
Complete-muE: Optimal Hyperparameter Transfer and Scaling for MoE Models cs.LG · 2026-05-22 · unverdicted · none · ref 42
Complete-muE combines active-width μP and activated-expert scaling to transfer hyperparameters across dense FFN, dense MoE, and sparse MoE while covering changes in experts, capacity, width, depth, batch size, and duration.
PEEK: Context Map as an Orientation Cache for Long-Context LLM Agents cs.AI · 2026-05-19 · unverdicted · none · ref 7
PEEK maintains a constant-sized context map via a programmable cache policy to give LLM agents persistent orientation knowledge about recurring external contexts, yielding 6-34% gains and lower cost than prior prompt-learning methods.
PersonalAI 2.0: Enhancing knowledge graph traversal/retrieval with planning mechanism for Personalized LLM Agents cs.CL · 2026-05-13 · unverdicted · none · ref 20
PAI-2 improves factual correctness in LLM answers by 4% on average across benchmarks using adaptive graph traversal and planning, with 6% gains from traversal algorithms and 18% from enabled planning.
DiM³: Bridging Multilingual and Multimodal Models via Direction- and Magnitude-Aware Merging cs.CL · 2026-05-13 · conditional · none · ref 13 · 2 links
DiM3 is a direction- and magnitude-aware merging method that composes heterogeneous multilingual and multimodal updates in LLM backbones, outperforming baselines on 57-language benchmarks while retaining multimodal performance.
When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs cs.CV · 2026-05-12 · unverdicted · none · ref 42
Layer-wise Laplacian energy of visual attention reveals hallucination emergence in MLLMs and enables LaSCD, a closed-form logit remapping strategy that mitigates hallucinations while preserving general performance.
Uncovering Intra-expert Activation Sparsity for Efficient Mixture-of-Expert Model Execution cs.LG · 2026-05-09 · conditional · none · ref 14
Pre-trained MoE models exhibit up to 90% intra-expert activation sparsity that enables up to 2.5x faster MoE layer execution when exploited in the vLLM inference system.
On Privacy Leakage in Tabular Diffusion Models: Influential Factors, Attacker Knowledge, and Metrics cs.LG · 2026-05-07 · unverdicted · none · ref 63
Tabular diffusion models leak membership information via attacks even with partial attacker knowledge, and common heuristic privacy metrics like distance-to-closest-record are unreliable.
One Turn Too Late: Response-Aware Defense Against Hidden Malicious Intent in Multi-Turn Dialogue cs.CL · 2026-05-07 · unverdicted · none · ref 30 · 2 links
TurnGate identifies the critical turn in multi-turn dialogues where a response would complete hidden malicious intent, outperforming baselines on the new MTID dataset while keeping over-refusal low.
DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents cs.AI · 2026-05-06 · unverdicted · none · ref 78
DTap is a new red-teaming platform for AI agents that uses autonomous exploration across realistic simulations to discover vulnerabilities and creates a verifiable benchmark dataset.
UAV as Urban Construction Change Monitor: A New Benchmark and Change Captioning Model cs.CV · 2026-05-06 · unverdicted · none · ref 57
PTNet is a prototype-guided task-adaptive model that jointly performs change detection and captioning on bi-temporal UAV imagery by modeling structured change semantics, outperforming prior methods on the new UCCD urban construction benchmark and WHU-CDC.
Generating Statistical Charts with Validation-Driven LLM Workflows cs.LG · 2026-05-01 · unverdicted · none · ref 24
A validation-driven LLM workflow generates 1,500 charts from 74 UCI datasets with 30,003 aligned QA pairs, revealing that current multimodal models handle chart syntax well but struggle with value extraction and reasoning.
Safety Drift After Fine-Tuning: Evidence from High-Stakes Domains cs.CY · 2026-04-27 · unverdicted · none · ref 9
Benign fine-tuning of foundation models induces large, heterogeneous, and often contradictory changes in safety metrics across general and domain-specific benchmarks.
Agentic Discovery with Active Hypothesis Exploration for Visual Recognition cs.CV · 2026-04-14 · unverdicted · none · ref 52
HypoExplore uses LLMs for hypothesis-driven evolutionary search with a Trajectory Tree and Hypothesis Memory Bank to discover lightweight vision architectures, reaching 94.11% accuracy on CIFAR-10 from an 18.91% baseline and generalizing to other datasets including state-of-the-art on MedMNIST.
Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy cs.CL · 2026-03-24 · conditional · none · ref 20
AI-generated text detectors achieve high benchmark accuracy by exploiting unstable dataset-specific linguistic features, as evidenced by cross-domain degradation and differing SHAP explanations across corpora.
How RL Unlocks the Aha Moment in Geometric Interleaved Reasoning cs.CL · 2026-03-01 · unverdicted · none · ref 37
Reinforcement learning with three causal constraints enables multimodal models to internalize diagram-reasoning links in geometry, unlike SFT which only mimics surface format and harms performance.
AgentGuard: A Multi-Agent Framework for Robust Package Confusion Detection via Hybrid Search and Metadata-Content Fusion cs.SE · 2026-01-29 · unverdicted · none · ref 40
AgentGuard detects package confusion attacks via multi-agent hybrid name search plus fused metadata-content ML analysis, raising precision 12-49% and cutting false positives 11-35% versus baselines on ConfuDB and NeupaneDB.
Cortex AISQL: A Production SQL Engine for Unstructured Data cs.DB · 2025-11-10 · unverdicted · none · ref 24
Snowflake's Cortex AISQL adds native semantic operations to SQL via AI-aware optimization, adaptive model cascades, and semantic join rewriting, delivering 2-70x speedups in production workloads.
Graph Concept Bottleneck Models cs.LG · 2025-08-19 · unverdicted · none · ref 6
GraphCBMs extend concept bottleneck models by building latent concept graphs to model correlations between concepts, yielding better image classification accuracy, more informative structure for interpretability, and stronger intervention results.
LLMs Get Lost In Multi-Turn Conversation cs.CL · 2025-05-09 · unverdicted · none · ref 74
LLMs drop 39% in performance during multi-turn conversations due to premature assumptions and inability to recover from early errors.
Long Context Transfer from Language to Vision cs.CV · 2024-06-24 · unverdicted · none · ref 41
Extending language model context length enables LMMs to process over 200K visual tokens from long videos without video training, achieving SOTA on Video-MME via dense frame sampling.
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned cs.CL · 2022-08-23 · accept · none · ref 61
RLHF-aligned language models show increasing resistance to red teaming with scale up to 52B parameters, unlike prompted or rejection-sampled models, supported by a released dataset of 38,961 attacks.
Cubit: Token Mixer with Kernel Ridge Regression cs.LG · 2026-05-07 · unverdicted · none · ref 43 · 2 links
Cubit replaces Transformer's attention with a closed-form Kernel Ridge Regression token mixer and reports larger gains as training sequence length increases.
BioResearcher: Scenario-Guided Multi-Agent for Translational Medicine cs.AI · 2026-05-07 · conditional · none · ref 9
BioResearcher is a new multi-agent system that leads baselines on single-step biomedical tests, BixBench, BaisBench, and a 30-query clinical discovery benchmark with 74.7% positive hit rate.
Proactive Dialogue Model with Intent Prediction cs.CL · 2026-04-30 · unverdicted · none · ref 2
A Temporal Bayesian Network derived from MultiWOZ intent annotations predicts user intent transitions and guides proactive dialogue generation, raising Coverage AUC from 0.742 to 0.856 while cutting turns to 75% coverage from 3.95 to 2.73.
Identifying and Mitigating Gender Cues in Academic Recommendation Letters: An Interpretability Case Study cs.LG · 2026-04-14 · unverdicted · none · ref 15
Transformer models detect applicant gender in de-gendered academic recommendation letters via implicit linguistic patterns such as associations with words like 'emotional' and 'humanitarian', and removing these cues reduces but does not eliminate prediction accuracy above chance.
Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference cs.LG · 2026-04-08 · unverdicted · none · ref 2
Flux Attention uses a context-aware Layer Router to dynamically assign full or sparse attention to each LLM layer, achieving up to 2.8x prefill and 2.0x decode speedups with competitive performance on long-context and reasoning tasks.
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments cs.AI · 2026-03-25 · unverdicted · none · ref 97
An empirical literature analysis reveals a bifurcation in RL environments into Semantic Prior (LLM-dominated) and Domain-Specific Generalization ecosystems with distinct cognitive fingerprints.
Uncertainty Estimation for the Open-Set Text Classification systems cs.CL · 2026-03-17 · unverdicted · none · ref 12
Adapting HolUE to open-set text classification yields 40-365% gains in Prediction Rejection Ratio over baselines on authorship, intent, and topic datasets.
TeamPath: Building MultiModal Pathology Experts with Reasoning AI Copilots q-bio.QM · 2025-11-20 · unverdicted · none · ref 51
TeamPath introduces a reinforcement-learning-powered multimodal AI copilot for pathology that generates reasoned diagnoses and integrates image and transcriptomic data.
GIFT: Group-Relative Implicit Fine-Tuning Integrates GRPO with DPO and UNA cs.LG · 2025-10-27 · unverdicted · none · ref 18
GIFT matches the optimal policy of GRPO using an endogenous prompt-dependent KL coefficient derived via z-score standardization of implicit rewards.
MERIT: Modular Framework for Multimodal Misinformation Detection with Web-Grounded Reasoning cs.AI · 2025-10-20 · unverdicted · none · ref 12
MERIT achieves 81.65% F1 on MMFakeBench for multimodal misinformation detection via a four-module framework, outperforming zero-shot baselines like GPT-4V with MMD-Agent at 74.0% F1, with gains attributed to architectural design.
SLIP: Soft Label Mechanism and Key-Extraction-Guided CoT-based Defense Against Instruction Backdoor in APIs cs.CR · 2025-08-08 · unverdicted · none · ref 33
SLIP combines a soft label mechanism with key-extraction-guided CoT to reduce instruction backdoor attack success rate to 25.13% and raise clean accuracy to 87.15% in LLM agents.
TableMaster: A Recipe to Advance Table Understanding with Language Models cs.CL · 2025-01-31 · unverdicted · none · ref 57
TableMaster improves LM table understanding by verbalizing tables with enriched semantics and using adaptive textual-symbolic reasoning, reaching 78.13% accuracy on WikiTQ with GPT-4o-mini.
Causal Fine-Tuning under Latent Confounded Shift cs.LG · 2024-10-18 · unverdicted · none · ref 19
Causal Fine-Tuning decomposes BERT representations into causal and spurious parts via SCM inductive bias to improve robustness under latent confounded shifts in text classification.
ZAYA1-VL-8B Technical Report cs.CV · 2026-05-08 · unverdicted · none · ref 51
ZAYA1-VL-8B is a new MoE vision-language model with vision-specific LoRA adapters and bidirectional image attention that reports competitive performance against several 3B-4B models on image, reasoning, and counting benchmarks.
A Multi-Dimensional Audit of Politically Aligned Large Language Models cs.CL · 2026-04-27 · unverdicted · none · ref 30
A multi-dimensional audit framework for politically aligned LLMs finds consistent trade-offs: larger models are more effective and truthful but less fair with higher bias, while fine-tuned models reduce bias but increase hallucinations and reasoning decline, and all tested models show deficiencies.
Detecting Alarming Student Verbal Responses using Text and Audio Classifier cs.CL · 2026-04-17 · unverdicted · none · ref 8
A hybrid text-plus-audio classifier framework is introduced to identify potentially troubling student responses by analyzing both what is said and how it is said.
