Warren, Lu Cheng, Haidar M
Mixed citation behavior. Most common role is background (56%).
representative citing papers
Seven clinician-informed safety criteria enable LLM-as-a-Judge to reach substantial agreement with human consensus (Cohen's κ up to 0.75) on evaluating LLM responses to users demonstrating psychosis.
Forest proximities admit an exact sparse factorization via separable weighted leaf-collision kernels that reduces computation to sparse linear algebra over leaf collisions.
Fine-tuned LLMs produce plausible counterfactuals for health interventions and recover 20% F1 via data augmentation in label-scarce sensor datasets.
The paper introduces the InsideOut benchmark to quantify insider-outsider bias in LLM-generated interview scripts across 10 cultures and shows that multi-agent mitigation frameworks substantially reduce the bias on metrics like Cultural Alignment Gap.
citing papers explorer
-
Optical Music Recognition for Real-World Manuscripts with Synthetic Data
Domain adaptation via synthetic manuscript images improves OMR performance on real-world piano manuscripts without requiring in-domain symbols.
-
PackSELL: A Sparse Matrix Format for Precision-Agnostic High-Performance SpMV
PackSELL packs delta-encoded indices and values into single words with tunable bit allocation, delivering up to 1.63x faster FP16 SpMV and FP32-accurate performance exceeding FP16 cuSPARSE while reducing memory traffic.
-
AlphaEvolve: A coding agent for scientific and algorithmic discovery
AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, plus optimizations for Google data centers and hardware.
-
TabPATE: Differentially Private Tabular In-Context Learning Without Public Data
TabPATE applies a PATE-style private aggregation to synthetic tabular queries generated from feature ranges, enabling private in-context learning with near-random membership inference success while keeping competitive utility.
-
Cross-Lingual Exploration for Parametric Knowledge
Cross-lingual prompt exploration improves factual recall and consistency in LLMs across 17 languages more efficiently than native-language scaling.
-
PIPER: Content-Based Table Search via profiling and LLM-Generated Pseudoqueries
PIPER retrieves and ranks tabular datasets by profiling their content and using LLM-generated queries for dense vector search, outperforming metadata baselines and TableQA methods in low-metadata settings.
-
Macro: Enhancing Multilingual Counterfactual Explanations through Alignment-as-Preference Optimization
Macro uses DPO on composite preference pairs to raise validity of multilingual self-generated counterfactual explanations by 12.55% on average over chain-of-thought while preserving minimality.
-
Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization
SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.
-
Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text
The paper proposes methods for constructing Hypergraphs of Text and a new effort ratio metric; in its experiments, TF-IDF baselines match LLM methods.
-
A Catalog of Data Errors
A new catalog classifying 35 data error types into missing, incorrect, and redundant categories for tabular data, with definitions and examples to improve data quality management.
-
MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems
MONETA is the first multimodal benchmark for industry classification using text and geographic sources, with MLLM baselines at 62-74% accuracy and up to 22.8% gains from multi-turn context enrichment and explanations.
-
When Summaries Distort Decisions: Information Fidelity in LLM-Compressed Financial Analysis
LLM-based compression of financial source material can alter downstream investment decisions via decontextualization and model dependency, addressed by an agentic auditing approach that checks multiple compressions against the original.
-
AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks
Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.
-
NEURON: A Neuro-symbolic System for Grounded Clinical Explainability
NEURON integrates SNOMED CT, ML, and RAG LLM to raise AUC from 0.74-0.77 to 0.84-0.88 and human-aligned explainability scores from 0.50 to 0.85 on MIMIC-IV acute heart failure data.
-
Context-Mediated Domain Adaptation in Multi-Agent Sensemaking Systems
Context-mediated domain adaptation treats user modifications to AI artifacts as implicit domain specifications that reshape LLM-powered multi-agent reasoning, demonstrated via the Seedentia system which extracted 46 domain knowledge entries from expert edits.
-
ML-Powered LDAP Reconnaissance Detection using Weak Supervision
A weakly supervised ML classifier and hypothesis-testing signature mining detect LDAP reconnaissance at 65% TPR and 81.48% field precision.
-
Bridging the Smart City Cybersecurity Data Gap Through AI-Driven Synthetic Dataset Generation
Proposes an AI-driven synthetic data generation framework to create realistic cybersecurity datasets for smart city research where real data is scarce or sensitive.
-
The social consequences of AI delegation
This perspective paper calls for a research program treating LLMs as consequential social actors whose outputs influence human decisions, norms, and collective dynamics.
-
A Theory-Guided LLM Pedagogical Agent for STEM+C Scaffolding Without Over-Reliance
Copa is a theory-guided multimodal LLM agent that supports high school computational modeling through adaptive feedback, shown in a 33-dyad study to increase student confidence and conceptual verbalization without fostering dependence.
-
MedicalRec: Medical recommender system for image classification without retraining
A transformer recommender system trained on a new benchmark of over 5,000 model performances from medical imaging papers achieves up to 75.5% HitRate@100.
-
Opportunities and Risks of Generative AI through the Health Information Journey
The authors propose a four-stage framework for analyzing the opportunities and risks of generative AI across the health information journey, from public sources to clinical care.
-
Modality vs. Morphology: A Framework for Time Series Classification for Biological Signals
A review synthesizes evidence from EEG, EMG, ECG, PPG and ocular signals to argue that waveform morphology, rather than modality or model class, primarily determines TSC performance and interpretability.
-
Assessment of RAG and Fine-Tuning for Industrial Question-Answering-Applications
RAG is more effective and cost-efficient than fine-tuning for industrial QA adaptation on automotive datasets.
-
Towards Enabling An Artificial Self-Construction Software Life-cycle via Autopoietic Architectures
Proposes autopoietic architectures for self-constructing software as a fundamental shift in the SDLC, leveraging foundation models for autonomous evolution and maintenance.
-
"Skill issues": data-centric optimization of lakehouse agents
Data-centric optimization of skills for agents on a branching lakehouse improves accuracy by 31.9% on 25 tasks via state-verification evaluation.