OSA improves LLM-based recommenders by anchoring ordinal preference levels as numeric tokens in the model's latent space to retain fine-grained strength information when fusing collaborative signals.
Mixed citations
Title resolution pending
Mixed citation behavior. Most common role is method (60%).
citation-role summary
citation-polarity summary
representative citing papers
SMTPO uses multi-task SFT to improve simulator feedback quality and RL with fine-grained rewards to optimize multi-turn preference reasoning in LLM-based conversational recommendation.
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.
EvoESAP uses evolutionary search guided by a speculative-decoding-inspired ESAP metric to discover non-uniform layer-wise sparsity allocations for MoE expert pruning, improving generation accuracy up to 19.6% at 50% sparsity.
ProAgent uses on-demand tiered perception and context-aware LLM reasoning to deliver proactive assistance on AR glasses, achieving up to 27.7% higher prediction accuracy and 20.5% lower false detections than baselines.
ScanVLA uses a vision-language model with a history-enhanced decoder and frozen segmentation LoRA to outperform prior methods on object-referring scanpath prediction.
LWGR applies personalized soft instructions for LLM knowledge extraction and Lagrangian primal-dual optimization to selectively fuse beneficial world knowledge into generative recommendation while bounding degradation.
SpotSound adds a hallucination-suppressing objective and a needle-in-haystack benchmark to audio-language models, reaching state-of-the-art temporal grounding while keeping general task performance.
UniDetect is an LLM-based system that generates universal transaction summary texts and uses two-stage multimodal training on text plus graphs to detect fraudulent accounts across heterogeneous blockchains, outperforming baselines by 5.57-7.58% KS and achieving over 94.58% zero-shot cross-chain and
ARHN refines hard-negative training data for dense retrieval by using LLMs to convert answer-containing passages into additional positives and exclude answer-containing passages from the negative set.
KnowSA_CKP uses comparative knowledge probing to selectively augment LLM prompts for items with knowledge gaps, improving recommendation accuracy and context efficiency.
UATTA adapts pre-trained text-image models at test time without labels by using disagreement in bidirectional retrieval rankings to estimate and mitigate uncertainty for improved person search.
CRAB mitigates popularity bias in generative recommenders by rebalancing the semantic token codebook through splitting popular tokens and applying a tree-structured regularizer to boost representations for unpopular items.
CKG-LLM uses LLMs to generate executable queries over contract knowledge graphs for detecting access control vulnerabilities and reports superior performance versus existing tools.
MAP4TS combines global, local, statistical, and temporal prompts derived from classical time-series analysis with raw embeddings via cross-modality alignment to improve LLM forecasting performance across eight datasets.
Multi-stage LLM training plus compiler-guided error repair boosts functional equivalence in Java-to-Cangjie translation by 6.06% over prior methods despite scarce parallel data.
DPPMG learns discrete modal-specific preferences via a dedicated GNN from multimodal user data, quantizes them into tokens, and feeds them into generators with a consistency reward to produce personalized text and images.
The UPDP pipeline filters privacy terms and generates de-identified radiology images that preserve diagnostic pathology information, enabling models with competitive disease detection accuracy but reduced identity leakage and improved cross-hospital performance.
WRF4CIR uses weight-regularized fine-tuning with adversarial perturbations to mitigate overfitting in composed image retrieval and narrows the generalization gap on benchmarks.
SDA uses structural alignment as a soft teacher and gated low-rank expert paths to adapt LVLMs for multimodal recommendation, reporting 6.15% Hit@10 and 8.64% NDCG@10 average gains plus larger long-tail improvements on Amazon datasets.
SHE is a new RL framework using stepwise hybrid examination rewards to improve reasoning quality and accuracy in large-scale e-commerce query-product relevance prediction.
OpsLLM is a domain-specific LLM for software ops QA and RCA built with human-curated data, SFT, and RL using a domain process reward model, showing accuracy gains of 0.2-5.7% on QA and 2.7-70.3% on RCA over general LLMs.
A pipeline of dataset construction from prior work, AugFC parameter augmentation, and two-step LLM training improves function calling for financial APIs and is running in production.
citing papers explorer
-
Every Preference Has Its Strength: Injecting Ordinal Semantics into LLM-Based Recommenders
OSA improves LLM-based recommenders by anchoring ordinal preference levels as numeric tokens in the model's latent space to retain fine-grained strength information when fusing collaborative signals.
-
User Simulator-Guided Multi-Turn Preference Optimization for Reasoning LLM-based Conversational Recommendation
SMTPO uses multi-task SFT to improve simulator feedback quality and RL with fine-grained rewards to optimize multi-turn preference reasoning in LLM-based conversational recommendation.
-
HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness
HeiSD delivers up to 2.45x faster inference for embodied VLA models by hybridizing speculative decoding with kinematic boundary detection and error-mitigation tricks while preserving task success rates.
-
EvoESAP: Non-Uniform Expert Pruning for Sparse MoE
EvoESAP uses evolutionary search guided by a speculative-decoding-inspired ESAP metric to discover non-uniform layer-wise sparsity allocations for MoE expert pruning, improving generation accuracy up to 19.6% at 50% sparsity.
-
ProAgent: Harnessing On-Demand Sensory Contexts for Proactive LLM Agent Systems in the Wild
ProAgent uses on-demand tiered perception and context-aware LLM reasoning to deliver proactive assistance on AR glasses, achieving up to 27.7% higher prediction accuracy and 20.5% lower false detections than baselines.
-
Object Referring-Guided Scanpath Prediction with Perception-Enhanced Vision-Language Models
ScanVLA uses a vision-language model with a history-enhanced decoder and frozen segmentation LoRA to outperform prior methods on object-referring scanpath prediction.
-
LWGR: Lagrangian-Constrained Personalized World Knowledge for Generative Recommendation
LWGR applies personalized soft instructions for LLM knowledge extraction and Lagrangian primal-dual optimization to selectively fuse beneficial world knowledge into generative recommendation while bounding degradation.
-
SpotSound: Enhancing Large Audio-Language Models with Fine-Grained Temporal Grounding
SpotSound adds a hallucination-suppressing objective and a needle-in-haystack benchmark to audio-language models, reaching state-of-the-art temporal grounding while keeping general task performance.
-
UniDetect: LLM-Driven Universal Fraud Detection across Heterogeneous Blockchains
UniDetect is an LLM-based system that generates universal transaction summary texts and uses two-stage multimodal training on text plus graphs to detect fraudulent accounts across heterogeneous blockchains, outperforming baselines by 5.57-7.58% KS and achieving over 94.58% zero-shot cross-chain and
-
ARHN: Answer-Centric Relabeling of Hard Negatives with Open-Source LLMs for Dense Retrieval
ARHN refines hard-negative training data for dense retrieval by using LLMs to convert answer-containing passages into additional positives and exclude answer-containing passages from the negative set.
-
Filling the Gaps: Selective Knowledge Augmentation for LLM Recommenders
KnowSA_CKP uses comparative knowledge probing to selectively augment LLM prompts for items with knowledge gaps, improving recommendation accuracy and context efficiency.
-
Pretrain-then-Adapt: Uncertainty-Aware Test-Time Adaptation for Text-based Person Search
UATTA adapts pre-trained text-image models at test time without labels by using disagreement in bidirectional retrieval rankings to estimate and mitigate uncertainty for improved person search.
-
CRAB: Codebook Rebalancing for Bias Mitigation in Generative Recommendation
CRAB mitigates popularity bias in generative recommenders by rebalancing the semantic token codebook through splitting popular tokens and applying a tree-structured regularizer to boost representations for unpopular items.
-
CKG-LLM: LLM-Assisted Detection of Smart Contract Access Control Vulnerabilities Based on Knowledge Graphs
CKG-LLM uses LLMs to generate executable queries over contract knowledge graphs for detecting access control vulnerabilities and reports superior performance versus existing tools.
-
MAP4TS: A Multi-Aspect Prompting Framework for Time-Series Forecasting with Large Language Models
MAP4TS combines global, local, statistical, and temporal prompts derived from classical time-series analysis with raw embeddings via cross-modality alignment to improve LLM forecasting performance across eight datasets.
-
Boosting Automatic Java-to-Cangjie Translation with Multi-Stage LLM Training and Error Repair
Multi-stage LLM training plus compiler-guided error repair boosts functional equivalence in Java-to-Cangjie translation by 6.06% over prior methods despite scarce parallel data.
-
Discrete Preference Learning for Personalized Multimodal Generation
DPPMG learns discrete modal-specific preferences via a dedicated GNN from multimodal user data, quantizes them into tokens, and feeds them into generators with a consistency reward to produce personalized text and images.
-
A Utility-preserving De-identification Pipeline for Cross-hospital Radiology Data Sharing
The UPDP pipeline filters privacy terms and generates de-identified radiology images that preserve diagnostic pathology information, enabling models with competitive disease detection accuracy but reduced identity leakage and improved cross-hospital performance.
-
WRF4CIR: Weight-Regularized Fine-Tuning Network for Composed Image Retrieval
WRF4CIR uses weight-regularized fine-tuning with adversarial perturbations to mitigate overfitting in composed image retrieval and narrows the generalization gap on benchmarks.
-
Structural and Disentangled Adaptation of Large Vision Language Models for Multimodal Recommendation
SDA uses structural alignment as a soft teacher and gated low-rank expert paths to adapt LVLMs for multimodal recommendation, reporting 6.15% Hit@10 and 8.64% NDCG@10 average gains plus larger long-tail improvements on Amazon datasets.
-
SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance
SHE is a new RL framework using stepwise hybrid examination rewards to improve reasoning quality and accuracy in large-scale e-commerce query-product relevance prediction.
-
An End-to-End Framework for Building Large Language Models for Software Operations
OpsLLM is a domain-specific LLM for software ops QA and RCA built with human-curated data, SFT, and RL using a domain process reward model, showing accuracy gains of 0.2-5.7% on QA and 2.7-70.3% on RCA over general LLMs.
-
Data-Driven Function Calling Improvements in Large Language Model for Online Financial QA
A pipeline of dataset construction from prior work, AugFC parameter augmentation, and two-step LLM training improves function calling for financial APIs and is running in production.