hub

" * write output.state after.block = add.period write newline

ENTRY address archivePrefix author booktitle chapter edition editor eid eprint howpublished institution isbn journal key month note number organization pages publisher school serie

78 Pith papers cite this work. Polarity classification is still indexing.

78 Pith papers citing it

browse 78 citing papers

hub tools

JSON dossier citing papers JSON

citation-role summary

method 1

citation-polarity summary

use method 1

claims ledger

method It compares the model's prediction for the entire trajec- tory against the ground-truth trajectory label, Ltraj(s, a): Ltraj = LBCE Rϕ(s, a | x), Rtraj (12) where σ(·) denotes the sigmoid function, which converts the model's raw logit outputs into probabilities. LBCE(·, ·) denotes the BCE loss function. For a ground-truth label L ∈ { 0, 1} and a model logit output Rϕ, it is defined as LBCE(Rϕ, L) = −[L log σ(Rϕ) + (1− L) log(1 − σ(Rϕ))], By jointly optimizing this objective, Fin-PRM is train

co-cited works

representative citing papers

Why Training-Free Token Reduction Collapses: The Inherent Instability of Pairwise Scoring Signals

cs.AI · 2026-04-17 · unverdicted · novelty 7.0

Pairwise scoring signals in Vision Transformer token reduction are inherently unstable due to high perturbation counts and degrade in deep layers, causing collapse, while unary signals with triage enable CATIS to retain 96.9% accuracy at 63% FLOPs reduction on ViT-Large ImageNet-1K.

Dynamic Tool Dependency Retrieval for Lightweight Function Calling

cs.LG · 2025-12-18 · unverdicted · novelty 7.0

DTDR dynamically retrieves relevant tools by modeling dependencies from demonstrations and conditioning on the evolving agent plan, improving function calling success rates by 23-104% over static retrievers across benchmarks.

Incremental Data-Driven Policy Synthesis via Game Abstractions

cs.GT · 2025-11-14 · unverdicted · novelty 7.0

An incremental rank-lifting algorithm updates winning regions and policies in data-driven stochastic game abstractions by exploiting monotonic growth of under-approximations and shrinkage of over-approximations.

Unified Work Embeddings: Contrastive Learning of a Bidirectional Multi-task Ranker

cs.CL · 2025-11-11 · unverdicted · novelty 7.0

UWE is a task-agnostic bi-encoder that uses many-to-many InfoNCE and token-level soft late interaction to achieve zero-shot ranking across unseen work-related target spaces while using far fewer parameters than Qwen3-8B and improving MAP by 4.4 points.

MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents

cs.AI · 2025-09-08 · conditional · novelty 7.0

MAS-Bench introduces 139 tasks, 88 predefined shortcuts, and 9 metrics to evaluate hybrid GUI-shortcut mobile agents, reporting up to 68.3% success and 39% efficiency gains over GUI-only baselines.

MECAT: A Multi-Experts Constructed Benchmark for Fine-Grained Audio Understanding Tasks

eess.AS · 2025-07-31 · unverdicted · novelty 7.0

MECAT is a multi-expert benchmark for audio AI offering fine-grained captions and QA pairs generated via expert models and LLM reasoning, paired with the DATE metric that combines semantic similarity and cross-sample discriminability to favor detailed outputs.

VoteGCL: Enhancing Graph-based Recommendations with Majority-Voting LLM-Rerank Augmentation

cs.IR · 2025-07-29 · unverdicted · novelty 7.0

VoteGCL augments graph-based recommendation systems with high-confidence synthetic interactions generated via majority-voting LLM reranks and integrates them into graph contrastive learning to improve accuracy and reduce popularity bias.

One Step is Enough: Multi-Agent Reinforcement Learning based on One-Step Policy Optimization for Order Dispatch on Ride-Sharing Platforms

cs.AI · 2025-07-21 · conditional · novelty 7.0

OSPO trains optimal order dispatch policies for homogeneous AV fleets using only one-step group rewards, outperforming GRPO on a real ride-hailing dataset.

TRAM: Test-Time Risk Adaptation with Mixture of Agents

cs.LG · 2024-08-16 · unverdicted · novelty 7.0

TRAM is a test-time mixture method that scores and composes risk-neutral source policies using reward and occupancy-based risk to achieve new reward-risk tradeoffs without parameter updates.

Assessing How Hate, Counterspeech, and Toxicity Affect Hate Group Newcomers

cs.CY · 2024-05-28 · unverdicted · novelty 7.0

Counterspeech reduces the likelihood that hate-speech-using newcomers continue posting in hate subreddits, though toxic counterspeech raises the chance of continued hostility in the thread.

Attacking the Spike: On the Transferability and Security of Spiking Neural Networks to Adversarial Examples

cs.NE · 2022-09-07 · unverdicted · novelty 7.0

MDSE attack uses dynamic multi-surrogate gradient estimation to create adversarial examples that simultaneously fool SNNs, ViTs, and CNNs, with reported gains up to 91.4% on ensembles and 3x on adversarially trained SNNs versus Auto-PGD.

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

cs.LG · 2026-04-12 · unverdicted · novelty 6.0

Stronger reasoning models in LLMs reduce behavioral negotiation by defaulting to authority outcomes in multi-agent settings, unlike structured scaffolds that enable concessions.

Overcoming the "Impracticality" of RAG: Proposing a Real-World Benchmark and Multi-Dimensional Diagnostic Framework

cs.CL · 2026-04-03 · unverdicted · novelty 6.0

Introduces a four-axis difficulty taxonomy integrated into an enterprise RAG benchmark to systematically diagnose multi-dimensional challenges like reasoning complexity and retrieval difficulty.

Beyond Static: Related Questions Retrieval Through Conversations in Community Question Answering

cs.IR · 2026-03-09 · unverdicted · novelty 6.0

TeCQR retrieves related questions in cQA by generating tag-enhanced clarifying questions, using noise-tolerant semantic matching, and two-stage training to learn fine-grained representations of queries, questions, and tags.

STORM: Segment, Track, and Object Re-Localization from a Single Image

cs.CV · 2025-11-12 · unverdicted · novelty 6.0

STORM enables robust reference-conditioned 6D pose tracking from one image via hierarchical spatial fusion attention and a BCE-trained verifier that detects drift for automatic re-initialization.

Routing-Based Continual Learning for Multimodal Large Language Models

cs.LG · 2025-11-03 · unverdicted · novelty 6.0

Routing architecture for MLLMs enables continual learning with constant compute, matching multi-task learning performance and supporting cross-modal transfer.

Less Precise Can Be More Reliable: A Systematic Evaluation of Quantization's Impact on VLMs Beyond Accuracy

cs.CV · 2025-09-25 · unverdicted · novelty 6.0 · 2 refs

Quantization of VLMs improves multiple reliability metrics beyond accuracy by damping high-rank spectral components and promoting reliance on robust low-rank features.

Forget What's Sensitive, Remember What Matters: Token-Level Differential Privacy in Memory Sculpting for Continual Learning

cs.AI · 2025-09-16 · unverdicted · novelty 6.0

PeCL applies token-level dynamic differential privacy and privacy-guided memory sculpting to achieve superior privacy-utility balance in continual learning.

LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning in Open-World Scenarios

cs.LG · 2025-09-12 · unverdicted · novelty 6.0

LoFT uses parameter-efficient fine-tuning of foundation models for long-tailed semi-supervised learning, supported by proofs that this reduces hypothesis complexity to minimize balanced posterior error and compresses outlier acceptance regions, with LoFT-OW handling open-world OOD cases.

Beyond I'm Sorry, I Can't: Dissecting Large Language Model Refusal

cs.CL · 2025-09-07 · unverdicted · novelty 6.0

Sparse autoencoders plus greedy filtering and factorization-machine interaction modeling identify minimal sets of features in Gemma-2-2B-IT and LLaMA-3.1-8B-IT whose ablation produces jailbreaks by flipping refusal to compliance.

TeRA: Vector-based Random Tensor Network for High-Rank Adaptation of Large Language Models

cs.LG · 2025-09-03 · unverdicted · novelty 6.0

TeRA parametrizes high-rank LLM weight updates via a random Tucker-like tensor network with shared frozen factors and layer-specific scaling vectors, matching high-rank adapter performance at vector-level parameter counts.

CogDriver: Integrating Cognitive Inertia for Temporally Coherent Planning in Autonomous Driving

cs.CV · 2025-08-31 · unverdicted · novelty 6.0

CogDriver-Agent with sparse temporal memory and spatiotemporal distillation on CogDriver-Data achieves 22% higher closed-loop Driving Score on Bench2Drive and 21% lower mean L2 error on nuScenes.

Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs

cs.LG · 2025-08-27 · conditional · novelty 6.0

GSR jointly trains LLMs to generate candidate solutions and refine a superior final answer from them, achieving state-of-the-art performance on five mathematical benchmarks while transferring across model scales.

Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

cs.CL · 2025-08-21 · unverdicted · novelty 6.0

Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.

citing papers explorer

Showing 2 of 2 citing papers after filters.

From Next Token Prediction to (STRIPS) World Models cs.AI · 2025-09-16 · unreviewed · ref 22
The Ratchet Effect in Silico: How Interaction Drives Cumulative Intelligence in Large Language Models cs.LG · 2025-07-25 · unreviewed · ref 1

" * write output.state after.block = add.period write newline

hub tools

citation-role summary

citation-polarity summary

claims ledger

co-cited works

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer