Pairwise scoring signals in Vision Transformer token reduction are inherently unstable due to high perturbation counts, and they degrade in deep layers, causing collapse. Unary signals with triage enable CATIS to retain 96.9% accuracy at a 63% FLOPs reduction with ViT-Large on ImageNet-1K.
78 Pith papers cite this work.
claims ledger
- method: It compares the model's prediction for the entire trajectory against the ground-truth trajectory label, $L_{\text{traj}}(s, a)$: $\mathcal{L}_{\text{traj}} = \mathcal{L}_{\text{BCE}}\big(R_\phi(s, a \mid x),\, R_{\text{traj}}\big)$ (12), where $\sigma(\cdot)$ denotes the sigmoid function, which converts the model's raw logit outputs into probabilities. $\mathcal{L}_{\text{BCE}}(\cdot, \cdot)$ denotes the BCE loss function. For a ground-truth label $L \in \{0, 1\}$ and a model logit output $R_\phi$, it is defined as $\mathcal{L}_{\text{BCE}}(R_\phi, L) = -\big[L \log \sigma(R_\phi) + (1 - L) \log\big(1 - \sigma(R_\phi)\big)\big]$. By jointly optimizing this objective, Fin-PRM is train
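The BCE definition quoted in the ledger entry above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the Fin-PRM training code: the function names `sigmoid` and `bce_with_logit` are hypothetical, and the second function uses an algebraically equivalent, numerically stable rewrite of the same expression.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a probability, avoiding overflow for large |logit|."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def bce_with_logit(logit: float, label: int) -> float:
    """Binary cross-entropy for one logit and one label in {0, 1}.

    Equivalent to -[L*log(sigmoid(R)) + (1-L)*log(1-sigmoid(R))],
    rewritten as max(R, 0) - R*L + log(1 + exp(-|R|)) for stability.
    """
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


if __name__ == "__main__":
    # A confident, correct prediction gives a small loss;
    # a confident, wrong prediction gives a large one.
    print(bce_with_logit(4.0, 1))
    print(bce_with_logit(4.0, 0))
```

The stable form avoids taking the logarithm of a probability that has rounded to exactly 0 or 1, which the direct formula would hit for logits of large magnitude.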
roles: method 1
polarities: use method 1
representative citing papers
DTDR dynamically retrieves relevant tools by modeling dependencies from demonstrations and conditioning on the evolving agent plan, improving function calling success rates by 23-104% over static retrievers across benchmarks.
An incremental rank-lifting algorithm updates winning regions and policies in data-driven stochastic game abstractions by exploiting monotonic growth of under-approximations and shrinkage of over-approximations.
UWE is a task-agnostic bi-encoder that uses many-to-many InfoNCE and token-level soft late interaction to achieve zero-shot ranking across unseen work-related target spaces while using far fewer parameters than Qwen3-8B and improving MAP by 4.4 points.
MAS-Bench introduces 139 tasks, 88 predefined shortcuts, and 9 metrics to evaluate hybrid GUI-shortcut mobile agents, reporting up to 68.3% success and 39% efficiency gains over GUI-only baselines.
MECAT is a multi-expert benchmark for audio AI offering fine-grained captions and QA pairs generated via expert models and LLM reasoning, paired with the DATE metric that combines semantic similarity and cross-sample discriminability to favor detailed outputs.
VoteGCL augments graph-based recommendation systems with high-confidence synthetic interactions generated via majority-voting LLM reranks and integrates them into graph contrastive learning to improve accuracy and reduce popularity bias.
OSPO trains optimal order dispatch policies for homogeneous AV fleets using only one-step group rewards, outperforming GRPO on a real ride-hailing dataset.
TRAM is a test-time mixture method that scores and composes risk-neutral source policies using reward and occupancy-based risk to achieve new reward-risk tradeoffs without parameter updates.
Counterspeech reduces the likelihood that hate-speech-using newcomers continue posting in hate subreddits, though toxic counterspeech raises the chance of continued hostility in the thread.
MDSE attack uses dynamic multi-surrogate gradient estimation to create adversarial examples that simultaneously fool SNNs, ViTs, and CNNs, with reported gains up to 91.4% on ensembles and 3x on adversarially trained SNNs versus Auto-PGD.
Stronger reasoning models in LLMs reduce behavioral negotiation by defaulting to authority outcomes in multi-agent settings, unlike structured scaffolds that enable concessions.
Introduces a four-axis difficulty taxonomy integrated into an enterprise RAG benchmark to systematically diagnose multi-dimensional challenges like reasoning complexity and retrieval difficulty.
TeCQR retrieves related questions in cQA by generating tag-enhanced clarifying questions, using noise-tolerant semantic matching, and two-stage training to learn fine-grained representations of queries, questions, and tags.
STORM enables robust reference-conditioned 6D pose tracking from one image via hierarchical spatial fusion attention and a BCE-trained verifier that detects drift for automatic re-initialization.
Routing architecture for MLLMs enables continual learning with constant compute, matching multi-task learning performance and supporting cross-modal transfer.
Quantization of VLMs improves multiple reliability metrics beyond accuracy by damping high-rank spectral components and promoting reliance on robust low-rank features.
PeCL applies token-level dynamic differential privacy and privacy-guided memory sculpting to achieve superior privacy-utility balance in continual learning.
LoFT uses parameter-efficient fine-tuning of foundation models for long-tailed semi-supervised learning, supported by proofs that this reduces hypothesis complexity to minimize balanced posterior error and compresses outlier acceptance regions, with LoFT-OW handling open-world OOD cases.
Sparse autoencoders plus greedy filtering and factorization-machine interaction modeling identify minimal sets of features in Gemma-2-2B-IT and LLaMA-3.1-8B-IT whose ablation produces jailbreaks by flipping refusal to compliance.
TeRA parametrizes high-rank LLM weight updates via a random Tucker-like tensor network with shared frozen factors and layer-specific scaling vectors, matching high-rank adapter performance at vector-level parameter counts.
CogDriver-Agent with sparse temporal memory and spatiotemporal distillation on CogDriver-Data achieves 22% higher closed-loop Driving Score on Bench2Drive and 21% lower mean L2 error on nuScenes.
GSR jointly trains LLMs to generate candidate solutions and refine a superior final answer from them, achieving state-of-the-art performance on five mathematical benchmarks while transferring across model scales.
Fin-PRM is a domain-specialized process reward model that supplies binary step-level and trajectory-level supervision signals for financial reasoning in LLMs and outperforms general PRMs on CFLUE and FinQA benchmarks.