Attention is all you need

· 2017

18 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Toward Generalizable Forgery Detection and Reasoning

cs.CV · 2025-03-27 · unverdicted · novelty 7.0

FakeReasoning is an MLLM-based framework for unified forgery detection and reasoning on AI-generated images, supported by the new MMFR-Dataset of 120K images and 378K annotations across 10 generators.

Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation

cs.LG · 2026-03-29 · unverdicted · novelty 6.0

DBR-AF decouples cross-variable correlations in reconstruction and applies autoregressive flows to model residual densities for improved anomaly detection in multivariate time series.

Versatile yet Efficient Network Traffic Analysis: Offloading Network Foundation Model to SmartNIC

cs.NI · 2025-08-04 · unverdicted · novelty 6.0

Nepco offloads network foundation models to SmartNICs using localized byte-sequence modeling and a pattern-aware convolutional architecture to achieve competitive macro F1 scores with 328x lower end-to-end latency than prior foundation models.

AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model

cs.LG · 2024-08-01 · unverdicted · novelty 6.0

AutoPV applies neural architecture search with a custom search space drawn from time series forecasting and photovoltaic models to automatically produce architectures that outperform predefined state-of-the-art models on a Chinese solar station dataset.

Robust Adaptation of Foundation Models with Black-Box Visual Prompting

cs.CV · 2024-07-04 · unverdicted · novelty 6.0

BlackVIP adapts foundation models via a Coordinator for input-dependent visual prompts and SPSA-GC for gradient estimation, enabling robust transfer on 19 datasets with low memory use and a link to randomized smoothing robustness.

Attention-Based Deep Reinforcement Learning for Qubit Allocation in Modular Quantum Architectures

quant-ph · 2024-06-17 · unverdicted · novelty 6.0

An attention-based DRL agent with Transformer encoder and GNN learns heuristics for qubit-to-core allocation in multi-core quantum systems to minimize state transfers and online compilation time.

Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration

cs.CV · 2023-12-31 · unverdicted · novelty 6.0

Diff-PCR uses a diffusion model to learn denoising directions for refining doubly stochastic correspondence matrices, improving point cloud registration over one-shot normalization methods.

netFound: Principled Design for Network Foundation Models

cs.NI · 2023-10-25 · unverdicted · novelty 6.0

netFound is a pretrained network foundation model using protocol-aware tokenization, context embedding, hierarchical attention, and privacy design that reaches F1 0.95 on exogenous context discrimination versus under 0.62 for prior models.

Learning Project-wise Subsequent Code Edits via Interleaving Neural-based Induction and Tool-based Deduction

cs.SE · 2026-04-14 · unverdicted · novelty 5.0

TRACE improves project-wise subsequent code editing by interleaving neural-based induction for semantic edits and tool-based deduction for syntactic edits.

SMFD-UNet: Semantic Face Mask Is The Only Thing You Need To Deblur Faces

cs.CV · 2026-04-08 · unverdicted · novelty 5.0

SMFD-UNet deblurs faces by generating semantic component masks from blurry inputs and fusing them via multi-stage UNet with residual dense blocks and attention, reporting higher PSNR and SSIM on CelebA than prior models.

RouteFormer: A Transformer-Based Routing Framework for Autonomous Vehicles

cs.RO · 2025-04-07 · unverdicted · novelty 5.0

RouteFormer is a transformer-RL hybrid for single-agent graph routing that reports 10% and 7% shorter distances than Concorde and LKH-3 on mission-like graphs by incorporating constraints the solvers ignore.

Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning

cs.RO · 2024-12-27 · unverdicted · novelty 4.0

A centralized HRL planner with HTAN, multi-stage curricula, and counterfactual baseline scales multi-robot task planning to 200 robots and 1000 racks on unlearned maps in RMFS.

Flemme: A Flexible and Modular Learning Platform for Medical Images

eess.IV · 2024-08-18 · unverdicted · novelty 4.0

Flemme is a modular platform separating encoders (conv/transformer/SSM) from encoder-decoder architectures for medical images, with a hierarchical pyramid loss yielding reported average gains of 5.6% Dice and 5.57% PSNR.

Multimodal Sentiment Analysis with Missing Modality: A Knowledge-Transfer Approach

cs.SD · 2023-12-28 · unverdicted · novelty 4.0

A knowledge-transfer network reconstructs missing audio features and uses cross-modality attention to improve multimodal sentiment analysis, showing gains over baselines on three datasets.

A Heavy-Load-Enhanced and Changeable-Periodicity-Perceived Workload Prediction Network

cs.DC · 2023-07-11 · unverdicted · novelty 4.0

PePNet combines an adaptive periodicity detector with an Achilles' Heel loss to raise heavy-workload prediction accuracy by 21% on real cloud traces.

Lightweight Spatio-Temporal Attention Network with Graph Embedding and Rotational Position Encoding for Traffic Forecasting

cs.AI · 2025-05-17 · unverdicted · novelty 3.0

LSTAN-GERPE uses spatio-temporal attention, graph embedding, and grid-searched rotational position encoding to achieve advanced accuracy on PeMS04 and PeMS08 traffic forecasting datasets without heavy feature engineering.

A Survey on Efficient Inference for Large Language Models

cs.CL · 2024-04-22 · accept · novelty 3.0

The paper surveys techniques to speed up and reduce the resource needs of LLM inference, organized by data-level, model-level, and system-level changes, with comparative experiments on representative methods.

NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting)

cs.CV · 2022-10-01 · unverdicted · novelty 2.0

A literature survey of NeRF and neural field methods from 2020 to 2025, organized by architecture and application taxonomies with benchmarks and dataset overviews, covering both pre- and post-Gaussian Splatting periods.

citing papers explorer

Showing 18 of 18 citing papers.

Toward Generalizable Forgery Detection and Reasoning · cs.CV · 2025-03-27 · unverdicted · none · ref 79
Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation · cs.LG · 2026-03-29 · unverdicted · none · ref 28
Versatile yet Efficient Network Traffic Analysis: Offloading Network Foundation Model to SmartNIC · cs.NI · 2025-08-04 · unverdicted · none · ref 22
AutoPV: Automatically Design Your Photovoltaic Power Forecasting Model · cs.LG · 2024-08-01 · unverdicted · none · ref 9
Robust Adaptation of Foundation Models with Black-Box Visual Prompting · cs.CV · 2024-07-04 · unverdicted · none · ref 21
Attention-Based Deep Reinforcement Learning for Qubit Allocation in Modular Quantum Architectures · quant-ph · 2024-06-17 · unverdicted · none · ref 49
Diff-PCR: Diffusion-Based Correspondence Searching in Doubly Stochastic Matrix Space for Point Cloud Registration · cs.CV · 2023-12-31 · unverdicted · none · ref 52
netFound: Principled Design for Network Foundation Models · cs.NI · 2023-10-25 · unverdicted · none · ref 56
Learning Project-wise Subsequent Code Edits via Interleaving Neural-based Induction and Tool-based Deduction · cs.SE · 2026-04-14 · unverdicted · none · ref 7
SMFD-UNet: Semantic Face Mask Is The Only Thing You Need To Deblur Faces · cs.CV · 2026-04-08 · unverdicted · none · ref 23
RouteFormer: A Transformer-Based Routing Framework for Autonomous Vehicles · cs.RO · 2025-04-07 · unverdicted · none · ref 11
Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning · cs.RO · 2024-12-27 · unverdicted · none · ref 41
Flemme: A Flexible and Modular Learning Platform for Medical Images · eess.IV · 2024-08-18 · unverdicted · none · ref 4
Multimodal Sentiment Analysis with Missing Modality: A Knowledge-Transfer Approach · cs.SD · 2023-12-28 · unverdicted · none · ref 6
A Heavy-Load-Enhanced and Changeable-Periodicity-Perceived Workload Prediction Network · cs.DC · 2023-07-11 · unverdicted · none · ref 19
Lightweight Spatio-Temporal Attention Network with Graph Embedding and Rotational Position Encoding for Traffic Forecasting · cs.AI · 2025-05-17 · unverdicted · none · ref 7
A Survey on Efficient Inference for Large Language Models · cs.CL · 2024-04-22 · accept · none · ref 26
NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting) · cs.CV · 2022-10-01 · unverdicted · none · ref 117
