Attention is all you need · 2017

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Learning Robust Dexterous In-Hand Manipulation from Joint Sensors with Proprioceptive Transformer

cs.RO · 2026-05-20 · conditional · novelty 7.0

A transformer policy distilled from a privileged RL teacher enables 3.1x faster real-world cube rotation on the ORCA hand using solely joint sensor data by extracting implicit object state from temporal joint patterns.

Graph-RHO: Critical-path-aware Heterogeneous Graph Network for Long-Horizon Flexible Job-Shop Scheduling

cs.LG · 2026-04-11 · unverdicted · novelty 7.0

Graph-RHO is a critical-path-aware heterogeneous graph network for rolling horizon optimization in flexible job-shop scheduling that achieves state-of-the-art solution quality and over 30% faster solve times on large instances.

Weierstrass Positional Encoding for Vision Transformers

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

WePE encodes 2D patch positions in Vision Transformers via Weierstrass elliptic functions on the complex plane to exploit double periodicity and derive relative positions algebraically.

MV-Gate: Insider Threat Detection via Multi-View Behavioral Statistics and Semantic Modeling

cs.SI · 2026-05-18 · unverdicted · novelty 6.0

MV-Gate improves insider threat detection on CERT and ADFA datasets by fusing multi-view statistical signals (recurrence, frequency deviation) with semantic encoders via gating.

Leveraging Previous-Traversal Point Cloud Map Priors for Camera-Based 3D Object Detection and Tracking

cs.CV · 2026-04-28 · unverdicted · novelty 6.0

DualViewMapDet fuses prior-traversal point cloud maps into camera features via dual perspective-view and bird's-eye-view encoding to improve 3D detection and tracking without LiDAR.

Accelerating Regularized Attention Kernel Regression for Spectrum Cartography

math.OC · 2026-04-28 · unverdicted · novelty 6.0

LAKER learns a data-dependent preconditioner to reduce condition numbers by up to three orders of magnitude and accelerate convergence over twenty-fold for regularized attention kernel regression in spectrum cartography.

Fixed-Length Dense Fingerprint Representation with Alignment and Robust Enhancement

cs.CV · 2025-05-06 · unverdicted · novelty 6.0

FLARE introduces a fixed-length 3D dense fingerprint descriptor integrated with pose-based alignment and ridge enhancement for robust cross-modality matching.

Grid-Aware Peer-to-Peer Energy Trading: A Learning-Augmented Framework

eess.SY · 2026-05-20 · unverdicted · novelty 4.0

A supervised transformer regression model predicts DSO responses to P2P trades on a modified IEEE 33-bus system, enabling local feasibility checks and improved market efficiency.

Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction

cs.LG · 2026-04-27 · unverdicted · novelty 4.0

TGSN reports 97.78% accuracy on AD/FTD classification and RMSE of 1.93 for MMSE prediction on the XY02 EEG dataset, outperforming baselines by large margins.

citing papers explorer

Showing 9 of 9 citing papers.

Learning Robust Dexterous In-Hand Manipulation from Joint Sensors with Proprioceptive Transformer
cs.RO · 2026-05-20 · conditional · none · ref 23

Graph-RHO: Critical-path-aware Heterogeneous Graph Network for Long-Horizon Flexible Job-Shop Scheduling
cs.LG · 2026-04-11 · unverdicted · none · ref 17

Weierstrass Positional Encoding for Vision Transformers
cs.CV · 2026-05-20 · unverdicted · none · ref 3

MV-Gate: Insider Threat Detection via Multi-View Behavioral Statistics and Semantic Modeling
cs.SI · 2026-05-18 · unverdicted · none · ref 3

Leveraging Previous-Traversal Point Cloud Map Priors for Camera-Based 3D Object Detection and Tracking
cs.CV · 2026-04-28 · unverdicted · none · ref 35

Accelerating Regularized Attention Kernel Regression for Spectrum Cartography
math.OC · 2026-04-28 · unverdicted · none · ref 12

Fixed-Length Dense Fingerprint Representation with Alignment and Robust Enhancement
cs.CV · 2025-05-06 · unverdicted · none · ref 50

Grid-Aware Peer-to-Peer Energy Trading: A Learning-Augmented Framework
eess.SY · 2026-05-20 · unverdicted · none · ref 26

Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction
cs.LG · 2026-04-27 · unverdicted · none · ref 43
