Attention is all you need
9 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Learning Robust Dexterous In-Hand Manipulation from Joint Sensors with Proprioceptive Transformer
  A transformer policy distilled from a privileged RL teacher enables 3.1x faster real-world cube rotation on the ORCA hand using solely joint sensor data by extracting implicit object state from temporal joint patterns.
- Graph-RHO: Critical-path-aware Heterogeneous Graph Network for Long-Horizon Flexible Job-Shop Scheduling
  Graph-RHO is a critical-path-aware heterogeneous graph network for rolling horizon optimization in flexible job-shop scheduling that achieves state-of-the-art solution quality and over 30% faster solve times on large instances.
- Weierstrass Positional Encoding for Vision Transformers
  WePE encodes 2D patch positions in Vision Transformers via Weierstrass elliptic functions on the complex plane to exploit double periodicity and derive relative positions algebraically.
- MV-Gate: Insider Threat Detection via Multi-View Behavioral Statistics and Semantic Modeling
  MV-Gate improves insider threat detection on CERT and ADFA datasets by fusing multi-view statistical signals (recurrence, frequency deviation) with semantic encoders via gating.
- Leveraging Previous-Traversal Point Cloud Map Priors for Camera-Based 3D Object Detection and Tracking
  DualViewMapDet fuses prior-traversal point cloud maps into camera features via dual perspective-view and bird's-eye-view encoding to improve 3D detection and tracking without LiDAR.
- Accelerating Regularized Attention Kernel Regression for Spectrum Cartography
  LAKER learns a data-dependent preconditioner to reduce condition numbers by up to three orders of magnitude and accelerate convergence over twenty-fold for regularized attention kernel regression in spectrum cartography.
- Fixed-Length Dense Fingerprint Representation with Alignment and Robust Enhancement
  FLARE introduces a fixed-length 3D dense fingerprint descriptor integrated with pose-based alignment and ridge enhancement for robust cross-modality matching.
- Grid-Aware Peer-to-Peer Energy Trading: A Learning-Augmented Framework
  A supervised transformer regression model predicts DSO responses to P2P trades on a modified IEEE 33-bus system, enabling local feasibility checks and improved market efficiency.
- Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction
  TGSN reports 97.78% accuracy on AD/FTD classification and RMSE of 1.93 for MMSE prediction on the XY02 EEG dataset, outperforming baselines by large margins.