Equivariant RL agent synthesizes near-optimal Clifford circuits up to 30 qubits with lower two-qubit gate counts than Qiskit baselines.
Self-attention with relative position representations
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 3polarities
background 3representative citing papers
Progress Ratio Embeddings use a trigonometric progress-ratio signal to deliver stable length control in transformers that generalizes to unseen target lengths.
SPAN is a hierarchical attention framework that constructs multi-scale pyramid representations from single-scale patch inputs for WSI classification and segmentation while preserving spatial relationships.
ALiBi enables transformers trained on length-1024 sequences to extrapolate to length-2048 with the same perplexity as a sinusoidal model trained on 2048, while training 11% faster and using 11% less memory.
Relation-aware self-attention encodes schema structure for text-to-SQL, raising exact-match accuracy on Spider from 18.96% to 42.94%.
A-THENA improves averaged IoT intrusion detection accuracy by 3.69-6.88 percentage points over baselines on three datasets using time-aware hybrid encoding and network-specific augmentation, with near-zero false alarms and real-time deployment on Raspberry Pi Zero 2 W.
This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.
citing papers explorer
-
Equivariant Reinforcement Learning for Clifford Quantum Circuit Synthesis
Equivariant RL agent synthesizes near-optimal Clifford circuits up to 30 qubits with lower two-qubit gate counts than Qiskit baselines.
-
Progress Ratio Embeddings: An Impatience Signal for Robust Length Control in Neural Text Generation
Progress Ratio Embeddings use a trigonometric progress-ratio signal to deliver stable length control in transformers that generalizes to unseen target lengths.
-
Learning Spatial-Preserving Hierarchical Representations for Digital Pathology
SPAN is a hierarchical attention framework that constructs multi-scale pyramid representations from single-scale patch inputs for WSI classification and segmentation while preserving spatial relationships.
-
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
ALiBi enables transformers trained on length-1024 sequences to extrapolate to length-2048 with the same perplexity as a sinusoidal model trained on 2048, while training 11% faster and using 11% less memory.
-
Encoding Database Schemas with Relation-Aware Self-Attention for Text-to-SQL Parsers
Relation-aware self-attention encodes schema structure for text-to-SQL, raising exact-match accuracy on Spider from 18.96% to 42.94%.
-
A-THENA: Early Intrusion Detection for IoT with Time-Aware Hybrid Encoding and Network-Specific Augmentation
A-THENA improves averaged IoT intrusion detection accuracy by 3.69-6.88 percentage points over baselines on three datasets using time-aware hybrid encoding and network-specific augmentation, with near-zero false alarms and real-time deployment on Raspberry Pi Zero 2 W.
-
A Survey of Large Language Models
This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.