{"paper":{"title":"URoPE: Universal Relative Position Embedding across Geometric Spaces","license":"http://creativecommons.org/licenses/by/4.0/","headline":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Chensheng Peng, Depu Meng, Masayoshi Tomizuka, Quentin Herau, Wei Zhan, Yichen Xie, Yihan Hu","submitted_at":"2026-04-20T18:52:03Z","abstract_excerpt":"Relative position embedding has become a standard mechanism for encoding positional information in Transformers. However, existing formulations are typically limited to a fixed geometric space, namely 1D sequences or regular 2D/3D grids, which restricts their applicability to many computer vision tasks that require geometric reasoning across camera views or between 2D and 3D spaces. To address this limitation, we propose URoPE, a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces. For each key/value image patch, URoPE samples 3D points a"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"URoPE is a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces... URoPE is a parameter-free and intrinsics-aware relative position embedding that is invariant to the choice of global coordinate systems, while remaining fully compatible with existing RoPE-optimized attention kernels. Experiments show that URoPE consistently improves the performance of transformer-based models across all tasks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That sampling 3D points along camera rays at a small number of predefined depth anchors and projecting them into the query plane is sufficient to encode the necessary relative geometric relationships for effective cross-view and cross-dimensional reasoning.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"14de58701c0030b32711e619c017412fa86a2e2832c9b57b3e49998c3c25240b"},"source":{"id":"2604.18747","kind":"arxiv","version":2},"verdict":{"id":"2ed1fe89-b146-485d-84de-51e0c135eacf","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-10T04:13:18.206709Z","strongest_claim":"URoPE is a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces... URoPE is a parameter-free and intrinsics-aware relative position embedding that is invariant to the choice of global coordinate systems, while remaining fully compatible with existing RoPE-optimized attention kernels. Experiments show that URoPE consistently improves the performance of transformer-based models across all tasks.","one_line_summary":"URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That sampling 3D points along camera rays at a small number of predefined depth anchors and projecting them into the query plane is sufficient to encode the necessary relative geometric relationships for effective cross-view and cross-dimensional reasoning.","pith_extraction_headline":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.18747/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_compliance","ran_at":"2026-05-20T03:43:52.862847Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"948160e13c971439a9c1810af585f7ffd3dbe414df3c05ca526fdb3e33cdb251"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}