{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:AXIPP6QRGN2PCEU3AQ4WJC74VL","short_pith_number":"pith:AXIPP6QR","schema_version":"1.0","canonical_sha256":"05d0f7fa113374f1129b0439648bfcaae7b602b1974251a5a0f985d934deea49","source":{"kind":"arxiv","id":"2604.18747","version":2},"attestation_state":"computed","paper":{"title":"URoPE: Universal Relative Position Embedding across Geometric Spaces","license":"http://creativecommons.org/licenses/by/4.0/","headline":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Chensheng Peng, Depu Meng, Masayoshi Tomizuka, Quentin Herau, Wei Zhan, Yichen Xie, Yihan Hu","submitted_at":"2026-04-20T18:52:03Z","abstract_excerpt":"Relative position embedding has become a standard mechanism for encoding positional information in Transformers. However, existing formulations are typically limited to a fixed geometric space, namely 1D sequences or regular 2D/3D grids, which restricts their applicability to many computer vision tasks that require geometric reasoning across camera views or between 2D and 3D spaces. To address this limitation, we propose URoPE, a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces. For each key/value image patch, URoPE samples 3D points a"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2604.18747","kind":"arxiv","version":2},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.CV","submitted_at":"2026-04-20T18:52:03Z","cross_cats_sorted":[],"title_canon_sha256":"ac811ee39c2d11d37c8df1fbf1c6b85e316264bd1a6be020ad277916dcc1fa9c","abstract_canon_sha256":"7bf1fa8a0642edc119c7bd7028807ec715e2adf6517332ded1da80ea69ebdcd5"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-07-01T00:17:11.454869Z","signature_b64":"UWfqaZzKRz4AECRlBK3NgEdGI78l10fPpxfu2VetJQbf16B8erxFWnp0RKgmwsA055CQygzU3zhN4Wai6/MoAA==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"05d0f7fa113374f1129b0439648bfcaae7b602b1974251a5a0f985d934deea49","last_reissued_at":"2026-07-01T00:17:11.454292Z","signature_status":"signed_v1","first_computed_at":"2026-07-01T00:17:11.454292Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"URoPE: Universal Relative Position Embedding across Geometric Spaces","license":"http://creativecommons.org/licenses/by/4.0/","headline":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Chensheng Peng, Depu Meng, Masayoshi Tomizuka, Quentin Herau, Wei Zhan, Yichen Xie, Yihan Hu","submitted_at":"2026-04-20T18:52:03Z","abstract_excerpt":"Relative position embedding has become a standard mechanism for encoding positional information in Transformers. However, existing formulations are typically limited to a fixed geometric space, namely 1D sequences or regular 2D/3D grids, which restricts their applicability to many computer vision tasks that require geometric reasoning across camera views or between 2D and 3D spaces. To address this limitation, we propose URoPE, a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces. For each key/value image patch, URoPE samples 3D points a"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"URoPE is a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces... URoPE is a parameter-free and intrinsics-aware relative position embedding that is invariant to the choice of global coordinate systems, while remaining fully compatible with existing RoPE-optimized attention kernels. Experiments show that URoPE consistently improves the performance of transformer-based models across all tasks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That sampling 3D points along camera rays at a small number of predefined depth anchors and projecting them into the query plane is sufficient to encode the necessary relative geometric relationships for effective cross-view and cross-dimensional reasoning.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"14de58701c0030b32711e619c017412fa86a2e2832c9b57b3e49998c3c25240b"},"source":{"id":"2604.18747","kind":"arxiv","version":2},"verdict":{"id":"2ed1fe89-b146-485d-84de-51e0c135eacf","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-10T04:13:18.206709Z","strongest_claim":"URoPE is a universal extension of Rotary Position Embedding (RoPE) to cross-view or cross-dimensional geometric spaces... URoPE is a parameter-free and intrinsics-aware relative position embedding that is invariant to the choice of global coordinate systems, while remaining fully compatible with existing RoPE-optimized attention kernels. Experiments show that URoPE consistently improves the performance of transformer-based models across all tasks.","one_line_summary":"URoPE is a parameter-free relative position embedding for transformers that works across arbitrary geometric spaces by ray sampling and projection, yielding consistent gains on novel view synthesis, 3D detection, tracking, and depth estimation.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That sampling 3D points along camera rays at a small number of predefined depth anchors and projecting them into the query plane is sufficient to encode the necessary relative geometric relationships for effective cross-view and cross-dimensional reasoning.","pith_extraction_headline":"URoPE extends rotary position embeddings to cross-view and cross-dimensional geometry by sampling and projecting 3D ray points."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.18747/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_compliance","ran_at":"2026-05-20T03:43:52.862847Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"948160e13c971439a9c1810af585f7ffd3dbe414df3c05ca526fdb3e33cdb251"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2604.18747","created_at":"2026-07-01T00:17:11.454369+00:00"},{"alias_kind":"arxiv_version","alias_value":"2604.18747v2","created_at":"2026-07-01T00:17:11.454369+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2604.18747","created_at":"2026-07-01T00:17:11.454369+00:00"},{"alias_kind":"pith_short_12","alias_value":"AXIPP6QRGN2P","created_at":"2026-07-01T00:17:11.454369+00:00"},{"alias_kind":"pith_short_16","alias_value":"AXIPP6QRGN2PCEU3","created_at":"2026-07-01T00:17:11.454369+00:00"},{"alias_kind":"pith_short_8","alias_value":"AXIPP6QR","created_at":"2026-07-01T00:17:11.454369+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":1,"internal_anchor_count":1,"sample":[{"citing_arxiv_id":"2606.31585","citing_title":"DPPE: Rethinking Camera-Based Positional Encoding for Scaling Multi-View Transformers","ref_index":23,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL","json":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL.json","graph_json":"https://pith.science/api/pith-number/AXIPP6QRGN2PCEU3AQ4WJC74VL/graph.json","events_json":"https://pith.science/api/pith-number/AXIPP6QRGN2PCEU3AQ4WJC74VL/events.json","paper":"https://pith.science/paper/AXIPP6QR"},"agent_actions":{"view_html":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL","download_json":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL.json","view_paper":"https://pith.science/paper/AXIPP6QR","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2604.18747&json=true","fetch_graph":"https://pith.science/api/pith-number/AXIPP6QRGN2PCEU3AQ4WJC74VL/graph.json","fetch_events":"https://pith.science/api/pith-number/AXIPP6QRGN2PCEU3AQ4WJC74VL/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL/action/timestamp_anchor","attest_storage":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL/action/storage_attestation","attest_author":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL/action/author_attestation","sign_citation":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL/action/citation_signature","submit_replication":"https://pith.science/pith/AXIPP6QRGN2PCEU3AQ4WJC74VL/action/replication_record"}},"created_at":"2026-07-01T00:17:11.454369+00:00","updated_at":"2026-07-01T00:17:11.454369+00:00"}