Recognition: 2 theorem links
· Lean TheoremLet Geometry GUIDE: Layer-wise Unrolling of Geometric Priors in Multimodal LLMs
Pith reviewed 2026-05-10 19:28 UTC · model grok-4.3
The pith
Injecting multi-level 3D geometric features step-by-step into the early layers of multimodal LLMs lets the model learn the 2D-to-3D transition progressively and improves spatial reasoning.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GUIDE performs multi-level sampling inside the geometric encoder to capture features from local edges to global topologies, then aligns and fuses these priors step-by-step with the early layers of the MLLM while using a context-aware gate to fetch only the needed spatial cues; this design guides the model to learn the 2D-to-3D transitional process without losing local details or introducing semantic mismatches.
What carries the argument
The GUIDE framework: multi-level sampling from the geometric encoder followed by step-by-step alignment and fusion with early MLLM layers plus context-aware gating.
If this is right
- The model learns the 2D-to-3D transition progressively rather than all at once.
- Spatial priors are used more efficiently because the gate suppresses redundant geometric noise.
- Performance rises on multiple complex spatial reasoning and perception tasks over single deep-layer baselines.
- The method supplies a new way to integrate 3D geometric priors into large multimodal models.
Where Pith is reading between the lines
- The same progressive injection pattern could be tested on video or 3D point-cloud inputs to extend spatial awareness to dynamic scenes.
- If the layer-wise alignment proves stable, it might reduce the need for extra spatial training data in future MLLMs.
- Neighboring problems such as depth estimation or object pose prediction inside MLLMs could adopt similar multi-granularity unrolling.
Load-bearing premise
Multi-level geometric features sampled from the encoder can be aligned and fused with early MLLM layers without creating new semantic mismatches or losing critical local details.
What would settle it
A controlled ablation that disables the step-by-step early-layer alignment and instead fuses the same geometric features only at the deepest layer, then re-runs the same spatial reasoning and perception benchmarks; if accuracy stays the same or improves, the value of progressive unrolling is falsified.
Figures
read the original abstract
Multimodal Large Language Models (MLLMs) have achieved remarkable progress in 2D visual tasks but still exhibit limited physical spatial awareness when processing real-world visual streams. Recently, feed-forward geometric foundation models, which implicitly extract geometric priors, have provided a new pathway to address this issue. However, existing geometry-aware MLLMs are predominantly constrained by the paradigm of single deep-layer extraction and input-level fusion. This flattened fusion leads to the loss of local geometric details and causes semantic mismatches in the early layers. To break this bottleneck, we propose GUIDE (Geometric Unrolling Inside MLLM Early-layers), a progressive geometric priors injection framework. GUIDE performs multi-level sampling within the geometric encoder, comprehensively capturing multi-granularity features ranging from local edges to global topologies. Subsequently, we rigorously align and fuse these multi-level geometric priors step-by-step with the early layers of the MLLM. Building upon the injection of multi-granularity geometric information, this design guides the model to progressively learn the 2D-to-3D transitional process. Furthermore, we introduce a context-aware gating that enables the model to fetch requisite spatial cues based on current semantics, thereby maximizing the utilization efficiency of spatial priors and effectively suppressing redundant geometric noise. Extensive experiments demonstrate that GUIDE significantly outperforms existing baselines on multiple complex spatial reasoning and perception tasks, establishing a novel paradigm for integrating 3D geometric priors into large models.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes GUIDE (Geometric Unrolling Inside MLLM Early-layers), a progressive injection framework that extracts multi-granularity geometric priors via multi-level sampling from a geometric encoder and fuses them step-by-step into the early layers of Multimodal Large Language Models (MLLMs), augmented by context-aware gating. This is intended to guide learning of the 2D-to-3D transition, avoid loss of local details and semantic mismatches from single deep-layer or input-level fusion, and improve performance on spatial reasoning and perception tasks.
Significance. If the empirical claims hold after detailed validation, the work would offer a coherent alternative to existing geometry-aware MLLM designs by emphasizing early-layer progressive fusion rather than flattened late-stage injection. This could meaningfully advance physical spatial awareness in MLLMs for downstream applications such as robotics and scene understanding, provided the alignment and gating mechanisms prove robust across diverse inputs.
major comments (2)
- [Abstract and §4] Abstract and §4 (Experiments): the central claim of 'significantly outperforms existing baselines on multiple complex spatial reasoning and perception tasks' is presented without any reported datasets, baseline methods, quantitative metrics, ablation studies, or error analysis. This absence prevents verification that the gains are attributable to the progressive early-layer fusion rather than implementation details or post-hoc choices.
- [§3] §3 (Method), description of multi-level sampling and step-by-step fusion: the claim that the design 'rigorously align[s] and fuse[s]' priors 'without introducing new semantic mismatches or losing critical local details' lacks a concrete mechanism (e.g., explicit alignment loss, projection layers, or similarity metrics) or proof that the context-aware gate prevents noise amplification in early layers. This is load-bearing for the 2D-to-3D transitional guidance argument.
minor comments (2)
- [Abstract] Abstract: the phrase 'feed-forward geometric foundation models' is used without a specific citation or example model; adding one would clarify the starting point for the geometric encoder.
- [§3] Notation: the term 'multi-granularity features' is repeated but never formally defined (e.g., as feature maps at specific resolutions or depths); a short definition or diagram reference would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating planned revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (Experiments): the central claim of 'significantly outperforms existing baselines on multiple complex spatial reasoning and perception tasks' is presented without any reported datasets, baseline methods, quantitative metrics, ablation studies, or error analysis. This absence prevents verification that the gains are attributable to the progressive early-layer fusion rather than implementation details or post-hoc choices.
Authors: We acknowledge that the abstract summarizes results at a high level without enumerating specifics, which is standard but can limit immediate verifiability. Section 4 of the manuscript reports the full experimental details, including the datasets for spatial reasoning and perception tasks, baseline methods, quantitative metrics, ablation studies isolating the contribution of early-layer progressive fusion, and error analysis. To directly address the concern and improve accessibility, we will revise the abstract to briefly list key datasets, representative metrics, and a note on the ablation findings that attribute gains to the proposed design rather than other factors. revision: yes
-
Referee: [§3] §3 (Method), description of multi-level sampling and step-by-step fusion: the claim that the design 'rigorously align[s] and fuse[s]' priors 'without introducing new semantic mismatches or losing critical local details' lacks a concrete mechanism (e.g., explicit alignment loss, projection layers, or similarity metrics) or proof that the context-aware gate prevents noise amplification in early layers. This is load-bearing for the 2D-to-3D transitional guidance argument.
Authors: The current description in §3 outlines multi-level sampling from the geometric encoder and step-by-step fusion into early MLLM layers with context-aware gating, but we agree it would benefit from greater specificity on the alignment and noise-control mechanisms. We will expand §3 to explicitly describe the alignment process (including any projection layers and similarity metrics employed), the fusion procedure, and any supporting loss terms. We will also add analysis or empirical validation demonstrating that the gating mechanism conditions on semantics to suppress redundant noise without amplifying it in early layers, thereby supporting the 2D-to-3D guidance claim. revision: partial
Circularity Check
No significant circularity in derivation chain
full rationale
The manuscript presents GUIDE as an empirical architectural framework consisting of multi-level sampling from a geometric encoder, step-by-step alignment and fusion with early MLLM layers, and context-aware gating. No equations, first-principles derivations, or quantitative predictions appear in the provided text. Claims of improved spatial reasoning rest on experimental outcomes rather than any reduction of outputs to fitted inputs or self-referential definitions. No self-citation chains, uniqueness theorems, or ansatzes are invoked as load-bearing elements. The design choices are motivated by stated limitations of prior single-layer fusion approaches and are presented as a coherent engineering solution whose validity is tested externally via benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Feed-forward geometric foundation models implicitly extract useful geometric priors at multiple granularities.
- ad hoc to paper Progressive step-by-step fusion of multi-granularity priors guides the MLLM to learn the 2D-to-3D transitional process.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
progressively unroll multi-granularity geometric priors into early MLLM layers, guiding the model to progressively internalize the 2D-to-3D transition process
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katherine Millican, Malcolm Reynolds, et al
-
[2]
Flamingo: a visual language model for few-shot learning.Advances in neural information processing systems35 (2022), 23716–23736
2022
-
[3]
Shuai Bai, Yuxuan Cai, Ruizhe Chen, Keqin Chen, Xionghui Chen, Zesen Cheng, Lianghao Deng, Wei Ding, Chang Gao, Chunjiang Ge, et al . 2025. Qwen3-vl technical report.arXiv preprint arXiv:2511.21631(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[4]
Daigang Cai, Lichen Zhao, Jing Zhang, Lu Sheng, and Dong Xu. 2022. 3djcg: A unified framework for joint dense captioning and visual grounding on 3d point clouds. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 16464–16473
2022
-
[5]
Zhongang Cai, Ruisi Wang, Chenyang Gu, Fanyi Pu, Junxiang Xu, Yubo Wang, Wanqi Yin, Zhitao Yang, Chen Wei, Qingping Sun, Tongxi Zhou, Jiaqi Li, Hui En Pang, Oscar Qian, Yukun Wei, Zhiqian Lin, Xuanke Shi, Kewang Deng, Xiaoyang Han, Zukai Chen, Xiangyu Fan, Hanming Deng, Lewei Lu, Liang Pan, Bo Li, Ziwei Liu, Quan Wang, Dahua Lin, and Lei Yang. 2025. Scali...
-
[6]
Dave Zhenyu Chen, Angel X Chang, and Matthias Nießner. 2020. Scanrefer: 3d object localization in rgb-d scans using natural language. InEuropean conference on computer vision. Springer, 202–221
2020
-
[7]
Dave Zhenyu Chen, Qirui Wu, Matthias Nießner, and Angel X Chang. 2022. D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding. InECCV. 487–505
2022
- [8]
-
[9]
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, and Tao Chen. 2024. Ll3da: Visual interactive instruction tuning for omni-3d understanding reasoning and planning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 26428–26438
2024
-
[10]
Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, and Ivan Laptev. 2022. Language conditioned spatial relation reasoning for 3d object grounding.Advances in neural information processing systems35 (2022), 20522– 20535
2022
-
[11]
Sijin Chen, Hongyuan Zhu, Xin Chen, Yinjie Lei, Gang Yu, and Tao Chen. 2023. End-to-end 3d dense captioning with vote2cap-detr. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 11124–11133
2023
- [12]
-
[13]
Zhenyu Chen, Ali Gholami, Matthias Nießner, and Angel X Chang. 2021. Scan2cap: Context-aware dense captioning in rgb-d scans. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 3193–3203
2021
-
[14]
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, et al . 2024. Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 24185–24198
2024
-
[15]
Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. 2025. Gemini 2.5: Pushing the frontier with advanced reasoning, multi- modality, long context, and next generation agentic capabilities.arXiv preprint arXiv:2507.06261(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[16]
Chaorui Deng, Deyao Zhu, Kunchang Li, Chenhui Gou, Feng Li, Zeyu Wang, Shu Zhong, Weihao Yu, Xiaonan Nie, Ziang Song, et al. 2025. Emerging properties in unified multimodal pretraining.arXiv preprint arXiv:2505.14683(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[17]
Zhiwen Fan, Jian Zhang, Renjie Li, Junge Zhang, Runjin Chen, Hezhen Hu, Kevin Wang, Huaizhi Qu, Dilin Wang, Zhicheng Yan, et al. 2025. Vlm-3r: Vision- language models augmented with instruction-aligned 3d reconstruction.arXiv preprint arXiv:2505.20279(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[18]
Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, and Chuang Gan. 2023. 3d-llm: Injecting the 3d world into large language models.Advances in Neural Information Processing Systems36 (2023), 20482– 20494
2023
-
[19]
Haifeng Huang, Yilun Chen, Zehan Wang, Rongjie Huang, Runsen Xu, Tai Wang, Luping Liu, Xize Cheng, Yang Zhao, Jiangmiao Pang, et al . 2024. Chat-scene: Bridging 3d scene and large language models with object identifiers.Advances in Neural Information Processing Systems37 (2024), 113991–114017
2024
-
[20]
Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, Yan Wang, Qing Li, Song-Chun Zhu, Baoxiong Jia, and Siyuan Huang. 2024. An Embodied Generalist Agent in 3D World. InForty-first International Conference on Machine Learning
2024
-
[21]
Shijia Huang, Yilun Chen, Jiaya Jia, and Liwei Wang. 2022. Multi-view transformer for 3d visual grounding. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 15524–15533
2022
-
[22]
Aaron Hurst, Adam Lerer, Adam P Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, et al. 2024. Gpt-4o system card.arXiv preprint arXiv:2410.21276(2024)
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[23]
Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc Le, Yun-Hsuan Sung, Zhen Li, and Tom Duerig. 2021. Scaling up visual and vision- language representation learning with noisy text supervision. InInternational conference on machine learning. PMLR, 4904–4916
2021
-
[24]
Vincent Leroy, Yohann Cabon, and Jérôme Revaud. 2024. Grounding image matching in 3d with mast3r. InEuropean conference on computer vision. Springer, 71–91
2024
- [25]
-
[26]
Hongxing Li, Dingming Li, Zixuan Wang, Yuchen Yan, Hang Wu, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, and Yueting Zhuang. 2026. SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models. InThe Fourteenth International Conference on Learning Representations
2026
-
[27]
Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. InInternational conference on machine learning. PMLR, 19730–19742
2023
-
[28]
Ji Lin, Hongxu Yin, Wei Ping, Pavlo Molchanov, Mohammad Shoeybi, and Song Han. 2024. Vila: On pre-training for visual language models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 26689–26699
2024
-
[29]
Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. 2023. Visual in- struction tuning.Advances in neural information processing systems36 (2023), 34892–34916
2023
-
[30]
Kun Ouyang, Yuanxin Liu, Haoning Wu, Yi Liu, Hao Zhou, Jie Zhou, Fandong Meng, and Xu Sun. 2025. SpaceR: Reinforcing MLLMs in Video Spatial Reasoning. arXiv preprint arXiv:2504.01805(2025)
work page internal anchor Pith review arXiv 2025
-
[31]
Zhangyang Qi, Zhixiong Zhang, Ye Fang, Jiaqi Wang, and Hengshuang Zhao
-
[32]
Gpt4scene: Understand 3d scenes from videos with vision-language models
Gpt4scene: Understand 3d scenes from videos with vision-language models. arXiv preprint arXiv:2501.01428(2025)
-
[33]
Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. InProceedings of the 38th International Con- ference on Machine Learning (Proceedings of Machi...
2021
-
[34]
Plummer, Ranjay Krishna, Kuo-Hao Zeng, and Kate Saenko
Arijit Ray, Jiafei Duan, Ellis L Brown II, Reuben Tan, Dina Bashkirova, Rose Hendrix, Kiana Ehsani, Aniruddha Kembhavi, Bryan A. Plummer, Ranjay Krishna, Kuo-Hao Zeng, and Kate Saenko. 2025. SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models. InSecond Conference on Language Modeling
2025
-
[35]
Jonas Schult, Francis Engelmann, Alexander Hermans, Or Litany, Siyu Tang, and Bastian Leibe. 2023. Mask3d: Mask transformer for 3d semantic instance segmentation. In2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 8216–8223
2023
-
[36]
Aaditya Singh, Adam Fry, Adam Perelman, Adam Tart, Adi Ganesh, Ahmed El-Kishky, Aidan McLaughlin, Aiden Low, AJ Ostrow, Akhila Ananthram, et al
-
[37]
Openai gpt-5 system card.arXiv preprint arXiv:2601.03267(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[38]
Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, et al. 2024. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context.arXiv preprint arXiv:2403.05530(2024)
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[39]
Shengbang Tong, Ellis L Brown II, Penghao Wu, Sanghyun Woo, ADITHYA JAIRAM IYER, Sai Charitha Akula, Shusheng Yang, Jihan Yang, Manoj Middepogu, Ziteng Wang, Xichen Pan, Rob Fergus, Yann LeCun, and Saining Xie. 2024. Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs. InThe Thirty-eighth Annual Conference on Neural Information Process...
2024
-
[40]
Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rup- precht, and David Novotny. 2025. Vggt: Visual geometry grounded transformer. InProceedings of the Computer Vision and Pattern Recognition Conference. 5294– 5306
2025
-
[41]
Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. 2024. Dust3r: Geometric 3d vision made easy. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition. 20697–20709
2024
- [42]
-
[43]
Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu, Liang Wang, Wei Wu, and Tieniu Tan. 2025. Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems
2025
-
[44]
Jihan Yang, Shusheng Yang, Anjali W Gupta, Rilyn Han, Li Fei-Fei, and Saining Xie
-
[45]
InProceedings of the Computer Vision and Pattern Recognition Conference
Thinking in space: How multimodal large language models see, remember, and recall spaces. InProceedings of the Computer Vision and Pattern Recognition Conference. 10632–10643
- [46]
- [47]
- [48]
-
[49]
Hanxun Yu, Wentong Li, Song Wang, Junbo Chen, and Jianke Zhu. 2025. Inst3d- lmm: Instance-aware 3d scene understanding with multi-modal instruction tun- ing. InProceedings of the Computer Vision and Pattern Recognition Conference. 14147–14157
2025
-
[50]
Jiahui Zhang, Yurui Chen, Yueming Xu, Ze Huang, Jilin Mei, Junhui Chen, Yan- peng Zhou, Yu-Jie Yuan, Xinyue Cai, Guowei Huang, Xingyue Quan, Hang Xu, and Li Zhang. 2025. From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track
2025
-
[51]
Peiyuan Zhang, Kaichen Zhang, Bo Li, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, and Ziwei Liu. 2024. Long context transfer from language to vision.arXiv preprint arXiv:2406.16852(2024)
work page internal anchor Pith review arXiv 2024
-
[52]
Yuanhan Zhang, Jinming Wu, Wei Li, Bo Li, Zejun Ma, Ziwei Liu, and Chun- yuan Li. 2024. Video Instruction Tuning With Synthetic Data.arXiv preprint arXiv:2410.02713(2024)
work page Pith review arXiv 2024
- [53]
- [54]
-
[55]
Duo Zheng, Shijia Huang, and Liwei Wang. 2025. Video-3d llm: Learning position- aware video representation for 3d scene understanding. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 8995–9006
2025
-
[56]
Chenming Zhu, Tai Wang, Wenwei Zhang, Jiangmiao Pang, and Xihui Liu. 2025. Llava-3d: A simple yet effective pathway to empowering lmms with 3d capabili- ties. InProceedings of the IEEE/CVF International Conference on Computer Vision. 4295–4305
2025
-
[57]
Jinguo Zhu, Weiyun Wang, Zhe Chen, Zhaoyang Liu, Shenglong Ye, Lixin Gu, Hao Tian, Yuchen Duan, Weijie Su, Jie Shao, et al. 2025. Internvl3: Exploring advanced training and test-time recipes for open-source multimodal models. arXiv preprint arXiv:2504.10479(2025)
work page internal anchor Pith review Pith/arXiv arXiv 2025
- [58]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.