SplatWeaver: Learning to Allocate Gaussian Primitives for Generalizable Novel View Synthesis
Pith reviewed 2026-05-22 10:40 UTC · model grok-4.3
The pith
SplatWeaver learns to assign varying numbers of 3D Gaussians to different scene regions from uncalibrated images in a single forward pass.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SplatWeaver introduces cardinality Gaussian experts and a pixel-level routing scheme that together let the model predict, for every spatial location, how many Gaussian primitives to instantiate. A high-frequency prior and guidance module stabilize the routing, pushing it toward higher counts in complex regions and lower counts in smooth ones.
What carries the argument
Cardinality Gaussian experts (each producing a fixed number of primitives from 0 to M) coordinated by a pixel-level routing network, stabilized by a high-frequency prior and guidance module plus routing regularization.
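As a rough illustration of the routing idea, the sketch below picks one cardinality expert per pixel by taking the argmax over router scores and treats the expert index as the number of primitives instantiated there. The function names, the hard argmax, and the toy logits are assumptions made for this example; this page does not specify the paper's router, its training-time relaxation, or the expert heads.

```python
from typing import List

def route_cardinalities(logits: List[List[List[float]]]) -> List[List[int]]:
    """Return, per pixel, the index of the highest-scoring cardinality expert.

    logits[y][x] holds M + 1 router scores, one per expert 0..M, and the
    expert index is taken to be the number of primitives it produces.
    """
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in row]
        for row in logits
    ]

def total_primitives(counts: List[List[int]]) -> int:
    """Sum the per-pixel primitive counts into a scene-level budget."""
    return sum(sum(row) for row in counts)

# Toy 2x2 image with M = 3; the logits are made-up numbers, not model output.
M = 3
toy_logits = [
    [[2.0, 0.1, 0.0, -1.0], [0.0, 0.3, 0.2, 1.5]],
    [[0.1, 0.9, 0.2, 0.0], [1.2, 0.4, 0.1, 0.0]],
]
counts = route_cardinalities(toy_logits)
adaptive_budget = total_primitives(counts)
uniform_budget = M * 4  # fixed allocation: M primitives at each of 4 pixels
print(counts, adaptive_budget, uniform_budget)
```

In this toy case two pixels receive no primitives at all, so the adaptive budget is a third of the uniform one; whether a trained router behaves this way on real scenes is exactly what the paper's experiments would need to show.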
If this is right
- Fewer total primitives suffice for the same or better rendering quality because capacity is concentrated where scene complexity is highest.
- Feed-forward inference becomes viable for scenes whose detail varies sharply across space without requiring later per-scene refinement.
- The same routing logic can be applied to any primitive-based scene representation whose local density can be adjusted at inference time.
Where Pith is reading between the lines
- The expert-routing pattern may transfer to other adaptive representations such as neural radiance fields or voxel grids where local resolution should vary with content.
- Real-time rendering pipelines could exploit the resulting sparsity by skipping empty or low-cardinality regions entirely during splatting.
- If the router generalizes across datasets, it could serve as a learned complexity estimator for downstream tasks such as view selection or compression of 3D captures.
Load-bearing premise
The high-frequency prior with its guidance module and routing regularization can reliably drive the router to assign more primitives to complex regions and fewer to smooth ones from uncalibrated input images without any per-scene optimization.
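The wavelet expression quoted further down this page, (LL, LH, HL, HH) = DWT(I) with HF = sqrt(LH² + HL² + HH²) upsampled by 2, suggests how such a prior could be computed. The following is a minimal sketch under stated assumptions: a single-level Haar wavelet, a grayscale image with even height and width, and nearest-neighbour upsampling. The wavelet family, the sub-band naming convention, and the interpolation used in the paper are not given here, and the helper names are hypothetical.

```python
import math

def haar_dwt2(img):
    """One-level 2D Haar transform of an even-sized grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = [], [], [], []
    for y in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            row_ll.append((a + b + c + d) / 2)  # local average
            row_lh.append((a - b + c - d) / 2)  # difference across columns
            row_hl.append((a + b - c - d) / 2)  # difference across rows
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh

def high_frequency_map(img):
    """HF = sqrt(LH^2 + HL^2 + HH^2), upsampled by 2 back to the input size."""
    _, lh, hl, hh = haar_dwt2(img)
    magnitude = [
        [math.sqrt(p * p + q * q + r * r) for p, q, r in zip(r_lh, r_hl, r_hh)]
        for r_lh, r_hl, r_hh in zip(lh, hl, hh)
    ]
    upsampled = []
    for row in magnitude:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        upsampled.append(wide)
        upsampled.append(list(wide))               # repeat each row twice
    return upsampled

# Toy 4x4 image: an intensity step inside the top-left 2x2 block only.
edge_img = [
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
hf = high_frequency_map(edge_img)
```

The resulting map is non-zero only over the block that contains the step, which is the kind of cue the router is said to use. Note that a step caused by an occlusion boundary or a specular highlight would light up this map in the same way, which is the referee's concern about 2D frequency content as a proxy for 3D complexity.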
What would settle it
A controlled ablation in which the high-frequency guidance module is removed and the router is observed to revert to near-uniform primitive counts across textured and smooth regions on the same test scenes.
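One way to make "near-uniform" measurable in such an ablation is to compare the mean predicted count inside and outside a texture mask. The sketch below does this in plain Python; the function names, the binary mask, and the count maps are hypothetical and the numbers are invented for illustration, not results from the paper.

```python
def mean_count(counts, mask, keep):
    """Mean primitive count over the pixels where mask equals keep."""
    values = [
        c
        for count_row, mask_row in zip(counts, mask)
        for c, m in zip(count_row, mask_row)
        if m == keep
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return sum(values) / len(values)

def allocation_gap(counts, textured_mask):
    """Mean count in textured pixels minus mean count in smooth pixels."""
    return mean_count(counts, textured_mask, True) - mean_count(counts, textured_mask, False)

# Invented example: top row textured, bottom row smooth.
textured_mask = [[True, True], [False, False]]
counts_with_guidance = [[3, 3], [0, 1]]     # made-up routing output
counts_without_guidance = [[1, 2], [1, 2]]  # made-up ablated output

gap_with = allocation_gap(counts_with_guidance, textured_mask)
gap_without = allocation_gap(counts_without_guidance, textured_mask)
```

A gap that collapses toward zero once the guidance module is removed would support the premise; a gap that persists would suggest the router learns complexity-aware allocation without it.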
Original abstract
Generalizable novel view synthesis aims to render unseen views from uncalibrated input images without requiring per-scene optimization. Recent feed-forward approaches based on 3D Gaussian Splatting have achieved promising efficiency and rendering quality. However, most of them assign a fixed number of Gaussians to each pixel or voxel, ignoring the spatially varying complexity of real-world scenes. Such uniform allocation often wastes Gaussian primitives in smooth regions while providing insufficient capacity for fine structures, complex geometry, and high-frequency details. This motivates us to predict region-dependent primitive cardinalities rather than impose a fixed primitive budget everywhere, enabling a more expressive 3D scene representation. Therefore, we propose SplatWeaver, a generalizable novel view synthesis framework that is able to dynamically allocate Gaussian primitives over different regions in a feed-forward manner. Specifically, SplatWeaver introduces cardinality Gaussian experts and a pixel-level routing scheme, wherein each expert specializes in producing a specific number of primitives from 0 to M, and the routing scheme coordinates these experts to adaptively determine how many Gaussian primitives should be allocated to each spatial location. Moreover, SplatWeaver incorporates a high-frequency prior with attendant guidance module and routing regularization to stabilize expert selection and promote complexity-aware allocation. By leveraging high-frequency cues, the routing process is encouraged to assign more Gaussian primitives to fine structures and textured regions, while suppressing redundancy in smooth areas. Extensive experiments across diverse scenarios show that SplatWeaver consistently outperforms state-of-the-art methods, delivering more faithful novel-view renderings with fewer Gaussian primitives. Project Page: https://yecongwan.github.io/SplatWeaver/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SplatWeaver, a feed-forward framework for generalizable novel view synthesis from uncalibrated images. It replaces uniform Gaussian primitive allocation with a dynamic scheme using cardinality Gaussian experts (each specialized for a fixed count from 0 to M) and a pixel-level router. A high-frequency prior, guidance module, and routing regularization are added to bias allocation toward textured and geometrically complex regions. The central claim is that this yields higher-fidelity novel-view renderings than prior feed-forward 3DGS methods while using fewer total primitives.
Significance. If the routing and high-frequency guidance prove stable across views, the work offers a principled way to make generalizable NVS more efficient by matching primitive density to local scene complexity rather than imposing a global budget. The expert-plus-router architecture is a clear architectural contribution that could influence subsequent feed-forward splatting pipelines.
major comments (2)
- [Abstract and §3] Abstract and §3 (method description): the claim that the high-frequency prior plus guidance module and routing regularization produce stable, complexity-aware expert selection rests on the untested assumption that 2D frequency content reliably signals view-consistent 3D geometric complexity. Occlusions, specularities, and depth discontinuities can generate misleading cues; without ablations that isolate the prior's effect on routing decisions or visualizations of per-expert activation maps on held-out views, it is unclear whether the mechanism actually prevents over- or under-allocation that only appears after novel-view rendering.
- [§4] §4 (experiments): the abstract asserts consistent outperformance with fewer primitives, yet no quantitative tables, per-scene primitive counts, or cross-method comparisons (e.g., PSNR/SSIM deltas versus fixed-budget baselines) are referenced. Without these data it is impossible to judge whether the reported gains are load-bearing for the central claim or whether the reduction in primitives comes at the cost of quality in high-complexity regions.
minor comments (1)
- [§3] Notation for the maximum primitives per expert (M) and the precise form of the routing regularization loss should be defined explicitly with equations rather than left at the level of the abstract description.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our work. Below we respond point-by-point to the major comments, clarifying our design rationale and experimental evidence while committing to targeted revisions where appropriate.
- Referee: [Abstract and §3] Abstract and §3 (method description): the claim that the high-frequency prior plus guidance module and routing regularization produce stable, complexity-aware expert selection rests on the untested assumption that 2D frequency content reliably signals view-consistent 3D geometric complexity. Occlusions, specularities, and depth discontinuities can generate misleading cues; without ablations that isolate the prior's effect on routing decisions or visualizations of per-expert activation maps on held-out views, it is unclear whether the mechanism actually prevents over- or under-allocation that only appears after novel-view rendering.
Authors: We agree that the link between 2D high-frequency content and view-consistent 3D complexity is an inductive bias rather than a rigorously proven mapping, and that occlusions or specularities can produce noisy cues. In §3.3 we motivate the prior by noting that high-frequency 2D regions typically correspond to geometric detail that benefits from additional primitives; §4.3 reports ablation results showing that removing the guidance module and routing regularization degrades both PSNR and the adaptivity of primitive counts. However, these ablations measure end-to-end rendering quality rather than isolating routing decisions per se. We will therefore add (i) per-expert activation visualizations on held-out views and (ii) an ablation that disables only the high-frequency prior while keeping the router intact, to directly demonstrate stability of expert selection across views. revision: yes
- Referee: [§4] §4 (experiments): the abstract asserts consistent outperformance with fewer primitives, yet no quantitative tables, per-scene primitive counts, or cross-method comparisons (e.g., PSNR/SSIM deltas versus fixed-budget baselines) are referenced. Without these data it is impossible to judge whether the reported gains are load-bearing for the central claim or whether the reduction in primitives comes at the cost of quality in high-complexity regions.
Authors: We regret that the main-text narrative did not explicitly point readers to the supporting numbers. Table 1 already reports average PSNR/SSIM/LPIPS together with mean primitive counts for SplatWeaver versus prior feed-forward 3DGS methods; supplementary material contains per-scene breakdowns. To make the efficiency claim fully transparent, we will insert a new column in Table 1 showing PSNR/SSIM deltas relative to fixed-budget baselines (e.g., 64 or 128 primitives per pixel) and will add a short paragraph in §4.2 that quantifies quality retention in high-complexity regions (measured by local PSNR on edge/texture masks). These additions will allow direct assessment of whether dynamic allocation preserves or improves fidelity where it matters most. revision: yes
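The "local PSNR on edge/texture masks" promised in the second response can be read as PSNR restricted to masked pixels. The sketch below is one such reading, assuming images normalized to [0, 1] and a boolean mask; the function name and the toy values are hypothetical, and the authors' actual masking procedure is not described on this page.

```python
import math

def masked_psnr(pred, target, mask, peak=1.0):
    """PSNR computed only over pixels where mask is True."""
    squared_errors = [
        (p - t) ** 2
        for pred_row, target_row, mask_row in zip(pred, target, mask)
        for p, t, m in zip(pred_row, target_row, mask_row)
        if m
    ]
    if not squared_errors:
        raise ValueError("mask selects no pixels")
    mse = sum(squared_errors) / len(squared_errors)
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Invented 2x2 example: the top row is the "edge/texture" region.
pred = [[0.5, 0.9], [0.2, 0.0]]
target = [[0.6, 0.9], [0.2, 1.0]]
edge_mask = [[True, True], [False, False]]
full_mask = [[True, True], [True, True]]

psnr_edges = masked_psnr(pred, target, edge_mask)
psnr_full = masked_psnr(pred, target, full_mask)
```

Reporting the masked value alongside the full-image value for both dynamic and fixed-budget models would show whether reducing the primitive count costs fidelity specifically in high-complexity regions.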
Circularity Check
No circularity detected in architectural derivation
The paper introduces an independent neural architecture (cardinality experts + pixel routing + high-frequency guidance) trained end-to-end for feed-forward allocation. No equations or claims reduce by construction to fitted inputs, self-definitions, or self-citation chains; the high-frequency prior is an explicit design choice justified by empirical motivation rather than tautology. Central performance claims rest on external benchmark comparisons, not internal re-derivations.
Axiom & Free-Parameter Ledger
free parameters (1)
- M (maximum primitives per expert)
axioms (1)
- Domain assumption: high-frequency image features reliably indicate regions that require more Gaussian primitives for accurate reconstruction.
invented entities (1)
- Cardinality Gaussian experts: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: SplatWeaver introduces cardinality Gaussian experts and a pixel-level routing scheme... high-frequency prior with attendant guidance module and routing regularization... $(LL, LH, HL, HH) = \mathrm{DWT}(I)$, $HF = \left(\sqrt{LH^2 + HL^2 + HH^2}\right) \uparrow 2$
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: each expert specializes in producing a specific number of primitives from 0 to M
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Nerf: Representing scenes as neural radiance fields for view synthesis,
B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “Nerf: Representing scenes as neural radiance fields for view synthesis,”Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021
work page 2021
-
[2]
3d gaussian splatting for real-time radiance field rendering
B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3d gaussian splatting for real-time radiance field rendering.”ACM Trans. Graph., vol. 42, no. 4, pp. 139–1, 2023
work page 2023
-
[3]
Tensorf: Tensorial radiance fields,
A. Chen, Z. Xu, A. Geiger, J. Yu, and H. Su, “Tensorf: Tensorial radiance fields,” inEuropean conference on computer vision. Springer, 2022, pp. 333–350
work page 2022
-
[4]
Plenoxels: Radiance fields without neural networks,
S. Fridovich-Keil, A. Yu, M. Tancik, Q. Chen, B. Recht, and A. Kanazawa, “Plenoxels: Radiance fields without neural networks,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 5501–5510
work page 2022
-
[5]
Fastnerf: High-fidelity neural rendering at 200fps,
S. J. Garbin, M. Kowalski, M. Johnson, J. Shotton, and J. Valentin, “Fastnerf: High-fidelity neural rendering at 200fps,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 14 346–14 355
work page 2021
-
[6]
Instant neural graphics primitives with a multiresolution hash encoding,
T. Müller, A. Evans, C. Schied, and A. Keller, “Instant neural graphics primitives with a multiresolution hash encoding,”ACM transactions on graphics (TOG), vol. 41, no. 4, pp. 1–15, 2022
work page 2022
-
[7]
Mip-splatting: Alias- free 3d gaussian splatting,
Z. Yu, A. Chen, B. Huang, T. Sattler, and A. Geiger, “Mip-splatting: Alias- free 3d gaussian splatting,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 19 447–19 456
work page 2024
-
[8]
Gaussianpro: 3d gaussian splatting with progressive propa- gation,
K. Cheng, X. Long, K. Yang, Y . Yao, W. Yin, Y . Ma, W. Wang, and X. Chen, “Gaussianpro: 3d gaussian splatting with progressive propa- gation,” inForty-first International Conference on Machine Learning, 2024
work page 2024
-
[9]
4d gaussian splatting for real-time dynamic scene rendering,
G. Wu, T. Yi, J. Fang, L. Xie, X. Zhang, W. Wei, W. Liu, Q. Tian, and X. Wang, “4d gaussian splatting for real-time dynamic scene rendering,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 20 310–20 320
work page 2024
-
[10]
Scaffold-gs: Structured 3d gaussians for view-adaptive rendering,
T. Lu, M. Yu, L. Xu, Y . Xiangli, L. Wang, D. Lin, and B. Dai, “Scaffold-gs: Structured 3d gaussians for view-adaptive rendering,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 20 654–20 664
work page 2024
-
[11]
pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction,
D. Charatan, S. L. Li, A. Tagliasacchi, and V . Sitzmann, “pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2024, pp. 19 457–19 467
work page 2024
-
[12]
Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images,
Y . Chen, H. Xu, C. Zheng, B. Zhuang, M. Pollefeys, A. Geiger, T.- J. Cham, and J. Cai, “Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images,” inEuropean conference on computer vision. Springer, 2024, pp. 370–386
work page 2024
-
[13]
Long-lrm: Long-sequence large reconstruction model for wide- coverage gaussian splats,
C. Ziwen, H. Tan, K. Zhang, S. Bi, F. Luan, Y . Hong, L. Fuxin, and Z. Xu, “Long-lrm: Long-sequence large reconstruction model for wide- coverage gaussian splats,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 4349–4359
work page 2025
-
[14]
Anysplat: Feed-forward 3d gaussian splatting from unconstrained views,
L. Jiang, Y . Mao, L. Xu, T. Lu, K. Ren, Y . Jin, X. Xu, M. Yu, J. Pang, F. Zhaoet al., “Anysplat: Feed-forward 3d gaussian splatting from unconstrained views,”ACM Transactions on Graphics (TOG), vol. 44, no. 6, pp. 1–16, 2025
work page 2025
-
[15]
Wavenerf: Wavelet-based generalizable neural radiance fields,
M. Xu, F. Zhan, J. Zhang, Y . Yu, X. Zhang, C. Theobalt, L. Shao, and S. Lu, “Wavenerf: Wavelet-based generalizable neural radiance fields,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 18 195–18 204
work page 2023
-
[16]
Depthsplat: Connecting gaussian splatting and depth,
H. Xu, S. Peng, F. Wang, H. Blum, D. Barath, A. Geiger, and M. Pollefeys, “Depthsplat: Connecting gaussian splatting and depth,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 16 453–16 463
work page 2025
-
[17]
Gs-lrm: Large reconstruction model for 3d gaussian splatting,
K. Zhang, S. Bi, H. Tan, Y . Xiangli, N. Zhao, K. Sunkavalli, and Z. Xu, “Gs-lrm: Large reconstruction model for 3d gaussian splatting,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 1–19
work page 2024
-
[18]
Epipolar-free 3d gaussian splatting for generalizable novel view synthesis,
Z. Min, Y . Luo, J. Sun, and Y . Yang, “Epipolar-free 3d gaussian splatting for generalizable novel view synthesis,”Advances in Neural Information Processing Systems, vol. 37, pp. 39 573–39 596, 2024. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 15
work page 2024
-
[19]
Hisplat: Hierarchical 3d gaussian splatting for generalizable sparse-view reconstruction,
S. Tang, W. Ye, P. Ye, W. Lin, Y . Zhou, T. Chen, and W. Ouyang, “Hisplat: Hierarchical 3d gaussian splatting for generalizable sparse-view reconstruction,”arXiv preprint arXiv:2410.06245, 2024
-
[20]
Zpressor: Bottleneck-aware compression for scalable feed-forward 3dgs,
W. Wang, D. Y . Chen, Z. Zhang, D. Shi, A. Liu, and B. Zhuang, “Zpressor: Bottleneck-aware compression for scalable feed-forward 3dgs,”arXiv preprint arXiv:2505.23734, 2025
-
[21]
Yonosplat: You only need one model for feedforward 3d gaussian splatting,
B. Ye, B. Chen, H. Xu, D. Barath, and M. Pollefeys, “Yonosplat: You only need one model for feedforward 3d gaussian splatting,” inInternational Conference on Learning Representations (ICLR), 2026
work page 2026
-
[22]
No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images
B. Ye, S. Liu, H. Xu, X. Li, M. Pollefeys, M.-H. Yang, and S. Peng, “No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images,”arXiv preprint arXiv:2410.24207, 2024
-
[23]
Flare: Feed-forward geometry, appearance and camera estimation from uncalibrated sparse views,
S. Zhang, J. Wang, Y . Xu, N. Xue, C. Rupprecht, X. Zhou, Y . Shen, and G. Wetzstein, “Flare: Feed-forward geometry, appearance and camera estimation from uncalibrated sparse views,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 21 936– 21 947
work page 2025
-
[24]
Pf3plat: Pose-free feed-forward 3d gaussian splatting, 2025
S. Hong, J. Jung, H. Shin, J. Han, J. Yang, C. Luo, and S. Kim, “Pf3plat: Pose-free feed-forward 3d gaussian splatting,”arXiv preprint arXiv:2410.22128, 2024
-
[25]
Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs
B. Smart, C. Zheng, I. Laina, and V . A. Prisacariu, “Splatt3r: Zero- shot gaussian splatting from uncalibrated image pairs,”arXiv preprint arXiv:2408.13912, 2024
work page internal anchor Pith review Pith/arXiv arXiv 2024
-
[26]
Evolsplat: Efficient volume-based gaussian splatting for urban view synthesis,
S. Miao, J. Huang, D. Bai, X. Yan, H. Zhou, Y . Wang, B. Liu, A. Geiger, and Y . Liao, “Evolsplat: Efficient volume-based gaussian splatting for urban view synthesis,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 11 286–11 296
work page 2025
-
[27]
V olsplat: Rethinking feed-forward 3d gaussian splatting with voxel-aligned prediction,
W. Wang, Y . Chen, Z. Zhang, H. Liu, H. Wang, Z. Feng, W. Qin, Z. Zhu, D. Y . Chen, and B. Zhuang, “V olsplat: Rethinking feed-forward 3d gaussian splatting with voxel-aligned prediction,”arXiv preprint arXiv:2509.19297, 2025
-
[28]
Tokensplat: Token- aligned 3d gaussian splatting for feed-forward pose-free reconstruction,
Y . Li, C. Lv, Z. Tang, H. Yang, and D. Huang, “Tokensplat: Token- aligned 3d gaussian splatting for feed-forward pose-free reconstruction,” arXiv preprint arXiv:2603.00697, 2026
-
[29]
Worldmirror: Universal 3d world reconstruction with any-prior prompting,
Y . Liu, Z. Min, Z. Wang, J. Wu, T. Wang, Y . Yuan, Y . Luo, and C. Guo, “Worldmirror: Universal 3d world reconstruction with any-prior prompting,”arXiv preprint arXiv:2510.10726, 2025
-
[30]
S. Zhang, X. Fei, F. Liu, H. Song, and Y . Duan, “Gaussian graph network: Learning efficient and generalizable gaussian representations from multi-view images,”Advances in Neural Information Processing Systems, vol. 37, pp. 50 361–50 380, 2024
work page 2024
-
[31]
Ecosplat: Efficiency-controllable feed-forward 3d gaussian splatting from multi-view images,
J. Park, M.-Q. V . Bui, J. L. G. Bello, J. Moon, J. Oh, and M. Kim, “Ecosplat: Efficiency-controllable feed-forward 3d gaussian splatting from multi-view images,”arXiv preprint arXiv:2512.18692, 2025
-
[32]
arXiv preprint arXiv:2512.15508 (2025)
A. Moreau, R. Shaw, M. Nazarczuk, J. Shin, T. Tanay, Z. Zhang, S. Xu, and E. Pérez-Pellitero, “Off the grid: Detection of primitives for feed- forward 3d gaussian splatting,”arXiv preprint arXiv:2512.15508, 2025
-
[33]
Gaus- siantrim3r: Controllable 3d gaussians pruning for feedforward models
B. Singhal, K. Srihari, A. Dhiman, and V . B. Radhakrishnan, “Gaus- siantrim3r: Controllable 3d gaussians pruning for feedforward models.”
-
[34]
C3G: Learning Compact 3D Representations with 2K Gaussians
H. An, J. Jung, M. Kim, S. Hong, C. Kim, K. Fukuda, M. Jeon, J. Han, T. Narihira, H. Koet al., “C3g: Learning compact 3d representations with 2k gaussians,”arXiv preprint arXiv:2512.04021, 2025
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[35]
Tokengs: Decoupling 3d gaussian prediction from pixels with learnable tokens,
J. Ren, M. Tyszkiewicz, J. Huang, and Z. Gojcic, “Tokengs: Decoupling 3d gaussian prediction from pixels with learnable tokens,”Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026
work page 2026
-
[36]
Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields,
J. T. Barron, B. Mildenhall, M. Tancik, P. Hedman, R. Martin-Brualla, and P. P. Srinivasan, “Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 5855–5864
work page 2021
-
[37]
Mip-nerf 360: Unbounded anti-aliased neural radiance fields,
J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman, “Mip-nerf 360: Unbounded anti-aliased neural radiance fields,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 5470–5479
work page 2022
-
[38]
Zip-nerf: Anti-aliased grid-based neural radiance fields,
——, “Zip-nerf: Anti-aliased grid-based neural radiance fields,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 19 697–19 705
work page 2023
-
[39]
Ref-nerf: Structured view-dependent appearance for neural radiance fields,
D. Verbin, P. Hedman, B. Mildenhall, T. Zickler, J. T. Barron, and P. P. Srinivasan, “Ref-nerf: Structured view-dependent appearance for neural radiance fields,” in2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022, pp. 5481–5490
work page 2022
-
[40]
Nerfies: Deformable neural radiance fields,
K. Park, U. Sinha, J. T. Barron, S. Bouaziz, D. B. Goldman, S. M. Seitz, and R. Martin-Brualla, “Nerfies: Deformable neural radiance fields,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 5865–5874
work page 2021
-
[41]
Hypernerf: A higher-dimensional representation for topologically varying neural radiance fields,
K. Park, U. Sinha, P. Hedman, J. T. Barron, S. Bouaziz, D. B. Goldman, R. Martin-Brualla, and S. M. Seitz, “Hypernerf: A higher-dimensional representation for topologically varying neural radiance fields,”arXiv preprint arXiv:2106.13228, 2021
-
[42]
Masked space-time hash encoding for efficient dynamic scene reconstruction,
F. Wang, Z. Chen, G. Wang, Y . Song, and H. Liu, “Masked space-time hash encoding for efficient dynamic scene reconstruction,”Advances in neural information processing systems, vol. 36, pp. 70 497–70 510, 2023
work page 2023
-
[43]
Fast dynamic radiance fields with time-aware neural voxels,
J. Fang, T. Yi, X. Wang, L. Xie, X. Zhang, W. Liu, M. Nießner, and Q. Tian, “Fast dynamic radiance fields with time-aware neural voxels,” inSIGGRAPH Asia 2022 Conference Papers, 2022, pp. 1–9
work page 2022
-
[44]
Robust dynamic radiance fields,
Y .-L. Liu, C. Gao, A. Meuleman, H.-Y . Tseng, A. Saraf, C. Kim, Y .-Y . Chuang, J. Kopf, and J.-B. Huang, “Robust dynamic radiance fields,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 13–23
work page 2023
-
[45]
Forward flow for novel view synthesis of dynamic scenes,
X. Guo, J. Sun, Y . Dai, G. Chen, X. Ye, X. Tan, E. Ding, Y . Zhang, and J. Wang, “Forward flow for novel view synthesis of dynamic scenes,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 16 022–16 033
work page 2023
-
[46]
Tensor4d: Efficient neural 4d decomposition for high-fidelity dynamic reconstruction and rendering,
R. Shao, Z. Zheng, H. Tu, B. Liu, H. Zhang, and Y . Liu, “Tensor4d: Efficient neural 4d decomposition for high-fidelity dynamic reconstruction and rendering,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 16 632–16 642
work page 2023
-
[47]
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation
J. Tang, J. Ren, H. Zhou, Z. Liu, and G. Zeng, “Dreamgaussian: Generative gaussian splatting for efficient 3d content creation,”arXiv preprint arXiv:2309.16653, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[48]
B. Zhou, S. Zheng, H. Tu, R. Shao, B. Liu, S. Zhang, L. Nie, and Y . Liu, “Gps-gaussian+: Generalizable pixel-wise 3d gaussian splatting for real- time human-scene rendering from sparse views,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[49]
Efficient scene modeling via structure-aware and region-prioritized 3d gaussians,
G. Fang and B. Wang, “Efficient scene modeling via structure-aware and region-prioritized 3d gaussians,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[50]
Gir: 3d gaussian inverse rendering for relightable scene factorization,
Y . Shi, Y . Wu, C. Wu, X. Liu, C. Zhao, H. Feng, J. Zhang, B. Zhou, E. Ding, and J. Wang, “Gir: 3d gaussian inverse rendering for relightable scene factorization,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[51]
Stylizedgs: Controllable stylization for 3d gaussian splatting,
D. Zhang, Y .-J. Yuan, Z. Chen, F.-L. Zhang, Z. He, S. Shan, and L. Gao, “Stylizedgs: Controllable stylization for 3d gaussian splatting,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[52]
Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo,
A. Chen, Z. Xu, F. Zhao, X. Zhang, F. Xiang, J. Yu, and H. Su, “Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo,” inProceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 14 124–14 133
work page 2021
-
[53]
Is attention all that nerf needs?
P. Wang, X. Chen, T. Chen, S. Venugopalan, Z. Wanget al., “Is attention all that nerf needs?”arXiv preprint arXiv:2207.13298, 2022
-
[54]
Skipnet: Learning dynamic routing in convolutional networks,
X. Wang, F. Yu, Z.-Y . Dou, T. Darrell, and J. E. Gonzalez, “Skipnet: Learning dynamic routing in convolutional networks,” inProceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 409–424
work page 2018
-
[55]
Convolutional networks with adaptive inference graphs,
A. Veit and S. Belongie, “Convolutional networks with adaptive inference graphs,” inProceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–18
work page 2018
-
[56]
X. Jia, B. De Brabandere, T. Tuytelaars, and L. V . Gool, “Dynamic filter networks,”Advances in neural information processing systems, vol. 29, 2016
work page 2016
-
[57]
Deformable convolutional networks,
J. Dai, H. Qi, Y . Xiong, Y . Li, G. Zhang, H. Hu, and Y . Wei, “Deformable convolutional networks,” inProceedings of the IEEE international conference on computer vision, 2017, pp. 764–773
work page 2017
-
[58]
Spatio-temporal filter adaptive network for video deblurring,
S. Zhou, J. Zhang, J. Pan, H. Xie, W. Zuo, and J. Ren, “Spatio-temporal filter adaptive network for video deblurring,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 2482–2491
work page 2019
-
[59]
Deformable kernels: Adapt- ing effective receptive fields for object deformation,
H. Gao, X. Zhu, S. Lin, and J. Dai, “Deformable kernels: Adapt- ing effective receptive fields for object deformation,”arXiv preprint arXiv:1910.02940, 2019
-
[60]
Y .-C. Su and K. Grauman, “Leaving some stones unturned: dynamic feature prioritization for activity detection in streaming video,” in European Conference on Computer Vision. Springer, 2016, pp. 783–800
work page 2016
-
[61]
Adaframe: Adaptive frame selection for fast video recognition,
Z. Wu, C. Xiong, C.-Y . Ma, R. Socher, and L. S. Davis, “Adaframe: Adaptive frame selection for fast video recognition,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 1278–1287
work page 2019
-
[62]
Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation
Y . Bengio, N. Léonard, and A. Courville, “Estimating or propagating gradients through stochastic neurons for conditional computation,”arXiv preprint arXiv:1308.3432, 2013
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[63]
From sparse to soft mixtures of experts
J. Puigcerver, C. Riquelme, B. Mustafa, and N. Houlsby, “From sparse to soft mixtures of experts,”arXiv preprint arXiv:2308.00951, 2023. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 16
-
[64]
Scaling vision with sparse mixture of experts,
C. Riquelme, J. Puigcerver, B. Mustafa, M. Neumann, R. Jenatton, A. Susano Pinto, D. Keysers, and N. Houlsby, “Scaling vision with sparse mixture of experts,”Advances in Neural Information Processing Systems, vol. 34, pp. 8583–8595, 2021
work page 2021
-
[65]
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. Le, G. Hinton, and J. Dean, “Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,”arXiv preprint arXiv:1701.06538, 2017
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[66]
Uni-moe: Scaling unified multimodal llms with mixture of experts,
Y . Li, S. Jiang, B. Hu, L. Wang, W. Zhong, W. Luo, L. Ma, and M. Zhang, “Uni-moe: Scaling unified multimodal llms with mixture of experts,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[67]
Mome: Mixture of multimodal experts for generalist multimodal large language models,
L. Shen, G. Chen, R. Shao, W. Guan, and L. Nie, “Mome: Mixture of multimodal experts for generalist multimodal large language models,” Advances in neural information processing systems, vol. 37, pp. 42 048– 42 070, 2024
-
[68]
J. Wei, X. Zhao, J. Woo, J. Ouyang, G. El Fakhri, Q. Chen, and X. Liu, “Mixture-of-shape-experts (mose): End-to-end shape dictionary framework to prompt sam for generalizable medical segmentation,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 6448–6458
-
[69]
G. Wang, J. Ye, J. Cheng, T. Li, Z. Chen, J. Cai, J. He, and B. Zhuang, “Sam-med3d-moe: Towards a non-forgetting segment anything model via mixture of experts for 3d medical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2024, pp. 552–561
-
[70]
Complexity experts are task-discriminative learners for any image restoration
E. Zamfir, Z. Wu, N. Mehta, Y. Tan, D. P. Paudel, Y. Zhang, and R. Timofte, “Complexity experts are task-discriminative learners for any image restoration,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 12 753–12 763
-
[71]
J. Lin, Z. Zhang, W. Li, R. Pei, H. Xu, H. Zhang, and W. Zuo, “Unirestorer: Universal image restoration via adaptively estimating image degradation at proper granularity,” arXiv preprint arXiv:2412.20157, 2024
-
[72]
DINOv2: Learning Robust Visual Features without Supervision
M. Oquab, T. Darcet, T. Moutakanni, H. Vo, M. Szafraniec, V. Khalidov, P. Fernandez, D. Haziza, F. Massa, A. El-Nouby et al., “Dinov2: Learning robust visual features without supervision,” arXiv preprint arXiv:2304.07193, 2023
-
[73]
Vggt: Visual geometry grounded transformer
J. Wang, M. Chen, N. Karaev, A. Vedaldi, C. Rupprecht, and D. Novotny, “Vggt: Visual geometry grounded transformer,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2025, pp. 5294–5306
-
[74]
Vision transformers for dense prediction
R. Ranftl, A. Bochkovskiy, and V. Koltun, “Vision transformers for dense prediction,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 12 179–12 188
-
[75]
Billion-scale similarity search with gpus
J. Johnson, M. Douze, and H. Jégou, “Billion-scale similarity search with gpus,” IEEE Transactions on Big Data, 2019
-
[76]
Dl3dv-10k: A large-scale scene dataset for deep learning-based 3d vision
L. Ling, Y. Sheng, Z. Tu, W. Zhao, C. Xin, K. Wan, L. Yu, Q. Guo, Z. Yu, Y. Lu et al., “Dl3dv-10k: A large-scale scene dataset for deep learning-based 3d vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22 160–22 169
-
[77]
Stereo Magnification: Learning View Synthesis using Multiplane Images
T. Zhou, R. Tucker, J. Flynn, G. Fyffe, and N. Snavely, “Stereo magnification: Learning view synthesis using multiplane images,” arXiv preprint arXiv:1805.09817, 2018
-
[78]
No pose at all: Self-supervised pose-free 3d gaussian splatting from sparse views
R. Huang and K. Mikolajczyk, “No pose at all: Self-supervised pose-free 3d gaussian splatting from sparse views,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 27 947–27 957
-
[79]
Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps
Z. Fan, K. Wang, K. Wen, Z. Zhu, D. Xu, and Z. Wang, “Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps,” Advances in Neural Information Processing Systems, vol. 37, pp. 140 138–140 158, 2024
-
[80]
H. Zhao, L. Jiang, J. Jia, P. H. Torr, and V. Koltun, “Point transformer,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 16 259–16 268